• Best term for a pointer which is null/nil/none etc

    From Bart@21:1/5 to James Harris on Sun Aug 29 14:58:41 2021
    On 29/08/2021 14:28, James Harris wrote:

    That won't tell me the sizes or offsets. For that I need a different
    program that applies sizeof() and offsetof() to each member, and
    sizeof(S3).
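
    For instance, a probe program along these lines (a sketch: S3's
    members here are invented purely for illustration) would dump the
    layout:

        #include <stdio.h>
        #include <stddef.h>

        struct S3 { int a; unsigned b; float c; };   /* illustrative members */

        int main(void) {
            printf("sizeof(S3) = %zu\n", sizeof(struct S3));
            printf("a: offset %zu, size %zu\n",
                   offsetof(struct S3, a), sizeof(((struct S3 *)0)->a));
            printf("b: offset %zu, size %zu\n",
                   offsetof(struct S3, b), sizeof(((struct S3 *)0)->b));
            printf("c: offset %zu, size %zu\n",
                   offsetof(struct S3, c), sizeof(((struct S3 *)0)->c));
            return 0;
        }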

    However, that still won't tell me the actual /types/ of the fields. As
    well as the width, if I need to access the individual fields, I will
    need to know whether it's a signed int, unsigned int, or float.

    That will need a third program!

    But I guess all this won't cut any ice, since no matter how much of a
    dog's dinner any C header file is, you are never going to admit that
    there might be anything wrong with it, because there are always going
    to be a few dozen more hoops to jump through to extract the info that
    you need.

    You summed up the steps very well. What I don't get is why you don't
    have a program to carry out those steps! If a program runs ten steps
    or twelve, what does it matter? It's a program. It's supposed to do
    as many steps as necessary.

    If you were carrying out the steps manually I could understand it. But
    that would be a nightmare. Please tell me you are not carrying out any
    of the steps manually!!! All steps should be doable by a program which
    will run on each intended target in about 0.01 seconds, shouldn't they?

    You're right. It's such a simple task that there must be dozens of
    existing programs that will do that job already: convert a C header
    file into a more universal, 'flattened' format of API more suited
    for cross-language use (and even simpler to machine-read than the
    original C!).

    And I guess they've already been applied to popular libraries to provide language-neutral versions of those APIs, complete with all the named
    enums, and converted the macros needed to be able to use the library.

    Except I can't find anything like that.

  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Aug 29 15:38:27 2021
    On 29/08/2021 15:19, Dmitry A. Kazakov wrote:
    On 2021-08-29 15:58, Bart wrote:

    And I guess they've already been applied to popular libraries to
    provide language-neutral versions of those APIs, complete with all the
    named enums, and converted the macros needed to be able to use the
    library.

    Except I can't find anything like that.

    Really?

    1. It is even integrated in GCC. See the

       -fdump-ada-spec

    switch.

    2. c2ada:

       http://c2ada.sourceforge.net/c2ada.html


    No, this is not what James had in mind, which was writing little C
    scripts that applied -E to preprocess code, and looking for specific
    struct definitions, I think using grep or something.

    I haven't come across the above, but it looks like what I already have
    in my C compiler (see my last post).

    It also looks like they've come across the same problems, e.g.:

    "Using C2Ada is a way to lessen the work in translating C headers into
    Ada, to produce a binding, and in translating whole C programs into Ada, producing a translation. C2Ada can do about 80% to 90% of the work automatically but it still takes some manual work to do the last 10% or
    20%."

    And if I apply that gcc option to my raylib example, I get output like this:

    -- unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
    -- unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
    -- unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
    -- unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )

  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Aug 29 16:19:43 2021
    On 2021-08-29 15:58, Bart wrote:

    And I guess they've already been applied to popular libraries to provide language-neutral versions of those APIs, complete with all the named
    enums, and converted the macros needed to be able to use the library.

    Except I can't find anything like that.

    Really?

    1. It is even integrated in GCC. See the

    -fdump-ada-spec

    switch.

    2. c2ada:

    http://c2ada.sourceforge.net/c2ada.html
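
    For the first option, a typical invocation would be something like
    this (a sketch; the exact name of the generated .ads file varies by
    GCC version):

        gcc -c -fdump-ada-spec raylib.h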

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Aug 29 16:54:25 2021
    On 2021-08-29 16:38, Bart wrote:

    And if I apply that gcc option to my raylib example, I get output like
    this:

       --  unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
       --  unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
       --  unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
       --  unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )

    Macros are what you usually must handle manually. No API should use
    them anyway. You cannot put a macro in a shared library.
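
    Concretely (a sketch: 'Color' is shaped after raylib's struct, and
    the exported names are invented), a macro leaves no symbol in the
    shared library, so a binding can only reach the value through a real
    object or function:

        typedef struct Color { unsigned char r, g, b, a; } Color;

        #define LIGHTGRAY (Color){ 200, 200, 200, 255 }   /* header-only: no symbol */

        /* exported alternatives that do appear in the shared library */
        const Color LightGray = { 200, 200, 200, 255 };
        Color GetLightGray(void) { return LIGHTGRAY; }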

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to Bart on Sun Aug 29 15:27:40 2021
    On 29/08/2021 14:58, Bart wrote:
    On 29/08/2021 14:28, James Harris wrote:

    That won't tell me the sizes or offsets. For that I need a different
    program that applies sizeof() and offsetof() to each member, and
    sizeof(S3).

    However, that still won't tell me the actual /types/ of the fields.
    As well as the width, if I need to access the individual fields, I
    will need to know whether it's a signed int, unsigned int, or float.

    That will need a third program!

    But I guess all this won't cut any ice, since no matter how much of a
    dog's dinner any C header file is, you are never going to admit that
    there might be anything wrong with it, because there are always going
    to be a few dozen more hoops to jump through to extract the info that
    you need.

    You summed up the steps very well. What I don't get is why you don't
    have a program to carry out those steps! If a program runs ten steps
    or twelve, what does it matter? It's a program. It's supposed to do as
    many steps as necessary.

    If you were carrying out the steps manually I could understand it. But
    that would be a nightmare. Please tell me you are not carrying out any
    of the steps manually!!! All steps should be doable by a program which
    will run on each intended target in about 0.01 seconds, shouldn't they?

    You're right. It's such a simple task that there must be dozens of
    existing programs that will do that job already: convert a C header
    file into a more universal, 'flattened' format of API more suited
    for cross-language use (and even simpler to machine-read than the
    original C!).

    And I guess they've already been applied to popular libraries to provide language-neutral versions of those APIs, complete with all the named
    enums, and converted the macros needed to be able to use the library.

    Except I can't find anything like that.

    BTW, I have tried to create my own version of such a tool, which
    actually is not as simple as you are trying to make out. Or maybe I'm
    just too thick to be able to do it.

    It's built as an extension to a C compiler, but it is MY C compiler, so
    doesn't use any external tools. That means it's limited to the
    capabilities of that compiler, which only supports a C subset.

    Here's what happens when I apply it to this C header:

    https://github.com/sal55/langs/blob/master/raylib.h

    This is actually quite a decent, well-written API with no scary types
    that you need to go on multiple hunting expeditions to track down, and
    it's self-contained in one file.

    Yet the conversion is a long way from being complete. This is the output
    from my C compiler on a program '#include "raylib.h"' and with option -mheaders:

    https://github.com/sal55/langs/blob/master/raylib.m

    (The 'c' in 'importdll c' is the name of the C module; this would be
    changed to 'importdll raylib')

    There are many minor issues, such as avoiding clashes of names here
    with reserved words in my language, plus clashes to do with this
    syntax being case-insensitive. Also the struct alignment needs
    dealing with (but I have a new attribute to take care of that, not
    yet applied here).

    However look at the macros at the end. These expand to /C source code/.

    I don't yet have a tool to convert arbitrary C executable source code
    into my language!

    Here there are 40 macros that may need to be attended to manually; in
    GTK there are 3000. In SDL, which I'm much more likely to use, there are
    400.

    However, it could be a lot worse: I may not have had this compiler to
    work with at all. But then I suppose I could just fall back to running
    those really simple scripts of yours!

  • From David Brown@21:1/5 to James Harris on Sun Aug 29 16:50:53 2021
    On 29/08/2021 13:47, James Harris wrote:
    On 24/08/2021 13:55, David Brown wrote:
    On 24/08/2021 09:27, James Harris wrote:
    On 23/08/2021 10:55, David Brown wrote:
    On 23/08/2021 11:04, James Harris wrote:
    On 22/08/2021 22:46, David Brown wrote:

    ...

    I was talking about what can be done (by programming) rather than about
    what's supported by all extant OSes. For sure, an OS can impose a limit
    on file sizes but I was arguing that it doesn't have to and, frankly,
    that it shouldn't.

    Unless you think that an OS should use arbitrary precision integers for
    handling file sizes and offsets, you are wrong.

    There are /always/ limits.  There is /always/ a balance between having
    limits that are so high that they won't be a bottleneck, and having
    types that can be handled quickly and efficiently without a waste of
    run-time or data space.

    The limit of a file's size would naturally be defined by the filesystem
    on which it was stored or on which it was being written. Such a value
    would be known by, and a property of, the FS driver.


    "Proof by repetitive assertion" is not convincing.



    With big modern processors, 64-bit sizes here are efficient, and files
    are not going to hit that level in the near future.  (There are file
    systems in use that approach 2 ^ 64 bytes in size, but not individual
    files.)

    Don't forget that files can have holes. So one does not need to store
    (or even have capacity for) 2^64 bytes in order for a file's max offset
    to be 2^64 - 1.


    That is true - and an argument against claiming that the OS will not
    impose limits. Any normal (PC, server, etc.) OS today will have 64-bit
    file sizes. That might not be enough for specialised use in the future.

    But more importantly, there's no need to prevent a small system from processing a big file.

    Of course there is - it's called "efficiency". You don't make every
    real task on a small system slower in order to support file sizes that
    will never be used with the system.



    On small systems, you use 32-bit for efficiency, and the same applies to
    older big systems.  (You have to get very old and very small to find
    types smaller than 32-bit used for file sizes.)

    I have in mind even smaller systems!


    What systems use file sizes that are smaller than "types smaller
    than 32-bit"?


    The OS /always/ imposes a limit, even if that limit is high.

    No OS should do that. There's no need.


    Again - efficiency. When all files that will ever be used with a given
    system will be far smaller than N, what is the point in making
    everything work vastly slower to support arbitrary sized integers?
    There are good reasons why OS's are written in languages like C, Ada,
    Rust or even assembly, rather than Python.



    As I said, the max file size is naturally a property of the formatted
    filesystem. That size would be set in the FS driver and made known to
    the outside world. An OS could use the driver's published size just as
    well and in the same way as I was suggesting that an application could
    use the driver's published size.


    The OS stands between the application and the filesystem.  Any file
    operation involves the application, the OS, and the filesystem.  The
    biggest file that can be handled is the minimum of the limits of all
    three parts.

    I have to disagree. All three parts can use the size determined by the filesystem.


    And how is that supposed to work, exactly?

    When the application wants to know the size of a file, it is going to
    call an OS function such as "get_file_size_for_name(filename)". The OS
    is going to take that filename, combine it with path information, and
    figure out what file system it is on. Maybe it finds an inode number
    for it. And then it calls the interface function implemented by the
    plugin for the filesystem, "get_file_size_for_inode(filesystem_handle,
    inode)".

    I guess in C, "filesystem_handle" will be passed as a void* pointer so
    that the plugin can see exactly which filesystem it is using.

    What do you suggest for the types for the return value of these functions?
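
    In concrete terms, the signatures would be something like this
    sketch, where some fixed-width type has to be chosen for the result;
    uint64_t here is only one possible answer to that question:

        #include <stdint.h>

        /* OS-level call: resolves the name, then delegates to the driver */
        uint64_t get_file_size_for_name(const char *filename);

        /* driver-level call behind it; the handle is opaque to the caller */
        uint64_t get_file_size_for_inode(void *filesystem_handle,
                                         uint64_t inode);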


    An OS written 'properly' should, IMO, be able to run indefinitely and
    should permit the addition of new filesystems which weren't even
    devised when the OS was started. Again, the max file size would need
    to come from the FS driver - which would be loadable and unloadable.


    You have a strange idea about what is "properly written" here.  For the
    vast majority of OS's that are in use today, having run-time pluggable
    file systems would be an insane idea.  OS's are not just limited to *nix
    and Windows.

    Why would having loadable filesystem drivers be "an insane idea"?


    Most OS's don't have filesystems at all. And if they /do/ have one, it
    will usually be dedicated. Remember, the vast majority of OS's in use
    today are almost certainly unknown to you - they are not PC systems, or
    even mobile phone systems, but embedded device systems. Supporting
    pluggable filesystems in your smart lightbulb, or car engine controller,
    or bluetooth-connected electric toothbrush /is/ insane.


    I am not suggesting anything strange; AISI this is basic engineering,
    nothing more.


    /Appropriate/ levels of abstraction and flexibility is basic
    engineering.  /Appropriate/ limits and appropriate sizes for data is
    basic engineering.  Inappropriate generalisations and extrapolations
    are not.


    Again, I have to disagree. The question is: What defines how large a
    file's offset can be?

    The answer is just as simple: Each filesystem has its own range of max
    sizes.


    Your concept of "basic engineering" is severely lacking here.

  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Aug 29 16:03:37 2021
    On 29/08/2021 15:54, Dmitry A. Kazakov wrote:
    On 2021-08-29 16:38, Bart wrote:

    And if I apply that gcc option to my raylib example, I get output like
    this:

        --  unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
        --  unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
        --  unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
        --  unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )

    Macros are what you usually must handle manually. No API should use
    them anyway. You cannot put a macro in a shared library.


    A lot of 'functions' exported by APIs (and also documented as such)
    are really macros, which either perform the task themselves or map
    to other functions.

    Another aspect of this translation is that when you run the tool, it
    will do a specific 'rendering' of the header, which may take into
    account the compiler used, whether it uses -m32 or -m64, compiler
    options where they affect the results, and things like -D macros.

    Sometimes, the C header may depend on, or may itself be created from,
    some synthesised code resulting from a configuration process.

    You have to be aware of exactly which version is being translated.
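
    A sketch of what one such 'rendering' difference looks like in
    practice (the sizes shown are those typical of gcc on x86 Linux):

        struct demo {
            long  n;   /* 4 bytes under -m32, 8 under -m64 */
            void *p;   /* likewise 4 vs 8 */
        };
        /* sizeof(struct demo) is 8 under -m32 but 16 under -m64, so one
           header yields two incompatible flattened layouts */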

  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Aug 29 17:38:13 2021
    On 2021-08-29 17:03, Bart wrote:
    On 29/08/2021 15:54, Dmitry A. Kazakov wrote:
    On 2021-08-29 16:38, Bart wrote:

    And if I apply that gcc option to my raylib example, I get output
    like this:

        --  unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
        --  unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
        --  unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
        --  unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )

    Macros are what you usually must handle manually. No API should use
    them anyway. You cannot put a macro in a shared library.

    A lot of 'functions' exported by APIs (and also documented as such)
    are really macros, which either perform the task themselves or map
    to other functions.

    They are duplicated by proper functions. As I said, macros have no
    place in an API that is supposed to be used outside C.

    Another aspect of this translation is that when you run the tool, it
    will do a specific 'rendering' of the header, which may take into
    account the compiler used, whether it uses -m32 or -m64, compiler
    options where they affect the results, and things like -D macros.

    Right, which is why manual translation might be necessary. Again, a
    well-designed C API does not overuse #ifdef stuff. If something must
    be made dependent on switches, it should be done in a separate
    dedicated header file with the varying types typedef-ed, yes, exactly
    the things you dislike: offs_t etc.
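
    A sketch of that kind of dedicated header (the macro name and the
    widths are invented for illustration):

        /* lib_config.h - the only place switch-dependent types live */
        #ifndef LIB_CONFIG_H
        #define LIB_CONFIG_H

        #ifdef LIB_LARGEFILE
        typedef long long offs_t;   /* 64-bit offsets */
        #else
        typedef long offs_t;        /* default width */
        #endif

        #endif /* LIB_CONFIG_H */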

    Many APIs that started as a mess were later pruned to be more
    friendly, at huge compatibility cost. E.g. OpenSSL from 1.0.0 to 1.0.1.

    C libraries are infamous for frustrating users with incompatible
    changes. The same GTK caused a small user rebellion when it switched
    from 2.x to 3.x. Many applications refused to migrate.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From James Harris@21:1/5 to David Brown on Sun Aug 29 18:21:36 2021
    On 29/08/2021 15:50, David Brown wrote:
    On 29/08/2021 13:47, James Harris wrote:
    On 24/08/2021 13:55, David Brown wrote:
    On 24/08/2021 09:27, James Harris wrote:
    On 23/08/2021 10:55, David Brown wrote:
    On 23/08/2021 11:04, James Harris wrote:
    On 22/08/2021 22:46, David Brown wrote:

    ...

    The limit of a file's size would naturally be defined by the filesystem
    on which it was stored or on which it was being written. Such a value
    would be known by, and a property of, the FS driver.


    "Proof by repetitive assertion" is not convincing.

    There's nothing to prove. It is simply factual (and well known) that
    different filesystems have different maximum file sizes. FAT12 has
    different limits from FAT32, for example. Ergo, the maximum permitted
    file size /is/ a natural property of the formatted filesystem. I guess
    that's repetitive again but I cannot imagine what you think would need
    to be added to that to establish the point.

    In fact, someone could release a new filesystem tomorrow which had
    higher limits than those supported by a certain OS today. Under what you
    have proposed the OS would need to be altered and recompiled to
    match. Therefore the max file size is not naturally a property of
    the OS.




    With big modern processors, 64-bit sizes here are efficient, and files
    are not going to hit that level in the near future.  (There are file
    systems in use that approach 2 ^ 64 bytes in size, but not individual
    files.)

    Don't forget that files can have holes. So one does not need to store
    (or even have capacity for) 2^64 bytes in order for a file's max offset
    to be 2^64 - 1.


    That is true - and an argument against claiming that the OS will not
    impose limits. Any normal (PC, server, etc.) OS today will have 64-bit
    file sizes.

    Some will.

    That might not be enough for specialised use in the future.

    Indeed.


    But more importantly, there's no need to prevent a small system from
    processing a big file.

    Of course there is - it's called "efficiency". You don't make every
    real task on a small system slower in order to support file sizes that
    will never be used with the system.

    Oh? How long do you think it would take to, say, add an offset to a
    multiword integer and how long do you think it would take for a device
    driver and I/O system to respond to a request to read or write the sector
    at that offset?

    Beware of premature optimisation.





    On small systems, you use 32-bit for efficiency, and the same applies
    to older big systems.  (You have to get very old and very small to
    find types smaller than 32-bit used for file sizes.)

    I have in mind even smaller systems!


    What systems use file sizes that are smaller than "types smaller
    than 32-bit"?

    I thought you were the microcontroller man!



    The OS /always/ imposes a limit, even if that limit is high.

    No OS should do that. There's no need.


    Again - efficiency. When all files that will ever be used with a given system will be far smaller than N, what is the point in making
    everything work vastly slower to support arbitrary sized integers?
    There are good reasons why OS's are written in languages like C, Ada,
    Rust or even assembly, rather than Python.

    I don't think you understand the proposal but see below.




    As I said, the max file size is naturally a property of the formatted
    filesystem. That size would be set in the FS driver and made known to
    the outside world. An OS could use the driver's published size just
    as well and in the same way as I was suggesting that an application
    could use the driver's published size.


    The OS stands between the application and the filesystem.  Any file
    operation involves the application, the OS, and the filesystem.  The
    biggest file that can be handled is the minimum of the limits of all
    three parts.

    I have to disagree. All three parts can use the size determined by the
    filesystem.


    And how is that supposed to work, exactly?

    I'll do my best to explain.


    When the application wants to know the size of a file, it is going to
    call an OS function such as "get_file_size_for_name(filename)". The OS
    is going to take that filename, combine it with path information, and
    figure out what file system it is on. Maybe it finds an inode number
    for it. And then it calls the interface function implemented by the
    plugin for the filesystem, "get_file_size_for_inode(filesystem_handle, inode)".

    OK. The filesystem driver (which would do most of the manipulation of
    offsets) could work with file offsets as integers of a size which was
    known when the driver was compiled. E.g. the driver might support
    48-bit offsets. If compiled for a 64-bit CPU it could manipulate them
    as ints. If compiled for a 16-bit CPU it could manipulate them as
    three successive ints.
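
    A sketch of that multiword arithmetic: a 48-bit offset held as three
    16-bit words, with explicit carries - a handful of cheap operations
    next to the cost of the device I/O itself:

        #include <stdint.h>

        typedef struct { uint16_t w[3]; } off48;   /* w[0] is the low word */

        off48 off48_add(off48 a, uint32_t delta) {
            uint32_t carry = delta;
            for (int i = 0; i < 3; i++) {
                uint32_t sum = (uint32_t)a.w[i] + (carry & 0xFFFFu);
                a.w[i] = (uint16_t)sum;        /* keep the low word */
                carry = (carry >> 16) + (sum >> 16);   /* propagate the rest */
            }
            return a;
        }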


    I guess in C, "filesystem_handle" will be passed as a void* pointer so
    that the plugin can see exactly which filesystem it is using.

    What do you suggest for the types for the return value of these functions?

    Something Dmitry said makes, I think, this easier to explain. In an OO
    language you could think of the returns from your functions as objects.
    The objects would be correctly sized for the filesystems to which they
    related. They could have different classes but they would all respond polymorphically to the same methods. The class of each would know the
    maximum permitted offset.

    An object holding a FAT12 offset would be at least 12 bits. An object
    holding a FAT32 offset would be at least 32 bits. An object holding a
    ZFS offset would be ... big enough for the ZFS volume to which it related.

    Incidentally, going back to your concerns about performance, it could
    well take longer to dispatch to a seek method than it would to execute
    it!!! What I have suggested is fast, not slow.
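
    Expressed in C terms, the shape would be something like this sketch
    (all names invented for illustration):

        #include <stdint.h>

        typedef struct offset offset;

        typedef struct offset_class {
            int    width_bits;               /* 12 for FAT12, 32 for FAT32, ... */
            void (*add)(offset *self, uint64_t delta);
            int  (*compare)(const offset *a, const offset *b);
        } offset_class;

        struct offset {
            const offset_class *cls;   /* one indirect call per method -
                                          the dispatch cost mentioned above */
            unsigned char rep[];       /* sized per class when allocated */
        };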



    An OS written 'properly' should, IMO, be able to run indefinitely and
    should permit the addition of new filesystems which weren't even
    devised when the OS was started. Again, the max file size would need
    to come from the FS driver - which would be loadable and unloadable.


    You have a strange idea about what is "properly written" here.  For
    the vast majority of OS's that are in use today, having run-time
    pluggable file systems would be an insane idea.  OS's are not just
    limited to *nix and Windows.

    Why would having loadable filesystem drivers be "an insane idea"?


    Most OS's don't have filesystems at all. And if they /do/ have one, it
    will usually be dedicated. Remember, the vast majority of OS's in use
    today are almost certainly unknown to you - they are not PC systems, or
    even mobile phone systems, but embedded device systems. Supporting
    pluggable filesystems in your smart lightbulb, or car engine controller,
    or bluetooth-connected electric toothbrush /is/ insane.

    That does not answer the question. Toothbrush OSes do not need frame
    buffers - but that does not make frame buffers insane.



    I am not suggesting anything strange; AISI this is basic engineering,
    nothing more.


    /Appropriate/ levels of abstraction and flexibility is basic
    engineering.  /Appropriate/ limits and appropriate sizes for data is
    basic engineering.  Inappropriate generalisations and extrapolations
    are not.


    Again, I have to disagree. The question is: What defines how large a
    file's offset can be?

    The answer is just as simple: Each filesystem has its own range of max
    sizes.


    Your concept of "basic engineering" is severely lacking here.

    Oh? What, specifically, is lacking?


    --
    James Harris

  • From James Harris@21:1/5 to Bart on Sun Aug 29 19:36:14 2021
    On 29/08/2021 19:24, Bart wrote:
    On 29/08/2021 18:21, James Harris wrote:
    On 29/08/2021 15:50, David Brown wrote:

    "Proof by repetitive assertion" is not convincing.

    There's nothing to prove. It is simply factual (and well known) that
    different filesystems have different maximum file sizes. FAT12 has
    different limits from FAT32, for example. Ergo, the maximum permitted
    file size /is/ a natural property of the formatted filesystem. I guess
    that's repetitive again but I cannot imagine what you think would need
    to be added to that to establish the point.

    In fact, someone could release a new filesystem tomorrow which had
    higher limits than those supported by a certain OS today. Under what
    you have proposed the OS would need to be altered and recompiled to
    match. Therefore the max file size is not naturally a property of the OS.

    What do you mean by a "file system"? Is it something that /has/ to be
    dealt with via an OS, so the OS's limitations matter, or could it be
    accessed via an API independently of OS?

    A filesystem is basically the stored data structure which is used to
    organise a volume so it can appear as a set of files, folders etc.

    https://en.wikipedia.org/wiki/File_system

    An OS isn't necessary at all for accessing a filesystem but if you want multiple tasks to be able to access the filesystem then an OS can
    coordinate them and protect one task from another. Where an OS is
    present it will normally require that all filesystem accesses go via the
    OS rather than direct to the FS driver.






    With big modern processors, 64-bit sizes here are efficient, and
    files are not going to hit that level in the near future.  (There
    are file systems in use that approach 2 ^ 64 bytes in size, but not
    individual files.)

    Don't forget that files can have holes.

    (If that was ever the case, then I /have/ forgotten!)

    https://en.wikipedia.org/wiki/Sparse_file
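
    A sketch of how such a hole arises: seek far past end-of-file and
    write one byte, and (on filesystems that support sparse files) only
    the final block is actually stored:

        #include <stdio.h>

        int main(void) {
            FILE *f = fopen("sparse.bin", "wb");
            if (!f) return 1;
            fseek(f, 1000000000L, SEEK_SET);   /* logical offset ~1 GB */
            fputc('x', f);                     /* file size: ~1 GB + 1 byte */
            fclose(f);
            return 0;
        }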


    --
    James Harris

  • From Bart@21:1/5 to James Harris on Sun Aug 29 19:24:22 2021
    On 29/08/2021 18:21, James Harris wrote:
    On 29/08/2021 15:50, David Brown wrote:

    "Proof by repetitive assertion" is not convincing.

    There's nothing to prove. It is simply factual (and well known) that
    different filesystems have different maximum file sizes. FAT12 has
    different limits from FAT32, for example. Ergo, the maximum permitted
    file size /is/ a natural property of the formatted filesystem. I guess
    that's repetitive again but I cannot imagine what you think would need
    to be added to that to establish the point.

    In fact, someone could release a new filesystem tomorrow which had
    higher limits than those supported by a certain OS today. Under what
    you have proposed the OS would need to be altered and recompiled to
    match. Therefore the max file size is not naturally a property of
    the OS.

    What do you mean by a "file system"? Is it something that /has/ to be
    dealt with via an OS, so the OS's limitations matter, or could it be
    accessed via an API independently of OS?





    With big modern processors, 64-bit sizes here are efficient, and
    files are not going to hit that level in the near future.  (There
    are file systems in use that approach 2 ^ 64 bytes in size, but not
    individual files.)

    Don't forget that files can have holes.

    (If that was ever the case, then I /have/ forgotten!)

    That might not be enough for specialised use in the future.

    Indeed.


    But more importantly, there's no need to prevent a small system from
    processing a big file.

    Of course there is - it's called "efficiency".  You don't make every
    real task on a small system slower in order to support file sizes that
    will never be used with the system.

    Oh? How long do you think it would take to, say, add an offset to a
    multiword integer and how long do you think it would take for a device
    driver and I/O system to respond to a request to read or write the sector
    at that offset?

    It sounds more work than just deciding to use a fixed 64-bit size. This
    is after all for file operations that are normally magnitudes slower
    than memory accesses.

    And I really wouldn't worry about sizes bigger than 64 bits, if even
    major, corporate OS developers are not worried about them.

    I only found out last week that some of my file routines were capped at
    32 bits, so a full 64 bits will allow me to work with files 4 billion
    times bigger than the largest file I can use at present.

  • From James Harris@21:1/5 to Bart on Sun Aug 29 19:47:57 2021
    On 29/08/2021 15:38, Bart wrote:
    On 29/08/2021 15:19, Dmitry A. Kazakov wrote:
    On 2021-08-29 15:58, Bart wrote:

    And I guess they've already been applied to popular libraries to
    provide language-neutral versions of those APIs, complete with all
    the named enums, and converted the macros needed to be able to use
    the library.

    Except I can't find anything like that.

    Really?

    1. It is even integrated in GCC. See the

        -fdump-ada-spec

    switch.

    2. c2ada:

        http://c2ada.sourceforge.net/c2ada.html


    No, this is not what James had in mind, which was writing little C
    scripts that applied -E to preprocess code, and looking for specific
    struct definitions, I think using grep or something.

    Not quite. You identified more than one problem. One was to determine
    a target's struct layouts (e.g. for struct stat). For that, I
    suggested that on each target which has layouts defined in C structs
    you run the C preprocessor, parse its output, and add the info you
    need into a configuration file for the target. Then you would be able
    to use the configuration file to build any programs in your own
    language for the same target - including getting the struct stat
    field offsets and types to match.


    --
    James Harris

  • From James Harris@21:1/5 to Dmitry A. Kazakov on Sun Aug 29 20:14:24 2021
    On 29/08/2021 11:31, Dmitry A. Kazakov wrote:
    On 2021-08-29 11:51, James Harris wrote:
    On 29/08/2021 09:38, Dmitry A. Kazakov wrote:
    On 2021-08-29 10:16, James Harris wrote:
    On 24/08/2021 09:34, Dmitry A. Kazakov wrote:

    ...

    Your array is allocated in the user space. Do you understand that?

    There could be file offsets in both user and kernel space. Why? I
    cannot see what you are driving at.

    Passing anything from the user-space is extremely expensive.

    Well, let's assume that's true. If a call such as

    seek(file, offset)

    passes the offset to the kernel, the additional cost of 'offset'
    being an object rather than an integer will be swamped by that of
    the call itself. Good point, Dmitry! ;-)

    ...

    Yes, the FS's offset would be a scalar.

    Then why pass it by reference?

    I'd pass the offset by reference because its size would not be known
    at compile time.

    See, it is neither scalar nor statically bound = dynamic.

    You want to argue about semantics, again?!

    ...

    It would be /implemented/ as an array but it would be /semantically/
    an object. You could say that a 4-byte integer was implemented as an
    array of four bytes, if you wanted, and it might need to be
    implemented that way on a small CPU but it would still be semantically
    an integer.

    Again, a scalar is not an array. Neither implements the other; they
    are already implementations.


    For what we have been discussing an array of three integers is no
    different from a record of three integers; nor is either different from
    an object which holds three integers. They are all semantically ONE
    object which holds THREE integers. You can call records and objects
    arrays rather than scalars if you want to but they would be used and
    passed around as scalars.


    --
    James Harris

  • From James Harris@21:1/5 to Bart on Sun Aug 29 20:02:08 2021
    On 29/08/2021 15:27, Bart wrote:
    On 29/08/2021 14:58, Bart wrote:
    On 29/08/2021 14:28, James Harris wrote:

    That won't tell me the sizes or offsets. For that I need a different
    program that applies sizeof() and offsetof() to each member, and
    sizeof(S3).

    However, that still won't tell me the actual /types/ of the fields.
    As well as the width, if I need to access the individual fields, I
    will need to know whether it's a signed int, unsigned int, or float.

    ...

    If you were carrying out the steps manually I could understand it.
    But that would be a nightmare. Please tell me you are not carrying
    out any of the steps manually!!! All steps should be doable by a
    program which will run on each intended target in about 0.01 seconds,
    shouldn't they?

    ...

    BTW, I have tried to create my own version of such a tool, which
    actually is not as simple as you are trying to make out. Or maybe I'm
    just too thick to be able to do it.

    It's built as an extension to a C compiler, but it is MY C compiler, so doesn't use any external tools. That means it's limited to the
    capabilities of that compiler, which only supports a C subset.

    Here's what happens when I apply it to this C header:

      https://github.com/sal55/langs/blob/master/raylib.h

    You're threatening to bifurcate the discussion again! :-(

    On the topic we were discussing, if you have a C parser why not use
    it (along with a C preprocessor) to extract the info you need - such
    as type definitions and structure offsets - from each target
    environment's C
    header files? That's what I thought you wanted to do before.


    --
    James Harris

  • From Dmitry A. Kazakov@21:1/5 to James Harris on Sun Aug 29 21:32:13 2021
    On 2021-08-29 21:14, James Harris wrote:
    On 29/08/2021 11:31, Dmitry A. Kazakov wrote:

    Again, a scalar is not an array. Neither implements the other; they
    are already implementations.

    For what we have been discussing an array of three integers is no
    different from a record of three integers;

    Of course they are different. You seem to confuse a type with its
    machine representation. Many types may have similar representations;
    that does not make them the same.

    nor is either different from
    an object which holds three integers. They are all semantically ONE
    object which holds THREE integers.

    They are not, because you stated that it is a composite type.

    You can call records and objects
    arrays rather than scalars if you want to but they would be used and
    passed around as scalars.

    I do not know what this means.

    You can pass a value either by value or by reference. By-value passing
    could be done over a machine register.

    Scalar is a type property, opposite to composite type.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to James Harris on Sun Aug 29 20:17:35 2021
    On 29/08/2021 20:02, James Harris wrote:
    On 29/08/2021 15:27, Bart wrote:
    On 29/08/2021 14:58, Bart wrote:
    On 29/08/2021 14:28, James Harris wrote:

    That won't tell me the sizes or offsets. For that I need a
    different program that applies sizeof() and offsetof() to each
    member, and sizeof(S3).

    However, that still won't tell me the actual /types/ of the fields.
    As well as the width, if I need to access the individual fields, I
    will need to know whether it's a signed int, unsigned int, or float.

    ...

    If you were carrying out the steps manually I could understand it.
    But that would be a nightmare. Please tell me you are not carrying
    out any of the steps manually!!! All steps should be doable by a
    program which will run on each intended target in about 0.01
    seconds, shouldn't they?

    ...

    BTW, I have tried to create my own version of such a tool, which
    actually is not as simple as you are trying to make out. Or maybe I'm
    just too thick to be able to do it.

    It's built as an extension to a C compiler, but it is MY C compiler,
    so doesn't use any external tools. That means it's limited to the
    capabilities of that compiler, which only supports a C subset.

    Here's what happens when I apply it to this C header:

       https://github.com/sal55/langs/blob/master/raylib.h

    You're threatening to bifurcate the discussion again! :-(

    On the topic we were discussing, if you have a C parser why not use
    it (along with a C preprocessor) to extract the info you need - such
    as type definitions and structure offsets - from each target
    environment's C
    header files? That's what I thought you wanted to do before.



    Did you read the rest of my post? A few lines down I linked to a file
    which was generated by my compiler. But it cannot do a completely
    automatic translation.

    The same problem was encountered by the C to Ada tools that DAK posted.

    And while the C compiler is intended to deal with non-system headers of
    any libraries, very often they will be full of things that cause
    problems. For example, an #if/#elif chain that tests for a certain set
    of compilers, of which mine won't be one.

    As for system headers, it's not practical to use those of other
    compilers (as they will be full of implementation-specific features); I
    have to construct my own. Including the famous 'struct stat', by delving
    deep into that rabbit-hole.

  • From James Harris@21:1/5 to Bart on Sun Aug 29 21:17:27 2021
    On 29/08/2021 20:17, Bart wrote:
    On 29/08/2021 20:02, James Harris wrote:

    ...

    On the topic we were discussing if you have a C parser why not use it
    (along with a C preprocessor) to extract the info you need - such a
    type definitions and structure offsets - from each target
    environment's C header files? That's what I thought you wanted to do
    before.



    Did you read the rest of my post? A few lines down I linked to a file
    which was generated by my compiler. But it cannot do a completely
    automatic translation.

    Yes, I did. I looked at the two files you linked and even the definition
    of the macros which did not convert.

    But translating your own sources is a new topic and was NOT what we were talking about.

    ...

    As for system headers, it's not practical to use those of other
    compilers (as they will be full of implementation-specific features);

    That was the point: other environments do NOT have configuration files
    but they often DO have C headers. To determine the configuration for
    those environments you need to get the info from the C headers. And
    that's best done by

    Cpreprocessor < header | bart_parse env.conf

    where bart_parse is your program which parses the output from the C preprocessor and updates env.conf with the required info.


    I
    have to construct my own. Including the famous 'struct stat', by delving
    deep into that rabbit-hole.

    If you mean manually then, no, no, please don't keep doing that! :-(


    --
    James Harris

  • From Bart@21:1/5 to James Harris on Sun Aug 29 22:10:47 2021
    On 29/08/2021 21:17, James Harris wrote:
    On 29/08/2021 20:17, Bart wrote:
    On 29/08/2021 20:02, James Harris wrote:

    ...

    On the topic we were discussing if you have a C parser why not use it
    (along with a C preprocessor) to extract the info you need - such a
    type definitions and structure offsets - from each target
    environment's C header files? That's what I thought you wanted to do
    before.



    Did you read the rest of my post? A few lines down I linked to a file
    which was generated by my compiler. But it cannot do a completely
    automatic translation.

    Yes, I did. I looked at the two files you linked and even the definition
    of the macros which did not convert.

    But translating your own sources is a new topic and was NOT what we were talking about.

    Well, you asked why I didn't use my parser to extract that info, but
    how do you think that second file got generated?!

    But it was a huge undertaking; it's incomplete; it's buggy; it only
    works on headers that my compiler can process correctly; and it
    doesn't work for system headers, which I need to convert, by far more
    painstaking processes (see below), into header files that suit my
    compiler.

    ...

    As for system headers, it's not practical to use those of other
    compilers (as they will be full of implementation-specific features);

    That was the point: other environments do NOT have configuration files
    but they often DO have C headers. To determine the configuration for
    those environments you need to get the info from the C headers. And
    that's best done by

      Cpreprocessor < header | bart_parse env.conf

    where bart_parse is your program which parses the output from the C preprocessor and updates env.conf with the required info.

    For this purpose (creating C system header files to go with your own C compiler), you need to end up with an actual set of header files.

    Which existing compilers do you look at for information? Ones like gcc
    have incredibly elaborate headers, full of compiler-specific built-ins
    and attributes and predefined macros.

    If I take stdarg.h as an example, my version of that has this; it
    defines one type and 5 macros:

    --------------------------------------------------------------
    /* Header stdarg.h */

    #ifndef $STDARG
    #define $STDARG

    typedef char * va_list;
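    /* every variadic argument is assumed to occupy one 8-byte stack
       slot (as in the Win64 convention) - hence the fixed +8 stride */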
    #define va_start(ap,v) ap=((va_list)&v+8)
    #define va_arg(ap,t) *(t*)((ap+=8)-8)
    #define va_copy(dest,src) (dest=src)
    #define va_end(ap) ( ap = (va_list)0 )
    #endif
    --------------------------------------------------------------


    The one used by tdm/gcc uses those 3 headers shown below (about 300
    lines); I haven't included the 600-line _mingw.h which has yet more
    includes.

    How do you get from this to the above? Certainly not by any automatic
    process!



    -----------------------------------------------------------------------------------

    ********************* stdarg.h
    /* Copyright (C) 1989, 1997, 1998, 1999, 2000 Free Software Foundation, Inc.

    This file is part of GCC.

    GCC is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2, or (at your option)
    any later version.

    GCC is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with GCC; see the file COPYING. If not, write to
    the Free Software Foundation, 51 Franklin Street, Fifth Floor,
    Boston, MA 02110-1301, USA. */

    /* As a special exception, if you include this header file into source
    files compiled by GCC, this header file does not by itself cause
    the resulting executable to be covered by the GNU General Public
    License. This exception does not however invalidate any other
    reasons why the executable file might be covered by the GNU General
    Public License. */

    /*
    * ISO C Standard: 7.15 Variable arguments <stdarg.h>
    */

    #if defined(__GNUC__)

    #ifndef _STDARG_H
    #ifndef _ANSI_STDARG_H_
    #ifndef __need___va_list
    #define _STDARG_H
    #define _ANSI_STDARG_H_
    #endif /* not __need___va_list */
    #undef __need___va_list

    /* Define __gnuc_va_list. */

    #ifndef __GNUC_VA_LIST
    #define __GNUC_VA_LIST
    typedef __builtin_va_list __gnuc_va_list;
    #endif

    /* Define the standard macros for the user,
    if this invocation was from the user program. */
    #ifdef _STDARG_H

    #define va_start(v,l) __builtin_va_start(v,l)
    #define va_end(v) __builtin_va_end(v)
    #define va_arg(v,l) __builtin_va_arg(v,l)
    #if !defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || defined(__GXX_EXPERIMENTAL_CXX0X__)
    #define va_copy(d,s) __builtin_va_copy(d,s)
    #endif
    #define __va_copy(d,s) __builtin_va_copy(d,s)

    /* Define va_list, if desired, from __gnuc_va_list. */
    /* We deliberately do not define va_list when called from
    stdio.h, because ANSI C says that stdio.h is not supposed to define
    va_list. stdio.h needs to have access to that data type,
    but must not use that name. It should use the name __gnuc_va_list,
    which is safe because it is reserved for the implementation. */

    #ifdef _HIDDEN_VA_LIST /* On OSF1, this means varargs.h is
    "half-loaded". */
    #undef _VA_LIST
    #endif

    #ifdef _BSD_VA_LIST
    #undef _BSD_VA_LIST
    #endif

    #if defined(__svr4__) || (defined(_SCO_DS) && !defined(__VA_LIST))
    /* SVR4.2 uses _VA_LIST for an internal alias for va_list,
    so we must avoid testing it and setting it here.
    SVR4 uses _VA_LIST as a flag in stdarg.h, but we should
    have no conflict with that. */
    #ifndef _VA_LIST_
    #define _VA_LIST_
    #ifdef __i860__
    #ifndef _VA_LIST
    #define _VA_LIST va_list
    #endif
    #endif /* __i860__ */
    typedef __gnuc_va_list va_list;
    #ifdef _SCO_DS
    #define __VA_LIST
    #endif
    #endif /* _VA_LIST_ */
    #else /* not __svr4__ || _SCO_DS */

    /* The macro _VA_LIST_ is the same thing used by this file in Ultrix.
    But on BSD NET2 we must not test or define or undef it.
    (Note that the comments in NET 2's ansi.h
    are incorrect for _VA_LIST_--see stdio.h!) */
    #if !defined (_VA_LIST_) || defined (__BSD_NET2__) || defined (____386BSD____) || defined (__bsdi__) || defined (__sequent__) || defined (__FreeBSD__) || defined(WINNT)
    /* The macro _VA_LIST_DEFINED is used in Windows NT 3.5 */
    #ifndef _VA_LIST_DEFINED
    /* The macro _VA_LIST is used in SCO Unix 3.2. */
    #ifndef _VA_LIST
    /* The macro _VA_LIST_T_H is used in the Bull dpx2 */
    #ifndef _VA_LIST_T_H
    /* The macro __va_list__ is used by BeOS. */
    #ifndef __va_list__
    typedef __gnuc_va_list va_list;
    #endif /* not __va_list__ */
    #endif /* not _VA_LIST_T_H */
    #endif /* not _VA_LIST */
    #endif /* not _VA_LIST_DEFINED */
    #if !(defined (__BSD_NET2__) || defined (____386BSD____) || defined (__bsdi__) || defined (__sequent__) || defined (__FreeBSD__))
    #define _VA_LIST_
    #endif
    #ifndef _VA_LIST
    #define _VA_LIST
    #endif
    #ifndef _VA_LIST_DEFINED
    #define _VA_LIST_DEFINED
    #endif
    #ifndef _VA_LIST_T_H
    #define _VA_LIST_T_H
    #endif
    #ifndef __va_list__
    #define __va_list__
    #endif

    #endif /* not _VA_LIST_, except on certain systems */

    #endif /* not __svr4__ */

    #endif /* _STDARG_H */

    #endif /* not _ANSI_STDARG_H_ */
    #endif /* not _STDARG_H */

    #endif /*__GNUC__ */

    /* include mingw stuff */
    #include <_mingw_stdarg.h>


    ********************* _mingw_stdarg.h

    /**
    * This file has no copyright assigned and is placed in the Public Domain.
    * This file is part of the mingw-w64 runtime package.
    * No warranty is given; refer to the file DISCLAIMER.PD within this
    package.
    */

    #ifndef _INC_STDARG
    #define _INC_STDARG

    #ifndef _WIN32
    #error Only Win32 target is supported!
    #endif

    #include <vadefs.h>

    #ifndef va_start
    #define va_start _crt_va_start
    #endif

    #ifndef va_arg
    #define va_arg _crt_va_arg
    #endif

    #ifndef va_end
    #define va_end _crt_va_end
    #endif

    #ifndef __va_copy
    #define __va_copy _crt_va_copy
    #endif

    #if !defined(va_copy) && \
    (!defined(__STRICT_ANSI__) || __STDC_VERSION__ + 0 >= 199900L || defined(__GXX_EXPERIMENTAL_CXX0X__))
    #define va_copy _crt_va_copy
    #endif

    #endif /* not _INC_STDARG */


    ********************* vadefs.h

    /**
    * This file has no copyright assigned and is placed in the Public Domain.
    * This file is part of the mingw-w64 runtime package.
    * No warranty is given; refer to the file DISCLAIMER.PD within this
    package.
    */
    #ifndef _INC_VADEFS
    #define _INC_VADEFS

    #include <_mingw.h>

    #ifndef __WIDL__
    #undef _CRT_PACKING
    #define _CRT_PACKING 8
    #pragma pack(push,_CRT_PACKING)
    #endif

    #ifdef __cplusplus
    extern "C" {
    #endif

    #if defined (__GNUC__)
    #ifndef __GNUC_VA_LIST
    #define __GNUC_VA_LIST
    typedef __builtin_va_list __gnuc_va_list;
    #endif
    #endif /* __GNUC__ */

    #ifndef _VA_LIST_DEFINED /* if stdargs.h didn't define it */
    #define _VA_LIST_DEFINED
    #if defined(__GNUC__)
    typedef __gnuc_va_list va_list;
    #elif defined(_MSC_VER)
    typedef char * va_list;
    #elif !defined(__WIDL__)
    #error VARARGS not implemented for this compiler
    #endif
    #endif /* _VA_LIST_DEFINED */

    #ifdef __cplusplus
    #define _ADDRESSOF(v) (&reinterpret_cast<const char &>(v))
    #else
    #define _ADDRESSOF(v) (&(v))
    #endif

    #if defined (__GNUC__)
    /* Use GCC builtins */

    #define _crt_va_start(v,l) __builtin_va_start(v,l)
    #define _crt_va_arg(v,l) __builtin_va_arg(v,l)
    #define _crt_va_end(v) __builtin_va_end(v)
    #define _crt_va_copy(d,s) __builtin_va_copy(d,s)

    #elif defined(_MSC_VER)
    /* MSVC specific */

    #if defined(_M_IA64)
    #define _VA_ALIGN 8
    #define _SLOTSIZEOF(t) ((sizeof(t) + _VA_ALIGN - 1) & ~(_VA_ALIGN - 1))
    #define _VA_STRUCT_ALIGN 16
    #define _ALIGNOF(ap) ((((ap)+_VA_STRUCT_ALIGN - 1) & ~(_VA_STRUCT_ALIGN -1)) - (ap))
    #define _APALIGN(t,ap) (__alignof(t) > 8 ? _ALIGNOF((uintptr_t) ap) : 0)
    #else
    #define _SLOTSIZEOF(t) (sizeof(t))
    #define _APALIGN(t,ap) (__alignof(t))
    #endif

    #if defined(_M_IX86)

    #define _INTSIZEOF(n) ((sizeof(n) + sizeof(int) - 1) & ~(sizeof(int) - 1))
    #define _crt_va_start(v,l) ((v) = (va_list)_ADDRESSOF(l) + _INTSIZEOF(l))
    #define _crt_va_arg(v,l) (*(l *)(((v) += _INTSIZEOF(l)) - _INTSIZEOF(l)))
    #define _crt_va_end(v) ((v) = (va_list)0)
    #define _crt_va_copy(d,s) ((d) = (s))

    #elif defined(_M_AMD64)

    #define _PTRSIZEOF(n) ((sizeof(n) + sizeof(void*) - 1) & ~(sizeof(void*) - 1))
    #define _ISSTRUCT(t) ((sizeof(t) > sizeof(void*)) || (sizeof(t) & (sizeof(t) - 1)) != 0)
    #define _crt_va_start(v,l) ((v) = (va_list)_ADDRESSOF(l) + _PTRSIZEOF(l))
    #define _crt_va_arg(v,t) _ISSTRUCT(t) ? \
    (**(t**)(((v) += sizeof(void*)) - sizeof(void*))) : \
    ( *(t *)(((v) += sizeof(void*)) - sizeof(void*)))
    #define _crt_va_end(v) ((v) = (va_list)0)
    #define _crt_va_copy(d,s) ((d) = (s))

    #elif defined(_M_IA64)

    #error VARARGS not implemented for IA64

    #else

    #error VARARGS not implemented for this TARGET

    #endif /* cpu ifdefs */

    #endif /* compiler ifdefs */

    #ifdef __cplusplus
    }
    #endif

    #ifndef __WIDL__
    #pragma pack(pop)
    #endif

    #endif /* _INC_VADEFS */

  • From David Brown@21:1/5 to James Harris on Sun Aug 29 23:24:30 2021
    On 29/08/2021 19:21, James Harris wrote:
    On 29/08/2021 15:50, David Brown wrote:
    On 29/08/2021 13:47, James Harris wrote:
    On 24/08/2021 13:55, David Brown wrote:
    On 24/08/2021 09:27, James Harris wrote:
    On 23/08/2021 10:55, David Brown wrote:
    On 23/08/2021 11:04, James Harris wrote:
    On 22/08/2021 22:46, David Brown wrote:

    ...

    The limit of a file's size would naturally be defined by the filesystem
    on which it was stored or on which it was being written. Such a value
    would be known by, and a property of, the FS driver.


    "Proof by repetitive assertion" is not convincing.

    There's nothing to prove. It is simply factual (and well known) that
    different filesystems have different maximum file sizes. FAT12 has
    different limits from FAT32, for example. Ergo, the maximum permitted
    file size /is/ a natural property of the formatted filesystem. I guess
    that's repetitive again but I cannot imagine what you think would need
    to be added to that to establish the point.

    Of course different filesystems have different maximum file sizes - no
    one has disputed that! All that is in dispute is your silly idea that
    the /OS/ does not have limits here.

    The filesystems have limits, the OS has limits, and the application has
    limits. (I'm considering libraries as either part of the OS or part of
    the application here.) The biggest file you can work with is limited by
    the minimum of those three limits.


    In fact, someone could release a new filesystem tomorrow which had
    higher limits than those supported by a certain OS today. Under what you
    have proposed the OS would need to be altered and recompiled to match.

    You make it sound like /I/ invented this limitation, rather than the
    simple fact of it applying to every OS in existence.

    Therefore the max file size is not naturally a property of the OS.

    Then every OS is unnatural to you.





    With big modern processors, 64-bit sizes here are efficient, and
    files are not going to hit that level in the near future.  (There
    are file systems in use that approach 2 ^ 64 bytes in size, but not
    individual files.)

    Don't forget that files can have holes. So one does not need to store
    (or even have capacity for) 2^64 bytes in order for a file's max offset
    to be 2^64 - 1.


    That is true - and an argument against claiming that the OS will not
    impose limits.  Any normal (PC, server, etc.) OS today will have 64-bit
    file sizes.

    Some will.

    All will, AFAIK. Feel free to give examples to the contrary. (There
    are OS's and OS configurations that will be limited to 32-bit file
    sizes, but these would not be current versions for current PC's or servers.)


    That might not be enough for specialised use in the future.

    Indeed.


    Note that a 2 ^ 64 byte file would fill approximately a million top-size
    modern high density hard drives. One day, perhaps, we'll store that on
    a little "holocube". But it will be a while before such file sizes
    would be practical or useful.


    But more importantly, there's no need to prevent a small system from
    processing a big file.

    Of course there is - it's called "efficiency".  You don't make every
    real task on a small system slower in order to support file sizes that
    will never be used with the system.

    Oh? How long do you think it would take to, say, add an offset to a
    multiword integer and how long do you think it would take for a device
    driver and io system to respond to a request to read or write the sector
    at that offset?


    The types you use for your file sizes and offsets are used all over the
    place in code - in the application, the OS, and the file system, along
    with every library in between. I am not concerned about the efficiency
    when dealing with 2 ^ 64 byte files - for a start, these don't exist,
    and if they did they'd be slow. I am talking about the efficiency of
    dealing with /real/ files, almost all of which fit easily within a 32
    bit size. Reading these into memory does not take long with a modern
    NVMe disk - and far less with the fastest choices (NVDIMM). Writing
    speed doesn't matter at all, because it is cached. But absurdly using unlimited precision integers to handle offsets makes everything bigger
    and slower, as well as vastly more complicated for the code.

    Beware of premature optimisation.


    Using a fixed, large size for file sizes is not /premature/
    optimisation. Using unlimited sizes is an example of ridiculous over-engineering.

    Now, if you had said that limits make sense, but they should be 128-bit integers not 64-bit integers, you might have had a point. It is
    conceivable that humans will have occasional use of files bigger than 2
    ^ 64 bytes one day - but 2 ^ 128 is getting close to the number of atoms
    in the earth (about 2 ^ 166). Or maybe you would prefer 256-bit integer
    sizes, which covers all the atoms in the universe to within a factor of
    a million or so.

    Those are all still limits - hard limits. 64-bit is more than enough
    for the far foreseeable future. 128-bit is more than enough for an absurdly extrapolated science-fiction future. I'm sure that with a bit of
    relativity, quantum mechanics and theoretical information theory, you'd
    be able to prove that 256-bit is more than enough for any physically
    realisable file size - by the time you had written the end of the file,
    the expanding universe would have moved the start of the file beyond
    your reach.

    Yet those are not enough for you - you want the OS to support
    /unlimited/ sizes!





    On small systems, you use 32-bit for efficiency, and the same
    applies to
    older big systems.  (You have to get very old and very small to find
    types smaller than 32-bit used for file sizes.)

    I have in mind even smaller systems!


    What systems use file sizes that are smaller than "types smaller than
    32-bit"?

    I thought you were the microcontroller man!

    I am. What type would you use for file sizes here? When 32-bit is
    bigger than you want, the next smaller standard size is 16-bit.
    Certainly there have been OS's that used 16-bit integers for file sizes
    - I would expect it to have been fairly common in the days of 8-bit home computers (C64, ZX Spectrum, etc.). I am sure they would also have been
    useful when files came on punched cards. But you want to go even
    smaller. Tell me, what use do you see for a system supporting 8-bit
    file sizes?




    The OS /always/ imposes a limit, even if that limit is high.

    No OS should do that. There's no need.


    Again - efficiency.  When all files that will ever be used with a given
    system will be far smaller than N, what is the point in making
    everything work vastly slower to support arbitrary sized integers?
    There are good reasons why OS's are written in languages like C, Ada,
    Rust or even assembly, rather than Python.

    I don't think you understand the proposal but see below.


    You asked for unlimited sizes. That means arbitrary precision integers.




    As I said, the max file size is naturally a property of the formatted
    filesystem. That size would be set in the FS driver and made known to
    the outside world. An OS could use the driver's published size just as
    well and in the same way as I was suggesting that an application could
    use the driver's published size.


    The OS stands between the application and the filesystem.  Any file
    operation involves the application, the OS, and the filesystem.  The
    biggest file that can be handled is the minimum of the limits of all
    three parts.

    I have to disagree. All three parts can use the size determined by the
    filesystem.


    And how is that supposed to work, exactly?

    I'll do my best to explain.


    When the application wants to know the size of a file, it is going to
    call an OS function such as "get_file_size_for_name(filename)".  The OS
    is going to take that filename, combine it with path information, and
    figure out what file system it is on.  Maybe it finds an inode number
    for it.  And then it calls the interface function implemented by the
    plugin for the filesystem, "get_file_size_for_inode(filesystem_handle,
    inode)".

    OK. The filesystem driver (which would do most of the manipulation of
    offsets) could work with file offsets as integers of a size which was
    known when the driver was compiled. E.g. the driver might support 48-bit
    offsets. If compiled for a 64-bit CPU it could manipulate them as ints.
    If compiled for a 16-bit CPU it could manipulate them as three
    successive ints, as sketched just below.
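
    A minimal sketch of what I mean, with hypothetical names (off48 and
    off48_add are mine, not taken from any real driver): a 48-bit offset
    held as three successive 16-bit ints, least significant first, with
    adds propagating the carries.

    #include <stdint.h>

    typedef struct { uint16_t w[3]; } off48;

    static void off48_add(off48 *o, uint32_t delta)
    {
        uint32_t carry = delta;
        for (int i = 0; i < 3 && carry; i++) {
            uint32_t s = (uint32_t)o->w[i] + (carry & 0xFFFFu);
            o->w[i] = (uint16_t)s;
            carry = (carry >> 16) + (s >> 16);  /* at most 1 from each add */
        }
        /* anything carried past w[2] would exceed the 48-bit limit */
    }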


    The filesystem driver has a fixed limit to the size of the files it
    supports. That is obvious and not in contention. No one is concerned
    about limits in the filesystem driver.

    The issue is the /OS/, which is compiled independently of your plugin filesystem drivers. How is the OS supposed to handle the file size
    limits of the plugin filesystem? One option is that it has a fixed size
    itself that is at least as big as the limit for any plugin filesystem
    (this is how it works in real systems), but that means the OS puts a
    limit on the size the plugin can use, which you don't like. The other
    option is that it has to be able to handle arbitrarily large integer
    sizes determined at run-time by the plugin - arbitrary precision integers.



    I guess in C, "filesystem_handle" will be passed as a void* pointer so
    that the plugin can see exactly which filesystem it is using.

    What do you suggest for the types for the return value of these
    functions?

    Something Dmitry said makes this, I think, easier to explain. In an OO
    language you could think of the returns from your functions as objects.
    The objects would be correctly sized for the filesystems to which they
    related. They could have different classes but they would all respond
    polymorphically to the same methods. The class of each would know the
    maximum permitted offset.
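
    A rough sketch of the idea in plain C terms (all the names here are
    hypothetical): each filesystem driver publishes a "class" describing
    its own offset objects, and the OS only ever touches an offset through
    the class's methods, never through its representation.

    #include <stddef.h>
    #include <stdint.h>

    struct offset_class {
        size_t size;                               /* bytes per offset object */
        int (*add)(void *offset, uint64_t delta);  /* 0 = ok, -1 = past max   */
        int (*compare)(const void *a, const void *b);
    };

    /* A FAT32 driver's offsets might be 32-bit ints; a ZFS driver's
       something much wider. The OS allocates 'size' bytes for each
       offset and dispatches through the method table. */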


    Right. Arbitrary precision integers. They are written in a nice OO
    language so that the language hides the ugly mechanics of allocations, deallocations, memory management, etc., and gives you operator overloads instead of ugly prefix function calls or macros for everything.

    But they are still arbitrary precision integers if you refuse to allow
    limits. And they are still massively less efficient than a simple large integer of fixed size.


    An object holding a FAT12 offset would be at least 12 bits. An object
    holding a FAT32 offset would be at least 32 bits. An object holding a
    ZFS offset would be ... big enough for the ZFS volume to which it related.

    Incidentally, going back to your concerns about performance, it could
    well take longer to dispatch to a seek method than it would to execute
    it!!! What I have suggested is fast, not slow.


    So what you are saying is that because most current disks are slow it
    is fine to have an absurdly over-engineered and inefficient system of
    file sizes and offsets? And you want that system because in the future
    you imagine that file sizes could grow indefinitely? And yet you are
    not considering that /current/ fast storage systems are already as fast
    as main memory, and those fast systems are going to be mainstream far
    sooner than 2 ^ 64 byte files will ever be needed.



    An OS written 'properly' should, IMO, be able to run indefinitely and
    should permit the addition of new filesystems which weren't even devised
    when the OS was started. Again, the max file size would need to come
    from the FS driver - which would be loadable and unloadable.


    You have a strange idea about what is "properly written" here.  For the
    vast majority of OS's that are in use today, having run-time pluggable
    file systems would be an insane idea.  OS's are not just limited to *nix
    and Windows.

    Why would having loadable filesystem drivers be "an insane idea"?


    Most OS's don't have filesystems at all.  And if they /do/ have one, it
    will usually be dedicated.  Remember, the vast majority of OS's in use
    today are almost certainly unknown to you - they are not PC systems, or
    even mobile phone systems, but embedded device systems.  Supporting
    pluggable filesystems in your smart lightbulb, or car engine controller,
    or bluetooth-connected electric toothbrush /is/ insane.

    That does not answer the question. Toothbrush OSes do not need frame
    buffers - but that does not make frame buffers insane.

    Frame buffers would be an insane idea on the vast majority of OS's. So
    would network stacks, and many other features that are common place on
    "big system" OS's.




    I am not suggesting anything strange; AISI this is basic engineering,
    nothing more.


    /Appropriate/ levels of abstraction and flexibility are basic
    engineering.  /Appropriate/ limits and appropriate sizes for data are
    basic engineering.  Inappropriate generalisations and extrapolations
    are not.


    Again, I have to disagree. The question is: What defines how large a
    file's offset can be?

    The answer is just as simple: Each filesystem has its own range of max
    sizes.


    Your concept of "basic engineering" is severely lacking here.

    Oh? What, specifically, is lacking?


    Supposing someone asked you to build a bridge for a two-lane road
    passing over a river. Basic engineering is to look at the traffic on
    the road, the size of the crossing, and figure out a reasonable maximum
    load weight that could realistically be on the bridge at any given time.
    Then you extrapolate for future growth based on the best available data
    and predictions. Then you multiply and add in safety factors. You tell
    the town planners that the bridge should have a total weight limit of,
    say, 100 tons and you tell them the price.

    That is basic engineering.

    You tell them that you could build it twice the width and supporting 600
    tons, but it would cost ten times as much - it is not worth the cost
    now, and you recommend building a new bridge if and when traffic
    increases that much. It is better to build the smaller and cheaper
    bridge now, than overrun the budget and waste time.

    That is basic engineering.


    But what /you/ want to tell the town planners is that the bridge should
    not have a weight limit - it should be made to support /anything/,
    regardless of the cost. Weight limits should be determined by the axle strength of trucks approaching it, and if someone wants to drive a 500 m
    high truck loaded with lead, they should be able to.

    What would any sane town planner think of that engineer?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Aug 30 09:27:30 2021
    On 29/08/2021 23:10, Bart wrote:

    For this purpose (creating C system header files to go with your own C compiler), you need to end up with an actual set of header files.

    Which existing compilers do you look at for information? Ones like gcc
    have incredibly elaborate headers, full of compiler-specific built-ins
    and attributes and predefined macros.

    If I take stdarg.h as an example, my version of that has this; it
    defines one type, and 5 macros:

    You do know you are cherry-picking perhaps the worst case here, as the <stdarg.h> is very tightly connected to the compiler? It is more
    "language support" than a typical C header that provides some
    declarations for external functions, some types, some constants, and
    perhaps some macros.

    I don't know what you mean by "C system headers", since you invariably
    and knowingly mix up C standard library headers and headers for
    OS-provided libraries. (I'm not going to claim that the separation of compilers, compiler-provided headers and libraries, C standard
    libraries, and OS library headers is clear or a concept that works as
    well on Windows as it did when C was first developed. It's complicated.
    But I /do/ claim you make it worse for yourself.)

    But the C standard library headers are part of the C implementation.
    Some parts can be made very portably, others are tightly tied to the
    details of the compiler.



    --------------------------------------------------------------
    /* Header stdarg.h */

    #ifndef $STDARG
     #define $STDARG

    While it is perfectly allowable to use $ like this, and a conforming C
    program cannot use a $ in other identifiers, the use of $ as a "letter"
    in identifiers is a common extension supported by a lot of C compilers.
    The standard way of getting a "local" identifier in system headers is
    to start them with two underscores, as such names are always reserved
    for such purposes.


     typedef char *    va_list;
     #define va_start(ap,v) ap=((va_list)&v+8)
     #define va_arg(ap,t) *(t*)((ap+=8)-8)
     #define va_copy(dest,src) (dest=src)
     #define va_end(ap)    ( ap = (va_list)0 )
    #endif
    --------------------------------------------------------------
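
    For illustration, here is roughly how a function consumes its
    arguments with the header above (a sketch, assuming - as my compiler
    arranges - that every variadic argument occupies one 8-byte stack
    slot):

    int sum(int n, ...)
    {
        va_list ap;                      /* just a char* */
        int total = 0;
        va_start(ap, n);                 /* point past the last named arg */
        for (int i = 0; i < n; i++)
            total += va_arg(ap, int);    /* read a slot, step ap by 8 */
        va_end(ap);
        return total;
    }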


    The one used by tdm/gcc uses those 3 headers shown below (about 300
    lines); I haven't included the 600-line _mingw.h which has yet more
    includes.

    You do realise that in the world of real C compilers, standards and compatibility are important? Thus a significant bulk of these headers
    is to handle the differences between the way different C standards have
    handled VA_LIST and friends. I know you don't care about standards,
    versions or conforming behaviour (that's your choice - if you don't need
    to bother with that for how you will use your compiler, fair enough).
    General purpose C libraries, on the other hand, /do/ need to handle this.


    How do you get from this, to the above? Certainly not by any automatic process!


    That header is completely compiler-specific. So libraries that are
    designed to support multiple compilers and multiple hosts will
    invariably have lots of conditional compilation in their <stdarg.h> header.

    And there is no expectation that such headers could be translated
    automatically in any way, because for use with /your/ compiler you need definitions that match /your/ compiler. There are no definitions for
    your compiler that could be extracted from the header.

    If your compiler were ever to be used by other people, then you could
    talk to the glibc people, or newlib people, or other common C library
    projects, and talk to them about getting support for your compiler into
    their headers.

    Other standard library headers may have similar challenges, though I
    doubt if any are worse than <stdarg.h>. If you are making a C
    implementation, it is your responsibility either to make your own
    standard library and headers, or to work with a new or existing standard library and coordinate these headers. (You can always make just the
    compiler, and leave it to someone else to integrate it with a library to
    make the complete implementation.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Mon Aug 30 10:03:04 2021
    On 29/08/2021 21:32, Dmitry A. Kazakov wrote:
    On 2021-08-29 21:14, James Harris wrote:
    On 29/08/2021 11:31, Dmitry A. Kazakov wrote:

    Again, scalar is not array. Neither implements another, they are
    already implementations.

    For what we have been discussing an array of three integers is no
    different from a record of three integers;

    Of course they are different. You seem to confuse a type with its machine representation. Many types may have similar representations; that does
    not make them the same.

    nor is either different from an object which holds three integers.
    They are all semantically ONE object which holds THREE integers.

    They are not, because you stated that it is a composite type.

    You can call records and objects arrays rather than scalars if you
    want to but they would be used and passed around as scalars.

    I do not know what this means.

    You can pass a value either by value or by reference. By-value passing
    could be done over a machine register.

    Scalar is a type property, opposite to composite type.


    I think James is confusing the way parameters are passed, with the
    property of the type. In particular, he seems to be using "scalar" to mean "passed by value as a single object", and "composite" to mean "passed by reference, using a pointer to the start of the object". This is, of
    course, not correct terminology.


    In a simple low-level non-OO language, a scalar type is a fundamental
    type of the language that is not built up of other parts. For C, it is
    defined as an arithmetic type or a pointer type. C++ adds enumeration
    types, pointer to member types, and nullptr_t.

    Aliases and sub-types or sub-ranges of scalar types will also be scalar.

    Aggregate types are, broadly, the types that are not scalar types. (These
    may also be referred to as "composite types" in some languages, but in C
    that term means something else: a type constructed from two compatible types.)
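
    Concretely, for C:

    int i;                   /* arithmetic type -> scalar      */
    double d;                /* arithmetic type -> scalar      */
    char *p;                 /* pointer type    -> scalar      */
    enum E { RED } e;        /* enumeration     -> scalar in C */
    int a[3];                /* array           -> aggregate   */
    struct S { int x; } s;   /* structure       -> aggregate   */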


    Once you get to higher level languages, it gets more complicated. In
    Python, is an arbitrary precision integer a scalar or a composite? You
    can't break it down within the language, yet the implementation involves
    tree data structures, memory management, etc.

    Even within something like C++ (and presumably Ada, but you know that
    better than I) you can create a type such as "Int256" that behaves
    identically to other integer types that are clearly scalar, while it is
    in fact defined as an aggregate. It is an aggregate, but is used like a scalar.
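
    A crude sketch of the idea in C (the limb layout is an assumption, and
    of course C gives you no operator overloading): Int256 is an aggregate
    by the language's definition, yet values of it are copied, passed and
    returned exactly like scalars.

    #include <stdint.h>

    typedef struct { uint64_t limb[4]; } Int256;  /* least significant first */

    Int256 int256_add(Int256 a, Int256 b)
    {
        Int256 r;
        unsigned carry = 0;
        for (int i = 0; i < 4; i++) {
            uint64_t s = a.limb[i] + b.limb[i];   /* may wrap */
            unsigned c1 = s < a.limb[i];          /* carry out of raw add */
            r.limb[i] = s + carry;
            carry = c1 + (r.limb[i] < s);         /* carry-in overflow */
        }
        return r;
    }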

    So perhaps the scalar/aggregate distinction is not particularly useful.

    In C++, a more useful category is "literal types". These are types that
    can be used as simple fixed types - they don't have complicated
    constructors or destructors, they don't keep any resources (like
    memory), and can be passed around, copied and created as needed. A
    large integer type implemented as a struct or /fixed-size/ array of
    smaller integers would be a literal type.


    How data is passed - by value or reference - is a matter of the platform
    ABI, the source code, and the language design. It is independent of any scalar/aggregate distinction. Scalars can be passed by reference or by
    value. Aggregates likewise - small aggregates are often passed in
    registers.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Mon Aug 30 11:44:49 2021
    On 26/08/2021 11:43, Bart wrote:

    ...

    C shouldn't come into it at all. A 'C type system' isn't really a type system, but just a thin layer over the underlying hardware.

    Actually it works hard to obscure those underlying types without adding anything useful.

    You shouldn't have a go at C's type system until you have tried to adapt
    yours for different word sizes. Your choice of making everything 64-bit
    gives you enormous luxuries that were not available in decades past!


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Mon Aug 30 11:36:06 2021
    On 26/08/2021 09:45, Dmitry A. Kazakov wrote:

    ...

    But regarding C headers, once upon a time there existed a now completely forgotten language design principle that the program should be all the documentation you needed.

    That led to Cobol.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Mon Aug 30 11:34:26 2021
    On 26/08/2021 09:29, David Brown wrote:
    On 26/08/2021 09:33, Dmitry A. Kazakov wrote:
    On 2021-08-26 00:07, Bart wrote:

    ...

    The headers are, in effect, an
    automatic configuration system to match the compiler's needs.

    Exactly. The C headers are there for the sake of the C compiler. They
    are not meant for humans to parse - as Bart seems determined to do!

    In the absence of better information about a given system the C headers
    do at least contain the info Bart requires; it's just that the headers
    need to be processed by software to extract the info.

    ...

    However, this stuff should be that simple. Why dozens of different
    types just to get basic file info? (There were 4 involved with dev_t,
    but 10 different types for the members of struct stat, times 4, is 40!)

    Because there are dozens of different entities involved. Welcome back to
    reality.


    Bart doesn't like that reality. He wants a reality where everything is designed to suit his convenience, and is fine-tuned for the systems he
    uses and nothing else.


    Bart quite reasonably thought that someone else must have written a
    program to extract info from C headers before him. The C preprocessor
    helps but it doesn't go the whole way.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Aug 30 12:16:34 2021
    On 30/08/2021 08:27, David Brown wrote:
    On 29/08/2021 23:10, Bart wrote:

    For this purpose (creating C system header files to go with your own C
    compiler), you need to end up with an actual set of header files.

    Which existing compilers do you look at for information? Ones like gcc
    have incredibly elaborate headers, full of compiler-specific built-ins
    and attributes and predefined macros.

    If I take stdarg.h as an example, my version of that has this; it
    defines one type, and 5 macros:

    You do know you are cherry-picking perhaps the worst case here, as the <stdarg.h> is very tightly connected to the compiler?

    OK, let's take a much more portable feature, since it can be provided by
    an external library, the definition of printf inside stdio.h. Here's
    mine (which naughtily excludes 'extern'):

    int printf(const char*, ...);

    This is the one that comes with the gcc/mingw/tdm headers (you might
    just be able to discern 'printf' somewhere in there!):

    __mingw_ovr __attribute__((__format__ (gnu_printf, 1, 2))) __MINGW_ATTRIB_NONNULL(1)
    int printf (const char *__format, ...)
    {
    int __retval;
    __builtin_va_list __local_argv;
    __builtin_va_start( __local_argv, __format );
    __retval = __mingw_vfprintf( stdout, __format, __local_argv );
    __builtin_va_end( __local_argv );
    return __retval;
    }

    Here, I decided it was better to look at an alternate source of info.


    It is more
    "language support" than a typical C header that provides some
    declarations for external functions, some types, some constants, and
    perhaps some macros.

    I don't know what you mean by "C system headers", since you invariably
    and knowingly mix up C standard library headers and headers for
    OS-provided libraries.

    I'm making a distinction between general headers that could be processed
    by any compiler (in theory), and those designed for a specific
    implementation.

    I don't know where POSIX ones fall (since on those OSes that support it,
    I think they are provided by the OS), or ones that start with <sys/...>,
    but a few of those are commonly used by applications so I provide them, sometimes skeleton ones (eg. unistd.h) in order to be able to build a
    specific app.

    #ifndef $STDARG
     #define $STDARG

    While it is perfectly allowable to use $ like this, and a conforming C program cannot use a $ in other identifiers, the use of $ as a "letter"
    in identifiers is a common extension supported by a lot of C compilers.

    This is in code that should only be processed by my compiler. If it
    clashes with user-code, that's too bad (but the compiler has worse
    problems compiling arbitrary code).

    Note that Tiny C doesn't allow $ at all, so such code wouldn't work with
    that either.

    The standard way of getting a "local" identifier in system headers is
    to start them with two underscores, as such names are always reserved
    for such purposes.

    I don't like underscores: their visibility is already poor, and whether
    multiple underscores blend into each other or not depends on the font.

    The one used by tdm/gcc uses those 3 headers shown below (about 300
    lines); I haven't included the 600-line _mingw.h which has yet more
    includes.

    You do realise that in the world of real C compilers, standards and compatibility are important?

    Yeah... but perhaps you also realise that system headers are, or should
    be, specific to an implementation?

    Trying to make a single header work for myriad targets sounds an
    admirable idea; in practice it makes for nightmare code.

    I'd rather there were half-a-dozen compact, clean-written versions of a
    header (of which you'd only see one anyway), than one that tries to do
    all six variants which is likely to be more than six times the combined
    size.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Mon Aug 30 12:45:26 2021
    On 30/08/2021 11:44, James Harris wrote:
    On 26/08/2021 11:43, Bart wrote:

    ...

    C shouldn't come into it at all. A 'C type system' isn't really a type
    system, but just a thin layer over the underlying hardware.

    Actually it works hard to obscure those underlying types without
    adding anything useful.

    You shouldn't have a go at C's type system until you have tried to adapt yours for different word sizes. Your choice of making everything 64-bit
    gives you enormous luxuries that were not available in decades past!



    64-bit is simply using the native word size of the processor.

    With x86-32 my 'int' was 32 bits. With 8086, it was 16 bits. (With Z80,
    it was also 16 bits, not the word size, but there it was necessary.)

    C has made the same progression of its 'int' from 16 bits to 32 bits,
    but for some reason decided not to proceed to 64 bits. (Maybe nobody
    could decide whether 'short' should be 16 or 32 bits, and what to call
    the other size!)

    I've also had my own mini-zoo of types other than byte, int and real
    (such as sint for i16 when int was i32). But I have also had, since the
    very beginning almost, size-specific forms written as:

    byte*N Unsigned
    int*N Signed

    where N was the number of 8-bit bytes in the type, and needed to be one
    of 1/2/4/8. The upper limit for N changed as machines allowed for wider
    types; it was usually double the native word size.

    C didn't get anything like that until 25 years after it was created, and
    then it was in the form of an ungainly bolt-on written in user-code, poorly supported by the language.
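
    (The bolt-on in question is <stdint.h>. The rough correspondence with
    my forms, assuming the optional exact-width types exist on the target:

    #include <stdint.h>

    uint8_t  u1;  int8_t  s1;   /* byte*1  int*1 */
    uint16_t u2;  int16_t s2;   /* byte*2  int*2 */
    uint32_t u4;  int32_t s4;   /* byte*4  int*4 */
    uint64_t u8;  int64_t s8;   /* byte*8  int*8 */

    All done with typedefs in a user-level header rather than in the
    language itself.)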

    However I've dropped those *N (and briefly :N) formats when I decided to simplify my syntax.

    You can argue that C was created when there were more diverse word-sizes
    and fewer byte-addressed machines, but it should have been obvious a few
    years after its creation, which way things were going. It had plenty of opportunity to get things in order.

    I created my first language for one specific machine (the 8-bit Z80),
    which turned out to be a good choice! (I didn't want to be dealing with
    packed and unpacked char-arrays on word-addressed hardware.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Mon Aug 30 17:29:59 2021
    On 25/08/2021 09:29, David Brown wrote:

    ...

    I read somewhere (but can't find the reference or quotation) that the strength of a programming language comes not from the features it has,
    but the restrictions it has.

    Haven't heard that one but it's similar to "A language design is
    finished not when there's no more to add but when there's no more to
    take away."

    Thank Antoine:

    https://www.brainyquote.com/quotes/antoine_de_saintexupery_103610

    Due to the potential for standard libraries his comment arguably applies
    more to a programming language than it does to almost anything else.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Mon Aug 30 18:41:19 2021
    On 30/08/2021 17:29, James Harris wrote:
    On 25/08/2021 09:29, David Brown wrote:

    ...

    I read somewhere (but can't find the reference or quotation) that the
    strength of a programming language comes not from the features it has,
    but the restrictions it has.

    Haven't heard that one but it's similar to "A language design is
    finished not when there's no more to add but when there's no more to
    take away."

    Thank Antoine:

      https://www.brainyquote.com/quotes/antoine_de_saintexupery_103610

    Due to the potential for standard libraries his comment arguably applies
    more to a programming language than it does to almost anything else.


    If applied to the English language, then it's possible he may not have
    been able to express his comment!

    (On one list of 2000 basic English words, 'perfection' and 'achieved'
    don't appear. Neither does 'add'.)

    And applied to a programming language, I shudder to think what you'd end
    up with. Probably something like Brainf*ck.

    A language needs to be utilitarian first. Aesthetics comes into it too,
    but it doesn't mean the same as minimalism.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Mon Aug 30 20:06:18 2021
    On 30/08/2021 18:29, James Harris wrote:
    On 25/08/2021 09:29, David Brown wrote:

    ...

    I read somewhere (but can't find the reference or quotation) that the
    strength of a programming language comes not from the features it has,
    but the restrictions it has.

    Haven't heard that one but it's similar to "A language design is
    finished not when there's no more to add but when there's no more to
    take away."


    That's similar, and also useful to keep in mind. (But don't forget that
    pithy sayings are rarely intended to be applied literally.)

    There is also a quotation from Bjarne Stroustrup along the lines of
    every complicated language having a simple language at its core
    struggling to get out. That is true of C++ at least - features have
    been added in each generation of the language that make the language
    more complex, but result in people being able to write simpler code.


    Thank Antoine:

      https://www.brainyquote.com/quotes/antoine_de_saintexupery_103610

    Due to the potential for standard libraries his comment arguably applies
    more to a programming language than it does to almost anything else.



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Aug 30 20:03:16 2021
    On 30/08/2021 13:16, Bart wrote:
    On 30/08/2021 08:27, David Brown wrote:
    On 29/08/2021 23:10, Bart wrote:

    For this purpose (creating C system header files to go with your own C
    compiler), you need to end up with an actual set of header files.

    Which existing compilers do you look at for information? Ones like gcc
    have incredibly elaborate headers, full of compiler-specific built-ins
    and attributes and predefined macros.

    If I take stdarg.h as an example, my version of that has this; it
    defines one type, and 5 macros:

    You do know you are cherry-picking perhaps the worst case here, as the
    <stdarg.h> is very tightly connected to the compiler?

    OK, let's take a much more portable feature, since it can be provided by
    an external library, the definition of printf inside stdio.h. Here's
    mine (which naughtily excludes 'extern'):

       int printf(const char*, ...);

    The "extern" is not necessary for function declarations, because it is
    the default and it is obviously distinguishable from a definition
    (unlike for variables). For the record, I think the "default extern"
    concept is a truly terrible idea, and one that C got wrong from the
    start without any good excuses. Identifiers in a language should always
    be at their enclosing scope (such as file scope), unless you explicitly
    choose to make them wider.
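
    To illustrate - the two declarations below mean exactly the same
    thing, while file-local linkage has to be requested explicitly:

    extern int printf(const char *, ...);   /* explicit external linkage */
    int printf(const char *, ...);          /* identical meaning         */

    static int helper(void);                /* internal (file) linkage   */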


    This is the one that comes with the gcc/mingw/tdm headers (you might
    just be able to discern 'printf' somewhere in there!):

    __mingw_ovr __attribute__((__format__ (gnu_printf, 1, 2))) __MINGW_ATTRIB_NONNULL(1)
    int printf (const char *__format, ...)
    {
      int __retval;
      __builtin_va_list __local_argv;
      __builtin_va_start( __local_argv, __format );
      __retval = __mingw_vfprintf( stdout, __format, __local_argv );
      __builtin_va_end( __local_argv );
      return __retval;
    }

    Here, I decided it was better to look at an alternate source of info.


    It is a definition in a header file, not a source of information. And
    it seems pretty clear to me, except for the "__mingw_ovr". I guessed it
    was a macro that included "inline", but had to look up the details.
    However, this is with the benefit of being familiar with gcc attributes,
    and with having read a little about how mingw handles printf
    (ironically, I only know that because I looked it up in connection with
    some of your posts about C99 printf handling).

    The attribute parts at the start are to inform the compiler that this
    function expects a printf-style format string at the start (parameter 1)
    and that the parameters used by it are from parameter number 2 onwards.
    This lets the compiler warn you if you write "printf("%i\n",
    "hello");", or other such type errors. It will also warn you if
    parameter 1 is null.
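
    The same mechanism is available to user code. A hypothetical logging
    function, where parameter 2 is the format string and checking starts
    at parameter 3:

    __attribute__((format(printf, 2, 3)))
    void log_msg(int level, const char *fmt, ...);

    /* gcc would now warn about log_msg(1, "%i\n", "hello"); */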

    The rest is about taking the variadic parameters and passing them as a
    single va_list to the vfprintf function. This is a very common way of implementing the various printf functions in a library - you implement a
    few general functions, and use them for the more specific ones.
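
    Stripped of the mingw decoration, the pattern is just this (a
    hypothetical wrapper, not the library's actual code):

    #include <stdarg.h>
    #include <stdio.h>

    int my_printf(const char *fmt, ...)
    {
        va_list ap;
        int n;
        va_start(ap, fmt);
        n = vfprintf(stdout, fmt, ap);   /* the general function does the work */
        va_end(ap);
        return n;
    }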


      It is more
    "language support" than a typical C header that provides some
    declarations for external functions, some types, some constants, and
    perhaps some macros.

    I don't know what you mean by "C system headers", since you invariably
    and knowingly mix up C standard library headers and headers for
    OS-provided libraries.

    I'm making a distinction between general headers that could be processed
    by any compiler (in theory), and those designed for a specific implementation.

    I don't know where POSIX ones fall (since on those OSes that support it,
    I think they are provided by the OS), or ones that start with <sys/...>,
    but a few of those are commonly used by applications so I provide them, sometimes skeleton ones (eg. unistd.h) in order to be able to build a specific app.

    POSIX headers can fall somewhat in the middle here - they should be
    useable by any C compiler that will work on the system, but because
    POSIX makes certain requirements of the compiler, the headers can rely
    on those. (As a simple example, any POSIX header can rely on any of the
    other POSIX headers being available in standard places.)

    It's not particularly neat that POSIX headers, other OS headers,
    standard C headers, and perhaps other library headers are often all
    collected in the same directories.


    #ifndef $STDARG
      #define $STDARG

    While it is perfectly allowable to use $ like this, and a conforming C
    program cannot use a $ in other identifiers, the use of $ as a "letter"
    in identifiers is a common extension supported by a lot of C compilers.

    This is in code that should only be processed by my compiler. If it
    clashes with user-code, that's too bad (but the compiler has worse
    problems compiling arbitrary code).

    As I said, it's fine - I am merely pointing out the common practice.


    Note that Tiny C doesn't allow $ at all, so such code wouldn't work with
    that either.

      The standard way of getting a "local" identifier in system headers is
    to start them with two underscores, as such names are always reserved
    for such purposes.

    I don't like underscores: their visibility is already poor, and whether
    multiple underscores blend into each other or not depends on the font.

    The one used by tdm/gcc uses those 3 headers shown below (about 300
    lines); I haven't included the 600-line _mingw_h which has yet more
    includes.

    You do realise that in the world of real C compilers, standards and
    compatibility are important?

    Yeah... but perhaps you also realise that system headers are, or should
    be, specific to an implementation?

    Trying to make a single header work for myriad targets sounds an
    admirable idea; in practice it makes for nightmare code.

    There is a balance here. Writing huge numbers of compiler or
    target-specific headers and library files is also a nightmare.
    Remember, these files support multiple OS's, multiple targets, multiple
    C standards, and often a range of variations of the standards,
    optimisations for multiple different compilers, and possibly other
    compiler options. Each of these dimensions must be multiplied together
    to get the full range of combinations, which could easily reach
    hundreds. And many of them need to be available in a given system.


    I'd rather there were half-a-dozen compact, clean-written versions of a header (of which you'd only see one anyway), than one that tries to do
    all six variants which is likely to be more than six times the combined
    size.


    If you think about it enough, you'll see the combinations are far
    higher. I do appreciate what you want to see here, and why, but it is
    not feasible. (That is not to say it could not be handled a bit better,
    or that the headers could not benefit from a bit of pruning and tidying
    of outdated combinations.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Mon Aug 30 21:09:41 2021
    On 30/08/2021 12:45, Bart wrote:
    On 30/08/2021 11:44, James Harris wrote:
    On 26/08/2021 11:43, Bart wrote:

    ...

    C shouldn't come into it at all. A 'C type system' isn't really a
    type system, but just a thin layer over the underlying hardware.

    ...

    You shouldn't have a go at C's type system until you have tried to
    adapt yours for different word sizes. Your choice of making everything
    64-bit gives you enormous luxuries that were not available in decades
    past!



    64-bit is simply using the native word size of the processor.

    I'm sure I've seen you espousing it for other things, too. ;-)

    ...

    You can argue that C was created when there were more diverse word-sizes
    and fewer byte-addressed machines, but it should have been obvious a few years after its creation, which way things were going. It had plenty of opportunity to get things in order.

    My point was that even today C can be used to write programs for
    machines with word sizes you would hate whereas you've been able to
    avoid the complexity of all those issues by simply avoiding supporting
    such machines. So don't be too hard on C's type system. Most of it is
    there for a reason.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Mon Aug 30 21:50:29 2021
    On 30/08/2021 09:03, David Brown wrote:
    On 29/08/2021 21:32, Dmitry A. Kazakov wrote:

    ...

    Scalar is a type property, opposite to composite type.


    I think James is confusing the way parameters are passed, with the
    property of the type. In particular, he seems to be using "scalar" to mean "passed by value as a single object", and "composite" to mean "passed by reference, using a pointer to the start of the object". This is, of
    course, not correct terminology.

    No, not at all. First of all, it may have been Dmitry who brought in the
    term "composite". Second, the situation doesn't change according to
    whether an object is passed by value or by reference.



    In a simple low-level non-OO language, a scalar type is a fundamental
    type of the language that is not built up of other parts. For C, it is defined as an arithmetic type or a pointer type. C++ adds enumeration
    types, pointer to member types, and nullptr_t.

    Are you saying the C standards have an official definition of 'scalar'
    (which is along the lines you mention)?


    Aliases and sub-types or sub-ranges of scalar types will also be scalar.

    Aggregate types are any type that is not a scalar type. (These may also
    be referred to as "composite types" in some languages, but in C that
    term refers to a type that can be used to refer to two compatible types.)


    Once you get to higher level languages, it gets more complicated. In
    Python, is an arbitrary precision integer a scalar or a composite? You
    can't break it down within the language, yet the implementation involves
    tree data structures, memory management, etc.

    Even within something like C++ (and presumably Ada, but you know that
    better than I) you can create a type such as "Int256" that behaves identically to other integer types that are clearly scalar, while it is
    in fact defined as an aggregate. It is an aggregate, but is used like a scalar.

    So perhaps the scalar/aggregate distinction is not particularly useful.

    I see scalars as objects which can be passed to a routine, returned from
    a routine and operated on as whole units. At that level they are
    scalars. If they can be subdivided by a routine which chooses to do so
    then they won't be scalars /in such code/.

    At the end of the day, if we are talking about digital computing all
    objects other than a single bit could be treated as aggregates.

    So scalar is arguably a semantic concept, reflecting how an object is processed.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Mon Aug 30 21:38:07 2021
    On 29/08/2021 20:32, Dmitry A. Kazakov wrote:
    On 2021-08-29 21:14, James Harris wrote:
    On 29/08/2021 11:31, Dmitry A. Kazakov wrote:

    Again, scalar is not array. Neither implements another, they are
    already implementations.

    For what we have been discussing an array of three integers is no
    different from a record of three integers;

    Of course they are different. You seem to confuse a type with its machine representation. Many types may have similar representations; that does
    not make them the same.

    You missed the context. For what we have been discussing they are all
    effective ways to pass around the required data. Semantically they are identical. They differ only in terms of implementation.


    nor is either different from an object which holds three integers.
    They are all semantically ONE object which holds THREE integers.

    They are not, because you stated that it is a composite type.

    Oh, and if I had not said that would it change anything?

    I don't know that I made that claim, but all three are as stated above, whatever claim was made by anyone.

    I am puzzled as to why you are pursuing this. Maybe it's an Ada thing
    but in C if you had

    struct rec {
        int a;
        int b;
        int c;
    };

    then objects of that type could be treated as units and passed around as necessary. Would you call such objects scalars or arrays?

    If you say they are not scalars then what about

    struct rec2 {
        int a;
    };

    and what about

    int a;

    ?

    I would call them all scalars /semantically/ because they would be
    treated as units and dealt with as such. But it doesn't matter if YMV.


    You can call records and objects arrays rather than scalars if you
    want to but they would be used and passed around as scalars.

    I do not know what this means.

    You can pass a value either by value or by reference. By-value passing
    could be done over a machine register.

    Scalar is a type property, opposite to composite type.


    As I say, this is all about definitions. And when have we ever disagreed
    about definitions before, eh? ;-)


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Mon Aug 30 23:25:46 2021
    On 30/08/2021 21:09, James Harris wrote:
    On 30/08/2021 12:45, Bart wrote:

    You can argue that C was created when there were more diverse
    word-sizes and fewer byte-addressed machines, but it should have been
    obvious a few years after its creation, which way things were going.
    It had plenty of opportunity to get things in order.

    My point was that even today C can be used to write programs for
    machines with word sizes you would hate whereas you've been able to
    avoid the complexity of all those issues by simply avoiding supporting
    such machines. So don't be too hard on C's type system. Most of it is
    there for a reason.

    OK. It all depends on how much you want to emulate C in that regard, and
    so inherit a lot of the same mess in its type system.

    I suggested a few years ago that C should be split into two languages,
    one that continues to target all odd-ball systems past, present and
    future, and one that targets the same desktop-class machines that most
    other languages seem to have settled on, including mine.

    Namely, ones like D, Java, Julia, Dart, Rust, Odin, C#, Nim, Go ...

    But it seems that you are keen to cover everything in one language, from
    the tiniest microcontrollers, through desktop PCs and current
    supercomputers, and up to massively large machines that need 128 bits to specify file sizes.

    I'd say that's being a little ambitious. (Are you still writing OSes too?)

    My own aims (now that I've dismissed 32 bit machines) are to target:

    Windows on x64
    Linux on x64
    Linux on arm64

    (I no longer work with bare-metal processors so the machines will be
    running some OS like the above.)

    If I ever feel a need to work with a smaller device again, then probably
    I can create a custom version of my language, with smaller default int
    and pointer types, and caps on the widest types available. I don't
    expect to be able to run an arbitrary program on any target system, but
    some will do so.

    To do all this, I don't need C or to adopt the vagaries and conflicts of
    C's type system.


    you've been able to
    avoid the complexity of all those issues by simply avoiding supporting
    such machines.

    It wasn't my job to support every possible machine. I wrote code for the machines we created, or later for the business machines available for
    customers to buy, which usually meant PCs. So I made implementations for
    those machines.

    Altogether I've written compilers that directly targeted:

    PDP10 36-bit word-addressed mainframe
    Z80 8-bit micro
    8086 16-bit
    80386 etc 32-bit
    x64 64-bit

    Just not all at the same time as I see it as a progression. (I've also investigated implementing my languages on a range of devices, eg.
    Motorola and NatSemi, that never got to a prototype.)

    Note that even C compilers for smaller devices such as Z80 can be rather specialist products.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Tue Aug 31 09:16:31 2021
    On 30/08/2021 22:09, James Harris wrote:
    On 30/08/2021 12:45, Bart wrote:
    On 30/08/2021 11:44, James Harris wrote:
    On 26/08/2021 11:43, Bart wrote:

    ...

    C shouldn't come into it at all. A 'C type system' isn't really a
    type system, but just a thin layer over the underlying hardware.

    ...

    You shouldn't have a go at C's type system until you have tried to
    adapt yours for different word sizes. Your choice of making
    everything 64-bit give you enormous luxuries that were not available
    in decades past!



    64-bit is simply using the native word size of the processor.

    I'm sure I've seen you espousing it for other things, too. ;-)


    If you want to limit types to one single size of integer (except perhaps
    for FFI or large arrays), then 64-bit is a good choice unless you need
    to be as efficient as possible on small embedded systems. All modern
    PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
    so smaller sizes are not faster for general use. Outside of mathematics
    and applications like cryptography, it is rare for an integer to be
    bigger than 2 ^ 31. But it is extremely rare to need values bigger than
    2 ^ 63. So for practical purposes, you can usually treat a 64-bit
    integer as "unlimited".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Tue Aug 31 09:32:33 2021
    On 30/08/2021 22:50, James Harris wrote:
    On 30/08/2021 09:03, David Brown wrote:
    On 29/08/2021 21:32, Dmitry A. Kazakov wrote:

    ...

    Scalar is a type property, opposite to composite type.


    I think James is confusing the way parameters are passed, with the
    property of the type.  In particular, he seems to be using "scalar" to mean
    "passed by value as a single object", and "composite" to mean "passed by
    reference, using a pointer to the start of the object".  This is, of
    course, not correct terminology.

    No, not at all. First of all, it may have been Dmitry who brought in the
    term "composite". Second, the situation doesn't change according to
    whether an object is passed by value or by reference.


    It is quite possible that Ada uses the term "composite" in the way
    Dmitry used it - but C does not. It's worth being aware of the ambiguity.



    In a simple low-level non-OO language, a scalar type is a fundamental
    type of the language that is not built up of other parts.  For C, it is
    defined as an arithmetic type or a pointer type.  C++ adds enumeration
    types, pointer to member types, and nullptr_t.

    Are you saying the C standards have an official definition of 'scalar'
    (which is along the lines you mention)?


    Yes. 6.2.5p21 in C17. There are official drafts of the C standards
    available freely (the actual published ISO standards cost money, but the
    final drafts are free and just as good). Google for C17 draft, document
    N2176, for example.


    Aliases and sub-types or sub-ranges of scalar types will also be scalar.

    Aggregate types are, broadly, the types that are not scalar types.  (These may also
    be referred to as "composite types" in some languages, but in C that
    term means something else: a type constructed from two compatible types.)


    Once you get to higher level languages, it gets more complicated.  In
    Python, is an arbitrary precision integer a scalar or a composite?  You
    can't break it down within the language, yet the implementation involves
    tree data structures, memory management, etc.

    Even within something like C++ (and presumably Ada, but you know that
    better than I) you can create a type such as "Int256" that behaves
    identically to other integer types that are clearly scalar, while it is
    in fact defined as an aggregate.  It is an aggregate, but is used like a
    scalar.

    So perhaps the scalar/aggregate distinction is not particularly useful.

    I see scalars as objects which can be passed to a routine, returned from
    a routine and operated on as whole units. At that level they are
    scalars. If they can be subdivided by a routine which chooses to do so
    then they won't be scalars /in such code/.


    That is not an unreasonable definition - but it differs from that used
    by C (and I don't know if Ada has a definition).

    At the end of the day, if we are talking about digital computing all
    objects other than a single bit could be treated as aggregates.

    So scalar is arguably a semantic concept, reflecting how an object is processed.



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Aug 31 09:26:52 2021
    On 31/08/2021 00:25, Bart wrote:
    On 30/08/2021 21:09, James Harris wrote:
    On 30/08/2021 12:45, Bart wrote:

    You can argue that C was created when there were more diverse
    word-sizes and fewer byte-addressed machines, but it should have been
    obvious a few years after its creation, which way things were going.
    It had plenty of opportunity to get things in order.

    My point was that even today C can be used to write programs for
    machines with word sizes you would hate whereas you've been able to
    avoid the complexity of all those issues by simply avoiding supporting
    such machines. So don't be too hard on C's type system. Most of it is
    there for a reason.

    Indeed - there are plenty of 8-bit and 16-bit systems in common use.
    There are also huge numbers of devices with weirder sizes, such as 18
    bit or 24 bit, and often CHAR_BIT is bigger than 8. There are very few
    people who /program/ these, but they are found in many appliances and
    dedicated chips. (Often the electronics designers using them don't know
    they are programmable cores - they buy a chip for driving a screen, or
    for re-sampling audio streams, without realising there is a DSP core
    inside.)

    There is a good argument for saying some support for the most unusual
    systems should be dropped from newer C standards. After all, the tools
    used on these devices are very specialised and the software on them is
    niche - there is no need for portability here. C2x is expected to drop
    support for padding bits in integers (except _Bool) and require two's complement signed integer format (but of course it will not require
    wrapping overflow, because that would be silly).


    A fair bit of the way C's types work is the result of history - gradual
    changes and backwards compatibility. That is a good reason, but it does
    mean that the final results are not objectively the "best" choice if you
    were to start from scratch, targeting modern systems. (However, while
    most C programmers agree that the types, promotion rules, etc., are not
    ideal, they will not agree on what they would have liked to be different!)


    OK. It all depends on how much you want to emulate C in that regard, and
    so inherit a lot of the same mess in its type system.

    I suggested a few years ago that C should be split into two languages,
    one that continues to target all odd-ball systems past, present and
    future, and one that targets the same desktop-class machines that most
    other languages seem to have settled on, including mine.

    Namely, ones like D, Java, Julia, Dart, Rust, Odin, C#, Nim, Go ...

    But it seems that you are keen to cover everything in one language, from
    the tiniest microcontrollers, through desktop PCs and current
    supercomputers, and up to massively large machines that need 128 bits to specify file sizes.

    I'd say that's being a little ambitious. (Are you still writing OSes too?)


    I'd agree - I see no use or benefit of a language covering that range.
    And by the time it is ready for use, the targets will have changed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Aug 31 11:21:52 2021
    On 2021-08-31 09:16, David Brown wrote:

    If you want to limit types to one single size of integer (except perhaps
    for FFI or large arrays), then 64-bit is a good choice unless you need
    to be as efficient as possible on small embedded systems. All modern
    PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
    so smaller sizes are not faster for general use. Outside of mathematics
    and applications like cryptography, it is rare for an integer to be
    bigger than 2 ^ 31. But it is extremely rare to need values bigger than
    2 ^ 63. So for practical purposes, you can usually treat a 64-bit
    integer as "unlimited".

    128-bit appears in networking protocols, right, because of
    cryptography. There are other formats for which 128-bit is required,
    e.g. IEEE decimal numbers have mantissa longer than 64-bit.

    I think that the language should allow any range. If the target does not support the particular range, the compiler should deploy a library implementation, giving a warning.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Tue Aug 31 12:57:41 2021
    On 31/08/2021 11:21, Dmitry A. Kazakov wrote:
    On 2021-08-31 09:16, David Brown wrote:

    If you want to limit types to one single size of integer (except perhaps
    for FFI or large arrays), then 64-bit is a good choice unless you need
    to be as efficient as possible on small embedded systems.  All modern
    PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
    so smaller sizes are not faster for general use.  Outside of mathematics
    and applications like cryptography, it is rare for an integer to be
    bigger than 2 ^ 31.  But it is extremely rare to need values bigger than
    2 ^ 63.  So for practical purposes, you can usually treat a 64-bit
    integer as "unlimited".

    128-bit appears in networking protocols, right, because of
    cryptography.

    No, not really.

    As I said, there are bigger integers used in cryptography. But these
    are often /much/ bigger - at least 256 bits, and perhaps up to 4096
    bits. You don't use a normal plain integer type there.

    There are also lots of other uses for bigger types - IPv6 addresses are
    128 bits, for example, but again these are not integers.

    There are other formats for which 128-bit is required,
    e.g. IEEE decimal numbers have mantissa longer than 64-bit.

    Again, these are not integers.

    The point is that when you want a /number/ - an integer - then 64-bit is
    more than sufficient for most purposes while also not being too big to
    handle efficiently on most modern cpus. You could almost say that about 32-bit, but you can definitely say it now that we have 64-bit. This is why no one
    has bothered making a 128-bit processor - there simply isn't any use for
    such large numbers as plain integers.


    I think that the language should allow any range. If the target does not support the particular range, the compiler should deploy a library implementation, giving a warning.


    Don't misunderstand me here - there are use-cases for all kinds of sizes
    and types. A language that intends to be efficient is going to have to
    be able to handle arrays of smaller types. And there have to be ways to
    handle bigger types for their occasional usage. The point is that there
    is little need for multiple sizes for "normal" everyday usage - a signed
    64-bit type really will work for practically everything.

    (I am a big fan of strong types, and sub-range types that have limited
    valid ranges - these make it easier to write clear, correct code, much
    harder to write incorrect code, and provide scope for static and dynamic
    error checking. But whether or not a language should support and
    encourage that is a different matter from the sizes you want for common
    use.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Aug 31 12:36:11 2021
    On 31/08/2021 10:21, Dmitry A. Kazakov wrote:
    On 2021-08-31 09:16, David Brown wrote:

    If you want to limit types to one single size of integer (except perhaps
    for FFI or large arrays), then 64-bit is a good choice unless you need
    to be as efficient as possible on small embedded systems.  All modern
    PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
    so smaller sizes are not faster for general use.  Outside of mathematics
    and applications like cryptography, it is rare for an integer to be
    bigger than 2 ^ 31.  But it is extremely rare to need values bigger than
    2 ^ 63.  So for practical purposes, you can usually treat a 64-bit
    integer as "unlimited".

    128-bit appears in networking protocols, right, because of
    cryptography. There are other formats for which 128-bit is required,
    e.g. IEEE decimal numbers have mantissa longer than 64-bit.

    I think that the language should allow any range. If the target does not support the particular range, the compiler should deploy a library implementation, giving a warning.


    Using a 64-bit default integer type does not preclude having a 128-bit
    type too (which is exactly what I have).

    Some uses for 128-bit are not necessarily about representing numbers
    that range from 0 to 2**127-1 or 2**128-1; they just need that number of
    bits, perhaps as a descriptor, but one to which it is convenient to
    apply bitwise operations.

    Then specifying that type as a range of values is not really appropriate.

    (I would also dispute the claim that integers needing more than 32 bits
    are rare enough that the default - on 64-BIT MACHINES - should be 32 bits.

    Probably the stats would show that most 32-bit integer values would fit
    into 16 bits or even 8 (there are a lot of 0s around!), but you don't
    really want a default 16-bit integer and have to constantly keep in mind whether your values will be below 16 bits or above. Same with int32.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Aug 31 13:16:18 2021
    On 2021-08-31 12:57, David Brown wrote:
    On 31/08/2021 11:21, Dmitry A. Kazakov wrote:
    On 2021-08-31 09:16, David Brown wrote:

    If you want to limit types to one single size of integer (except perhaps
    for FFI or large arrays), then 64-bit is a good choice unless you need
    to be as efficient as possible on small embedded systems.  All modern
    PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
    so smaller sizes are not faster for general use.  Outside of mathematics
    and applications like cryptography, it is rare for an integer to be
    bigger than 2 ^ 31.  But it is extremely rare to need values bigger than
    2 ^ 63.  So for practical purposes, you can usually treat a 64-bit
    integer as "unlimited".

    128-bit appears in networking protocols, right, because of
    cryptography.

    No, not really.

    As I said, there are bigger integers used in cryptography. But these
    are often /much/ bigger - at least 256 bits, and perhaps up to 4096
    bits. You don't use a normal plain integer type there.

    You mean computations. In protocols 128-bit is used for keys,
    certificates, hashes and such.

    There are other formats for which 128-bit is required,
    e.g. IEEE decimal numbers have mantissa longer than 64-bit.

    Again, these are not integers.

    It is two integers combined into a floating-point number. A 128-bit
    integer is needed to encode/decode it into something digestible.

    I think that the language should allow any range. If the target does not
    support the particular range, the compiler should deploy a library
    implementation, giving a warning.

    Don't misunderstand me here - there are use-cases for all kinds of sizes
    and types. A language that intends to be efficient is going to have to
    be able to handle arrays of smaller types. And there has to be ways to handle bigger types for their occasional usage. The point is that there
    is little need for multiple sizes for "normal" everyday usage - a signed 64-bit type really will work for practically everything.

    I do not care about sizes, I do about the ranges. I do not want to
    consider if the target supports the range I need. I just do not care, I
    want the compiler to implement my requirement. If that means emulation,
    OK, what is the alternative anyway?

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Tue Aug 31 12:24:18 2021
    On 30/08/2021 21:09, James Harris wrote:
    On 30/08/2021 12:45, Bart wrote:
    On 30/08/2021 11:44, James Harris wrote:
    On 26/08/2021 11:43, Bart wrote:

    ...

    C shouldn't come into it at all. A 'C type system' isn't really a
    type system, but just a thin layer over the underlying hardware.

    ...

    You shouldn't have a go at C's type system until you have tried to
    adapt yours for different word sizes. Your choice of making
    everything 64-bit gives you enormous luxuries that were not available
    in decades past!



    64-bit is simply using the native word size of the processor.

    I'm sure I've seen you espousing it for other things, too. ;-)

    ...

    You can argue that C was created when there were more diverse
    word-sizes and fewer byte-addressed machines, but it should have been
    obvious a few years after its creation, which way things were going.
    It had plenty of opportunity to get things in order.

    My point was that even today C can be used to write programs for
    machines with word sizes you would hate whereas you've been able to
    avoid the complexity of all those issues by simply avoiding supporting
    such machines. So don't be too hard on C's type system. Most of it is
    there for a reason.



    Have a look at this benchmark:

    https://github.com/sal55/langs/blob/master/bench.c

    (Notice the comment that the code was modified for 64 bits.)

    If I try and run this on 64-bit Windows, it crashes. gcc will kindly
    point out the reason: it uses both 'int' and 'long', and here assumes
    that long is wide enough (presumably wider than int) to store a pointer
    value.

    I can fix it by changing 'long' to 'long long'. (Although for the
    purpose here, probably intptr_t would be more apt.)

    You might argue that it is Windows' or MS' fault for not using a more
    sensible width of 'long', but the fact is that C ALLOWS all these
    disparate versions of types such as 'long'.

    It might be the same width as 'int'; it might be double the width; it
    might be anything so long as it's not smaller than 'int'.

    So there is a clear advantage these days in an 'int' type that grows
    with the word size and is the same size as an address. Although this
    benchmark might have then gone wrong in assuming that 'int' was 32 bits.

    Yes, it was probably poorly written too. But the language doesn't help.
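
    (A minimal sketch, my own code and not part of the benchmark, showing
    the trap: on LP64 Linux 'long' and 'void *' are both 64 bits, but on
    LLP64 64-bit Windows 'long' is only 32 bits, while intptr_t is always
    wide enough for a pointer.)

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        printf("long:     %zu bits\n", sizeof(long) * 8);
        printf("void *:   %zu bits\n", sizeof(void *) * 8);
        printf("intptr_t: %zu bits\n", sizeof(intptr_t) * 8);

        if (sizeof(long) < sizeof(void *))
            puts("casting a pointer to long would truncate here");
        return 0;
    }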

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Aug 31 15:31:28 2021
    On 31/08/2021 13:24, Bart wrote:
    On 30/08/2021 21:09, James Harris wrote:
    On 30/08/2021 12:45, Bart wrote:
    On 30/08/2021 11:44, James Harris wrote:
    On 26/08/2021 11:43, Bart wrote:

    ...

    C shouldn't come into it at all. A 'C type system' isn't really a
    type system, but just a thin layer over the underlying hardware.

    ...

    You shouldn't have a go at C's type system until you have tried to
    adapt yours for different word sizes. Your choice of making
    everything 64-bit gives you enormous luxuries that were not available
    in decades past!



    64-bit is simply using the native word size of the processor.

    I'm sure I've seen you espousing it for other things, too. ;-)

    ...

    You can argue that C was created when there were more diverse
    word-sizes and fewer byte-addressed machines, but it should have been
    obvious a few years after its creation, which way things were going.
    It had plenty of opportunity to get things in order.

    My point was that even today C can be used to write programs for
    machines with word sizes you would hate whereas you've been able to
    avoid the complexity of all those issues by simply avoiding supporting
    such machines. So don't be too hard on C's type system. Most of it is
    there for a reason.



    Have a look at this benchmark:

      https://github.com/sal55/langs/blob/master/bench.c

    (Notice the comment that the code was modified for 64 bits.)

    If I try and run this on 64-bit Windows, it crashes. gcc will kindly
    point out the reason: it uses both 'int' and 'long', and here assumes
    that long is wide enough (presumably wider than int) to store a pointer value.

    If you cast a pointer to a "long", you deserve all the problems you get.
    Why on earth would you think that is a good idea?


    I can fix it by changing 'long' to 'long long'. (Although for the
    purpose here, probably intptr_t would be more apt.)

    Fix it by using pointers as pointers, and don't faff around with casting
    them into random integer types! If you desperately need to hold
    different types of pointers in the same variable (though it is often an indication of poor program structure), use "void *".

    But you've got code there that casts pointers into longs, and then
    compares them to an int...


    You might argue that it is Windows' or MS' fault for not using a more sensible width of 'long', but the fact is that C ALLOWS all these
    disparate versions of types such as 'long'.


    No, the fault here is the programmer's alone.

    There /are/ cases where it makes sense to turn pointers into integers,
    because you want to do weird things such as compact storage in large
    arrays, or low-level stuff like checking alignments. The type
    "uintptr_t" has existed for that purpose for 20 years. The C
    preprocessor to check type sizes and make your own local equivalent to "uintptr_t" has existed for 50 years. There has /never/ been an excuse
    to cast a pointer to an int, or a long, or other guessed type and think
    it is a good idea.

    You are not alone in writing bad code like this, but that does not
    excuse you.
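
    (To make that legitimate use concrete, a small sketch using only
    standard C99 headers - checking pointer alignment via uintptr_t:)

    #include <stddef.h>
    #include <stdint.h>
    #include <stdbool.h>

    /* True if p is aligned to 'align' bytes; align must be a power of two. */
    static bool is_aligned(const void *p, size_t align) {
        return ((uintptr_t)p & (align - 1)) == 0;
    }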


    It might be the same width as 'int'; it might be double the width; it
    might be anything so long as it's not smaller than 'int'.

    So there is a clear advantage these days in an 'int' type that grows
    with the word size and is the same size as an address. Although this benchmark might have then gone wrong in assuming that 'int' was 32 bits.


    No, there is no sense in having the size of "int" related to the size of
    a pointer, because the two have totally different purposes and should
    not be mixed.

    "long" means "at least 32-bit". Nothing more, nothing less - and if you
    use it to mean more, you are using it incorrectly.

    Yes, it was probably poorly written too. But the language doesn't help.

    The language lets you write bad code. What language does not?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Aug 31 15:38:26 2021
    On 31/08/2021 14:31, David Brown wrote:
    On 31/08/2021 13:24, Bart wrote:

    Have a look at this benchmark:

      https://github.com/sal55/langs/blob/master/bench.c

    (Notice the comment that the code was modified for 64 bits.)

    If I try and run this on 64-bit Windows, it crashes. gcc will kindly
    point out the reason: it uses both 'int' and 'long', and here assumes
    that long is wide enough (presumably wider than int) to store a pointer
    value.

    If you cast a pointer to a "long", you deserve all the problems you get.
    Why on earth would you think that is a good idea?

    This is not my code. I made the minimum changes needed so that the
    benchmark gave meaningful results, since I don't understand how it works.


    I can fix it by changing 'long' to 'long long'. (Although for the
    purpose here, probably intptr_t would be more apt.)

    Fix it by using pointers as pointers, and don't faff around with casting
    them into random integer types! If you desperately need to hold
    different types of pointers in the same variable (though it is often an indication of poor program structure), use "void *".

    But you've got code there that casts pointers into longs, and then
    compares them to an int...

    I suspect that made more sense in the original BCPL (see below), which I
    think only had a single type, a machine word.

    That was a very simple model appropriate to the word-addressed machines
    at the time, but could now be once again useful with 64-bit machines.

    Look also at Knuth's MMIX assembler for a machine where everything is 64
    bits too; registers can hold 64-bit ints or 64-bit floats or 64-bit
    addresses.

    You might argue that it is Windows' or MS' fault for not using a more
    sensible width of 'long', but the fact is that C ALLOWS all these
    disparate versions of types such as 'long'.


    No, the fault here is the programmer's alone.

    This program used both 'int' and 'long' types; why? What assumptions
    were there about certain properties of long that weren't present for int?

    You can't pin this one on MS, because Linux32 has them the same size
    too. (Although if the intention was for long to hold an address, that
    would work here.)


    There /are/ cases where it makes sense to turn pointers into integers, because you want to do weird things such as compact storage in large
    arrays, or low-level stuff like checking alignments. The type
    "uintptr_t" has existed for that purpose for 20 years. The C
    preprocessor to check type sizes and make your own local equivalent to "uintptr_t" has existed for 50 years. There has /never/ been an excuse
    to cast a pointer to an int, or a long, or other guessed type and think
    it is a good idea.

    You are not alone in writing bad code like this, but that does not
    excuse you.

    (If you think the C is poor, look at the original BCPL code, somewhere in
    the links here:

    https://www.cl.cam.ac.uk/~mr10/Bench.html)



    It might be the same width as 'int'; it might be double the width; it
    might be anything so long as it's not smaller than 'int'.

    So there is a clear advantage these days in an 'int' type that grows
    with the word size and is the same size as an address. Although this
    benchmark might have then gone wrong in assuming that 'int' was 32 bits.


    No, there is no sense in having the size of "int" related to the size of
    a pointer, because the two have totally different purposes and should
    not be mixed.

    It makes life simpler in many, many situations where both have to exist
    in the same location. This hasn't always been possible (eg. 8086 with
    16-bit ints and 32-bit pointers and floats); but it is now.

    For example, I use bytecode which is a 64-bit array of values, where any element can variously be:

    * A bytecode index (8 bits) ...
    * ... usually fixed up as a pointer to a function handler (64 bits)
    * An immediate 64-bit int operand
    * An immediate 64-bit float operand
    * A label address (64 bits)
    * A pointer to a symbol table entry
    * The address of a static variable
    * The offset of a local variable
    * A pointer to a string object

    You get the idea.


    "long" means "at least 32-bit". Nothing more, nothing less - and if you
    use it to mean more, you are using it incorrectly.

    Yes, it was probably poorly written too. But the language doesn't help.

    The language lets you write bad code. What language does not?

    C makes it easier in this case by not being strict with its type system.
    This example was from the 1990s, but you can still write such code now.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Aug 31 17:32:29 2021
    On 31/08/2021 16:38, Bart wrote:
    On 31/08/2021 14:31, David Brown wrote:
    On 31/08/2021 13:24, Bart wrote:

    Have a look at this benchmark:

       https://github.com/sal55/langs/blob/master/bench.c

    (Notice the comment that the code was modified for 64 bits.)

    If I try and run this on 64-bit Windows, it crashes. gcc will kindly
    point out the reason: it uses both 'int' and 'long', and here assumes
    that long is wide enough (presumably wider than int) to store a pointer
    value.

    If you cast a pointer to a "long", you deserve all the problems you get.
      Why on earth would you think that is a good idea?

    This is not my code. I made the minimum changes needed so that the
    benchmark gave meaningful results, since I don't understand how it works.


    Fair enough.


    I can fix it by changing 'long' to 'long long'. (Although for the
    purpose here, probably intptr_t would be more apt.)

    Fix it by using pointers as pointers, and don't faff around with casting
    them into random integer types!  If you desperately need to hold
    different types of pointers in the same variable (though it is often an
    indication of poor program structure), use "void *".

    But you've got code there that casts pointers into longs, and then
    compares them to an int...

    I suspect that made more sense in the original BCPL (see below), which I think only had a single type, a machine word.


    Yes, that would be the case for BCPL.

    That was a very simple model appropriate to the word-addressed machines
    at the time, but could now be once again useful with 64-bit machines.

    Look also at Knuth's MMIX assembler for a machine where everything is 64
    bits too; registers can hold 64-bit ints or 64-bit floats or 64-bit addresses.

    I have great respect for Knuth and his work. But his MMIX is a hideous
    design even for that time, and a serious hindrance to understanding his algorithms. (There's nothing wrong with 64-bit registers that can hold integers, floats or addresses - that bit's fine. IMHO, of course.)


    You might argue that it is Windows' or MS' fault for not using a more
    sensible width of 'long', but the fact is that C ALLOWS all these
    disparate versions of types such as 'long'.


    No, the fault here is the programmer's alone.

    This program used both 'int' and 'long' types; why? What assumptions
    were there about certain properties of long that weren't present for int?


    The program is assuming that "long" is big enough to hold addresses.
    And it also at one point compares an int against a pointer converted to
    a long. Something is screwed there. (I haven't tried to follow the
    program logic enough to see what.)

    You can't pin this one on MS, because Linux32 has them the same size
    too. (Although if the intention was for long to hold an address, that
    would work here.)


    No, I pin it on the programmer.


    There /are/ cases where it makes sense to turn pointers into integers,
    because you want to do weird things such as compact storage in large
    arrays, or low-level stuff like checking alignments.  The type
    "uintptr_t" has existed for that purpose for 20 years.  The C
    preprocessor to check type sizes and make your own local equivalent to
    "uintptr_t" has existed for 50 years.  There has /never/ been an excuse
    to cast a pointer to an int, or a long, or other guessed type and think
    it is a good idea.

    You are not alone in writing bad code like this, but that does not
    excuse you.

    (If you think the C is poor, look at the original BCPL code, somewhere in
    the links here:

    https://www.cl.cam.ac.uk/~mr10/Bench.html)


    Thanks, but no thanks :-) I am happy to have missed out on BCPL.

    Usually translating code from one programming language into another is as successful as translating poetry from one human language to another.
    Unless you are willing to re-write significant parts in a manner
    appropriate to the new language, it is never going to be good.




    It might be the same width as 'int'; it might be double the width; it
    might be anything so long as it's not smaller than 'int'.

    So there is a clear advantage these days in an 'int' type that grows
    with the word size and is the same size as an address. Although this
    benchmark might have then gone wrong in assuming that 'int' was 32 bits.

    No, there is no sense in having the size of "int" related to the size of
    a pointer, because the two have totally different purposes and should
    not be mixed.

    It makes life simpler in many, many situations where both have to exist
    in the same location.

    What situations? When do you need to interpret a pointer as an integer,
    or an integer as a pointer? (You can share memory space between them
    with a union - that's a different thing.)

    This hasn't always been possible (eg. 8086 with
    16-bit ints and 32-bit pointers and floats); but it is now.

    For example, I use bytecode which is a 64-bit array of values, where any element can variously be:

      * A bytecode index (8 bits) ...
      * ... usually fixed up as a pointer to a function handler (64 bits)
      * An immediate 64-bit int operand
      * An immediate 64-bit float operand
      * A label address (64 bits)
      * A pointer to a symbol table entry
      * The address of a static variable
      * The offset of a local variable
      * A pointer to a string object

    You get the idea.


    Yes - it's a union.

    There is no requirement or even benefit in having pointers and integers
    the same size.


    "long" means "at least 32-bit".  Nothing more, nothing less - and if you
    use it to mean more, you are using it incorrectly.

    Yes, it was probably poorly written too. But the language doesn't help.

    The language lets you write bad code.  What language does not?

    C makes it easier in this case by not being strict with its type system.
    This example was from the 1990s, but you can still write such code now.


    C is a lot stricter with types than many people seem to think. Casting
    is a way of writing "I know this is probably a bad idea, and it is
    breaking the type system, but do it anyway". C views pointers and
    integer types as completely separate, but gives you a way to work around
    those rules.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Tue Aug 31 18:59:41 2021
    On 31/08/2021 18:45, James Harris wrote:
    On 30/08/2021 23:25, Bart wrote:


    Note that even C compilers for smaller devices such as Z80 can be
    rather specialist products.

    Any idea why C for Z80 should be a specialist product?


    Well it's one of the devices listed here:

    http://sdcc.sourceforge.net

    Possibly because it also comes in packages with integrated peripherals
    that may need language support.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Tue Aug 31 18:45:14 2021
    On 30/08/2021 23:25, Bart wrote:
    On 30/08/2021 21:09, James Harris wrote:
    On 30/08/2021 12:45, Bart wrote:

    ...

    So don't be too hard on C's type system. Most of it is
    there for a reason.

    OK. It all depends on how much you want to emulate C in that regard, and
    so inherit a lot of the same mess in its type system.

    ...

    But it seems that you are keen to cover everything in one language, from
    the tiniest microcontrollers, through desktop PCs and current
    supercomputers, and up to massively large machines that need 128 bits to specify file sizes.

    It seems a lot when you put it that way but AISI a program should
    express an algorithm and the compiler should action that algorithm on
    whatever CPU is the target.


    I'd say that's being a little ambitious.

    It may be. Though emitting code for different CPUs looks as though it is
    going to be one of the less difficult issues!


    (Are you still writing OSes too?)

    I have an OS project. It was largely dormant for years but I recently
    started to write code for it in my own language now that I have a
    primitive form of my language running. It was, in fact, to help with the
    OS that I started the language project in the first place - but it
    rather developed a life of its own!

    ...

    you've been able to
    avoid the complexity of all those issues by simply avoiding supporting such machines.

    It wasn't my job to support every possible machine.

    Sure. Not saying you should. It's just that what you see as wrong with
    C's type system may be partly due to it having to support so many
    diverse CPUs.


    Note that even C compilers for smaller devices such as Z80 can be rather specialist products.

    Any idea why C for Z80 should be a specialist product?


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Aug 31 18:15:48 2021
    On 31/08/2021 16:32, David Brown wrote:
    On 31/08/2021 16:38, Bart wrote:

    It makes life simpler in many, many situations where both have to exist
    in the same location.

    What situations? When do you need to interpret a pointer as an integer,
    or an integer as a pointer? (You can share memory space between them
    with a union - that's a different thing.)

    A union is a clunkier way of doing the same thing. But you can't as easily
    access the int/float/pointer representation.


    For example, I use bytecode which is a 64-bit array of values, where any
    element can variously be:

      * A bytecode index (8 bits) ...
      * ... usually fixed up as a pointer to a function handler (64 bits)
      * An immediate 64-bit int operand
      * An immediate 64-bit float operand
      * A label address (64 bits)
      * A pointer to a symbol table entry
      * The address of a static variable
      * The offset of a local variable
      * A pointer to a string object

    You get the idea.


    Yes - it's a union.

    There is no requirement or even benefit in having pointers and integers
    the same size.

    There is a benefit in this case. On a system with 32-bit pointers, if I
    still wanted 64-bit immediates, it would be too inefficient to have
    64-bit bytecode as the vast majority of entries wouldn't need it.

    I'd have to use 32-bit bytecode with immediates spanning two operands,
    or indexed via indices into a table.

    But where everything is the same size anyway, it makes life easier with
    simpler decisions.

    C makes it easier in this case by not being strict with its type system.
    This example was from the 1990s, but you can still write such code now.


    C is a lot stricter with types than many people seem to think.

    Not in a useful way. I needed to convert that benchmark to my language,
    and spent a lot of time changing all those zeros used for NULL into nil.

    And when it really counts, C lets you down. If I create this data
    structure in my language:

    ref[]ref int P

    (Pointer to array of pointer to int.) Then to access one of those ints,
    the full syntax (because the first ^ is optional) is:

    P^[i]^

    That is, deref, index, deref, exactly following the type, and in the
    same left-to-right order. In C, it ought to be something like this:

    int *((*P)[]);

    int a = *((*P)[i]);

    That is, deref, index, deref like above. Except you can also do:

    int b = ***P; // deref deref deref
    int c = P[i][j][k]; // index index index
    int d = (*P[i])[j]; // index deref index

    What the hell happened to C's type strictness?! Why am I being allowed
    to write nonsense?

    I can write that access 8 different ways, but only one is correct (2**N combinations of index/deref ops).

    If I try any of those seven wrong ways in my language:

    P^^^
    P[i,j,k]
    P[i]^[j]

    it generates a type error each time.

    However this is straying into other crazy aspects of C than type
    denotations.
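
    (A compile-only sketch of the above - my code, with an arbitrary array
    size of 10 since C needs a complete array type for the indexing
    variants. All four accesses type-check, though only the first is the
    intended one:)

    int *(*P)[10];
    int i, j, k;

    void demo(void) {
        int a = *((*P)[i]);   /* deref, index, deref - the intended access */
        int b = ***P;         /* deref, deref, deref */
        int c = P[i][j][k];   /* index, index, index */
        int d = (*P[i])[j];   /* index, deref, index */
        (void)a; (void)b; (void)c; (void)d;
    }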

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Aug 31 19:35:00 2021
    On 31/08/2021 19:15, Bart wrote:
    On 31/08/2021 16:32, David Brown wrote:
    On 31/08/2021 16:38, Bart wrote:

    It makes life simpler in many, many situations where both have to exist
    in the same location.

    What situations?  When do you need to interpret a pointer as an integer,
    or an integer as a pointer?  (You can share memory space between them
    with a union - that's a different thing.)

    A union is a clunkier way of doing the same thing. But you can't as easily access the int/float/pointer representation.

    What? A union is a simple, type-safe way of storing different data
    types in the same memory slot. And in C (but not C++), you can use it
    to view the representation of different types - whereas a cast might
    change it. (That can, in theory, happen when casting between pointers
    and integers - though it won't do so in practice on mainstream systems.
    It definitely happens when casting between floats and integers, however.)
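
    (A minimal sketch of that point: writing one union member and reading
    another inspects the stored bits, where a cast would convert the value.)

    #include <stdio.h>
    #include <stdint.h>

    union slot {
        int64_t  i;
        double   f;
        void    *p;
    };

    int main(void) {
        union slot s;
        s.f = 1.0;
        /* prints 3ff0000000000000 on IEEE-754 systems;
           (int64_t)1.0 would instead give 1 */
        printf("%016llx\n", (unsigned long long)s.i);
        return 0;
    }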




    For example, I use bytecode which is a 64-bit array of values, where any
    element can variously be:

       * A bytecode index (8 bits) ...
       * ... usually fixed up as a pointer to a function handler (64 bits)
       * An immediate 64-bit int operand
       * An immediate 64-bit float operand
       * A label address (64 bits)
       * A pointer to a symbol table entry
       * The address of a static variable
       * The offset of a local variable
       * A pointer to a string object

    You get the idea.


    Yes - it's a union.

    There is no requirement or even benefit in having pointers and integers
    the same size.

    There is a benefit in this case. On a system with 32-bit pointers, if I
    still wanted 64-bit immediates, it would be too inefficient to have
    64-bit bytecode as the vast majority of entries wouldn't need it.

    I'd have to use 32-bit bytecode with immediates spanning two operands,
    or indexed via indices into a table.


    Unions are the appropriate solution here too. (You could also use
    memcpy, or copy directly using unsigned char pointers, but that would be inefficient if you don't have a good enough compiler.) Casting won't
    work when there are floats involved.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Tue Aug 31 19:03:22 2021
    On 31/08/2021 11:57, David Brown wrote:

    ...

    The point is that when you want a /number/ - an integer - then 64-bit is
    more than sufficient for most purposes while also not being too big to
    handle efficiently on most modern cpus.  You could almost say that about 32-bit, but you can definitely say it now that we have 64-bit.  This is why no one
    has bothered making a 128-bit processor - there simply isn't any use for
    such large numbers as plain integers.

    In terms of the main registers 64 bits is fine, especially when they are
    used for memory addresses as well as integers. However, that's not the
    only option. ICL made a machine with a 128-bit accumulator as far back
    as the 1970s!

    Today, ZFS works with 128-bit numbers. So does MD5, and probably others.

    SHA-2 apparently works with sizes from 224 bits upwards.

    https://en.wikipedia.org/wiki/SHA-2



    I think that the language should allow any range. If the target does not
    support the particular range, the compiler should deploy a library
    implementation, giving a warning.


    Don't misunderstand me here - there are use-cases for all kinds of sizes
    and types. A language that intends to be efficient is going to have to
    be able to handle arrays of smaller types. And there has to be ways to handle bigger types for their occasional usage. The point is that there
    is little need for multiple sizes for "normal" everyday usage - a signed 64-bit type really will work for practically everything.

    These seem OK:

    int 64
    int 128
    uint 224
    uint 256
    etc

    If a language includes code to manipulate 64-bit integers on 16-bit
    hardware I cannot see a problem with it manipulating 256-bit integers on
    64-bit hardware.

    It's also surely better for a programmer to be able to write

    a + b

    instead of

    bignum128_add(a, b)
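
    (As an aside: GCC and Clang already give the first form on 64-bit
    targets via the non-standard __int128 extension - a sketch, not ISO C,
    and bignum128_add is the hypothetical library-style name from above:)

    #include <stdio.h>

    int main(void) {
        __int128 a = 1, b = 2;
        __int128 c = a + b;   /* operator syntax, as argued for above;
                                 versus c = bignum128_add(a, b) */
        printf("%lld\n", (long long)c);
        return 0;
    }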


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Tue Aug 31 19:27:25 2021
    On 30/08/2021 19:06, David Brown wrote:

    ...

    There is also a quotation from Bjarne Stroustrup along the lines of
    every complicated language having a simple language at its core
    struggling to get out.

    You could tell him it's called C. :-)


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Tue Aug 31 19:30:36 2021
    On 31/08/2021 08:32, David Brown wrote:
    On 30/08/2021 22:50, James Harris wrote:

    ...

    Are you saying the C standards have an official definition of 'scalar'
    (which is along the lines you mention)?


    Yes. 6.2.5p21 in C17. There are official drafts of the C standards available freely (the actual published ISO standards cost money, but the final drafts are free and just as good). Google for C17 draft, document N2176, for example.

    That's really interesting. I didn't know that C was still under
    development.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Tue Aug 31 19:37:25 2021
    On 31/08/2021 19:03, James Harris wrote:
    On 31/08/2021 11:57, David Brown wrote:

    ...

    The point is that when you want a /number/ - an integer - then 64-bit is
    more than sufficient for most purposes while also not being too big to
    handle efficiently on most modern cpus.  You could almost say that about
    32-bit, but you can definitely say it now that we have 64-bit.  This is why no one
    has bothered making a 128-bit processor - there simply isn't any use for
    such large numbers as plain integers.

    In terms of the main registers 64 bits is fine, especially when they are
    used for memory addresses as well as integers. However, that's not the
    only option. ICL made a machine with a 128-bit accumulator as far back
    as the 1970s!

    Today, ZFS works with 128-bit numbers. So does MD5, and probably others.

    SHA-2 apparently works with sizes from 224 bits upwards.

      https://en.wikipedia.org/wiki/SHA-2



    I think that the language should allow any range. If the target does not
    support the particular range, the compiler should deploy a library
    implementation, giving a warning.


    Don't misunderstand me here - there are use-cases for all kinds of sizes
    and types.  A language that intends to be efficient is going to have to
    be able to handle arrays of smaller types.  And there has to be ways to
    handle bigger types for their occasional usage.  The point is that there
    is little need for multiple sizes for "normal" everyday usage - a signed
    64-bit type really will work for practically everything.

    These seem OK:

      int 64
      int 128
      uint 224
      uint 256
      etc

    If a language includes code to manipulate 64-bit integers on 16-bit
    hardware I cannot see a problem with it manipulating 256-bit integers on 64-bit hardware.

    Have you seen C emulation code for even 128/128, based on 64-bit ops?
    It's scary! It might be a bit shorter in ASM, but it can still be a lot
    of work.

    Below I've listed all the ops in my IL that need to deal with 64-bit
    integers (floats excluded), and which, if fully supported, need to also
    deal with i128/u128 types.

    At the moment, in my regular compiler, 128-bit support has many holes.
    Notably for divide; it only has i128/i64, which is needed to be able to
    print 128-bit values.

    BTW your u224 type is likely to need 64-bit alignment, so will probably
    occupy 256 bits anyway. So I can't see the point of that if you've also
    already got a 256-bit type.

    Remember support will be hard enough without dealing with odd sizes.

    --------------------------------
    Binary ops:

    add
    sub
    mul
    idiv
    irem
    iand (bitwise)
    ior
    ixor
    shl
    shr
    min
    max
    eq
    ne
    lt
    le
    ge
    gt
    andl (logical)
    orl
    in (a in b..c)

    Unary ops:

    neg
    abs
    inot
    notl
    istruel

    sqr (square)
    sign
    atans
    power

    Binary inplace ops:

    addto (a +:= b)
    subto
    multo
    idivto
    iremto
    iandto
    iorto
    ixorto
    shlto
    shrto
    minto
    maxto
    andlto
    orlto

    Unary inplace ops:

    negto (-:= a, which means a := -a)
    absto
    inotto
    notlto
    istruelto

    incrto (++a)
    decrto
    loadincr (a++ and use old value)
    loaddecr
    incrload (++a and use new value)
    decrload

    Conversions:

    fix (float to int)
    float (int to float)
    widen (from narrower int)

    Bit/Bitfield ops:

    getdotindex
    putdotindex
    getdotslice
    putdotslice

    Call/Return:

    pass by value
    return by value
    return multiple values

    I/O (likely to be external to IL; implemented in user code, not by IL
    backend):

    print/tostring in various bases
    read/fromstring


    If a language includes code to manipulate 64-bit integers on 16-bit
    hardware I cannot see a problem with it manipulating 256-bit integers on 64-bit hardware.

    There are going to be more reasons to emulate 64 bits than 256 bits:

    * To be able to run, even inefficiently, the same algorithms that run on
    a native 64-bit machine

    * To be able to work with everyday quantities such as file or drive
    sizes, or the world population, ....

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Tue Aug 31 19:41:20 2021
    On 31/08/2021 19:30, James Harris wrote:
    On 31/08/2021 08:32, David Brown wrote:
    On 30/08/2021 22:50, James Harris wrote:

    ...

    Are you saying the C standards have an official definition of 'scalar'
    (which is along the lines you mention)?


    Yes.  6.2.5p21 in C17.  There are official drafts of the C standards
    available freely (the actual published ISO standards cost money, but the
    final drafts are free and just as good).  Google for C17 draft, document
    N2176, for example.

    That's really interesting. I didn't know that C was still under
    development.

    What they're planning to add actually probably isn't very interesting
    (it never is).

    For example being able to do this in a function definition (in C2x):

    void fn(int a, int, int c) {
    }

    Can you spot what it is?

    Yes, you can leave out a parameter name without it being an error. (So
    that if you do it inadvertently, you might silently end up using some
    global of that name instead!)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Aug 31 21:52:25 2021
    On 2021-08-31 20:37, Bart wrote:

    Have you seen C emulation code for even 128/128, based on 64-bit ops?
    It's scary! It might be a bit shorter in ASM, but it can still be a lot
    of work.

    It is easier with 32-bit because of handling the carry (since you have
    no access to it in a high-level language) and because multiplication is simpler. But in general nothing scary, all algorithms are well known.
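
    (A minimal sketch of what "handling the carry" means in portable C:
    adding two 128-bit values held as two 64-bit limbs, recovering the
    carry without any access to the CPU flag.)

    #include <stdint.h>

    typedef struct { uint64_t lo, hi; } u128;

    static u128 add128(u128 a, u128 b) {
        u128 r;
        r.lo = a.lo + b.lo;
        r.hi = a.hi + b.hi + (r.lo < a.lo);  /* wrap-around implies carry */
        return r;
    }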

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Aug 31 22:10:51 2021
    On 31/08/2021 20:52, Dmitry A. Kazakov wrote:
    On 2021-08-31 20:37, Bart wrote:

    Have you seen C emulation code for even 128/128, based on 64-bit ops?
    It's scary! It might be a bit shorter in ASM, but it can still be a
    lot of work.

    It is easier with 32-bit because of handling the carry (since you have
    no access to it in a high-level language) and because multiplication is simpler. But in general nothing scary, all algorithms are well known.


    Have a look at divmod128by128() here:

    https://github.com/sal55/langs/blob/master/muldiv128.c

    That in turn calls divmod128by64, shiftleft128, shiftright128,
    div128by64, inc128, dec128, mult128, mult64to128, sub128 [not shown],
    nlz128 [not shown] and compare128.

    It's not scary at all!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Wed Sep 1 10:13:16 2021
    On 31/08/2021 20:41, Bart wrote:
    On 31/08/2021 19:30, James Harris wrote:
    On 31/08/2021 08:32, David Brown wrote:
    On 30/08/2021 22:50, James Harris wrote:

    ...

    Are you saying the C standards have an official definition of 'scalar'
    (which is along the lines you mention)?


    Yes.  6.2.5p21 in C17.  There are official drafts of the C standards
    available freely (the actual published ISO standards cost money, but the
    final drafts are free and just as good).  Google for C17 draft, document
    N2176, for example.

    That's really interesting. I didn't know that C was still under
    development.

    What they're planning to add actually probably isn't very interesting
    (it never is).

    That's intentional. C has been mostly stable since C99 - and stability
    and backwards compatibility are two of its most important features as a language. Changes are only made if they are really important, tried and
    tested (in mainstream compilers and/or other languages, primarily C++),
    give something that can't easily be achieved in other ways, and are not
    going to risk conflicts with existing code.

    So C11 added multi-threading semantics and atomics, along with
    standardising a few common extensions found on many compilers, and
    _Generic (to give a standard way of implementing C99 <tgmath.h> without compiler-specific extensions). C17 was mostly just bug-fixes and minor clarifications. C2x is expected to tidy a few obsolescent points such
    as removing non-prototype function declarations (and making "int foo();"
    the same as "int foo(void);", just like C++). Signed integer format
    will be fixed at two's complement (but overflow is still undefined
    behaviour, for which I am glad). These are all small things, but nice.


    For example being able to do this in a function definition (in C2x):

      void fn(int a, int, int c) {
      }

    Can you spot what it is?

    Yes, you can leave out a parameter name without it being an error. (So
    that if you do it inadvertently, you might silently end up using some
    global of that name instead!)


    Trust you to find potential problems in every feature of C.

    For other programmers, this is a good thing that some will choose to
    use. It means that if you have a function implementation that does not
    use a parameter, you can omit the name. This documents to readers that
    the parameter is not used. And it means you can have compiler warnings
    for unused parameters - if you have named a parameter and have not used
    it in the function, the compiler can warn you that you've probably
    forgotten something.

    It is a feature that is more useful in C++ than C, since it plays well
    with function overloading, and its purpose here is partly motivated by
    removing an unnecessary mismatch with C++. But like anonymous unions,
    it is just saying that you don't need to name things you won't use.

    As for conflicts with global names - design and structure your code
    sensibly, use sensible names, and you won't have any problems. If you
    have a global variable called "b" and use "b" for parameter names, you
    have already screwed up regardless of this new feature.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Sep 1 11:46:11 2021
    On 01/09/2021 09:13, David Brown wrote:
    On 31/08/2021 20:41, Bart wrote:
    On 31/08/2021 19:30, James Harris wrote:
    On 31/08/2021 08:32, David Brown wrote:
    On 30/08/2021 22:50, James Harris wrote:

    ...

    Are you saying the C standards have an official definition of 'scalar'
    (which is along the lines you mention)?


    Yes.  6.2.5p21 in C17.  There are official drafts of the C standards
    available freely (the actual published ISO standards cost money, but the
    final drafts are free and just as good).  Google for C17 draft, document
    N2176, for example.

    That's really interesting. I didn't know that C was still under
    development.

    What they're planning to add actually probably isn't very interesting
    (it never is).

    That's intentional. C has been mostly stable since C99 - and stability
    and backwards compatibility are two of its most important features as a language. Changes are only made if they are really important, tried and tested (in mainstream compilers and/or other languages, primarily C++),
    give something that can't easily be achieved in other ways, and are not
    going to risk conflicts with existing code.

    That one is pretty easy: if a parameter name isn't used, then don't use
    it: nothing else required! If you want to warn the user and/or compiler that
    it is intentionally not used, then just give it a special name (leading underscores etc).

    Then new code will still be compatible with existing compilers, and
    tools that parse source code (bearing in mind it will take 20 years
    before the feature will be generally available).
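
    For example, this already compiles everywhere and keeps the intent
    visible (a sketch of the conventional workaround):

    int fn(int a, int _b, int c) {
        (void)_b;        /* intentionally unused; silences warnings */
        return a + c;
    }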


    For example being able to do this in a function definition (in C2x):

      void fn(int a, int, int c) {
      }

    Can you spot what it is?

    Yes, you can leave out a parameter name without it being an error. (So
    that if you do it inadvertently, you might silently end up using some
    global of that name instead!)


    Trust you to find potential problems in every feature of C.

    Inadvertent clashes with identically-named globals is an issue with a
    few languages, including C, and all of mine.

    You don't want to exacerbate the problem.

    (My new script language is particularly susceptible, since it allows
    'open' statements, not enclosed in a function, as well as out-of-order definitions. That means any variable used there, even 'i' in a loop
    statement, is automatically global.

    Since functions don't need to declare variables, if one uses 'i' too, it
    will clash with the global 'i'.

    I don't want to introduce new ways of dealing with scope, since my
    languages have worked that way for decades. But here I use a guideline
    (also enforced in the compiler but currently disabled) of not mixing
    open statements with actual functions; I need to put them inside start()
    or main() which is automatically called.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Wed Sep 1 18:13:15 2021
    On 31/08/2021 19:37, Bart wrote:
    On 31/08/2021 19:03, James Harris wrote:


    ...


    If a language includes code to manipulate 64-bit integers on 16-bit
    hardware I cannot see a problem with it manipulating 256-bit integers
    on 64-bit hardware.

    Have you seen C emulation code for even 128/128, based on 64-bit ops?
    It's scary! It might be a bit shorter in ASM, but it can still be a lot
    of work.

    Below I've listed all the ops in my IL that need to deal with 64-bit
    integers (floats excluded), and which, if fully supported, need to also
    deal with i128/u128 types.

    I've got some C for wide unsigned operations. I think it's my
    own code from a number of years ago (2015). Division is a bit hairy -
    but none of the code is scary!

    It all looks quite logical. It covers

    shifts, gets, sets, add, sub, mul, div, rem

    The bitwise operations and comparisons you mention would be trivial
    because the data structure is just an array

    ui16 value[4];

    though I wrap it in a struct so the compiler will pass it by value. I
    think the code was written to give me 64-bit numbers in a 16-bit
    environment. That's why the array length is 4.

    And note that it's 4, not 2. Theoretically, the code would work for ANY
    size of integer as long as it fits into a whole number of words. Add
    some overflow handling and signed integers (which would not be so easy)
    and I may have the arbitrary-length arithmetic you keep telling me is so difficult to do!


    At the moment, in my regular compiler, 128-bit support has many holes. Notably for divide; it only has i128/i64, which is needed to be able to
    print 128-bit values.

    BTW your u224 type is likely to need 64-bit alignment, so will probably occupy 256 bits anyway. So I can't see the point of that if you've also already got a 256-bit type.

    That's OK. I mentioned u224 because SHA-2 uses such a size. IOW there's
    a real-world requirement for such a value. Another reason to provide
    integers of arbitrary size - especially if it costs almost nothing.


    Remember support will be hard enough without dealing with odd sizes.

    Why would support be hard?

    In terms of storage I expect I'd round up the size at least to a whole
    number of words and treat incursion into upper bits in the same way as overflow.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Wed Sep 1 20:22:19 2021
    On 01/09/2021 18:13, James Harris wrote:
    On 31/08/2021 19:37, Bart wrote:
    On 31/08/2021 19:03, James Harris wrote:


    ...


    If a language includes code to manipulate 64-bit integers on 16-bit
    hardware I cannot see a problem with it manipulating 256-bit integers
    on 64-bit hardware.

    Have you seen C emulation code for even 128/128, based on 64-bit ops?
    It's scary! It might be a bit shorter in ASM, but it can still be a
    lot of work.

    Below I've listed all the ops in my IL that need to deal with 64-bit
    integers (floats excluded), and which, if fully supported, need to
    also deal with i128/u128 types.

    I've got some C for unsigned wide unsigned operations. I think it's my
    own code from a number of years ago (2015). Division is a bit hairy -
    but none of the code is scary!

    It all looks quite logical. It covers

      shifts, gets, sets, add, sub, mul, div, rem

    The bitwise operations and comparisons you mention would be trivial
    because the data structure is just an array

      ui16 value[4];

    Ops like and, or, xor, equals, not equals are fairly easy because you
    just apply them linearly along the array. (Equality needs the results of
    each element combined.)

    Ops such as add and subtract, relative compare, multiply etc all start
    getting a bit more elaborate because the elements become less
    independent of each other.
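
    As a rough C sketch of that difference (illustrative only; it assumes a
    little-endian limb order over the ui16 value[4] layout):

        #include <stdint.h>

        typedef struct { uint16_t v[4]; } wide64;  /* least significant limb first */

        /* Bitwise AND: the limbs are independent, a plain loop suffices. */
        wide64 wide_and(wide64 a, wide64 b) {
            wide64 r;
            for (int i = 0; i < 4; i++) r.v[i] = a.v[i] & b.v[i];
            return r;
        }

        /* Add: the limbs are coupled, a carry ripples up the array. */
        wide64 wide_add(wide64 a, wide64 b) {
            wide64 r;
            uint32_t carry = 0;
            for (int i = 0; i < 4; i++) {
                uint32_t t = (uint32_t)a.v[i] + b.v[i] + carry;
                r.v[i] = (uint16_t)t;   /* keep the low 16 bits */
                carry  = t >> 16;       /* 0 or 1 into the next limb */
            }
            return r;   /* a carry out of the top limb is lost, i.e. wraps */
        }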


    though I wrap it in a struct so the compiler will pass it by value. I
    think the code was written to give me 64-bit numbers in a 16-bit
    environment. That's why the array length is 4.

    And note that it's 4, not 2. Theoretically, the code would work for ANY
    size of integer as long as it fits into a whole number of words. Add
    some overflow handling and signed integers (which would not be so easy)
    and I may have the arbitrary-length arithmetic you keep telling me is so difficult to do!

    Arbitrary precision is different from fixed precision with dedicated
    routines or inline code for each width of number.

    It's not necessarily hard, just less efficient, if the routines are
    truly arbitrary: one input is N, the width of the number; they involve
    loops; and the numeric inputs and outputs are likely to be to and from
    memory.

    You wouldn't want to use such routines where N is the same width as the
    machine word size; or where N is narrower; nor really where N is only
    twice the machine word width. Here you want inputs and outputs to be in
    registers.

    But it'll be OK for super-wide numbers of several times or many times
    the machine word size.


    At the moment, in my regular compiler, 128-bit support has many holes.
    Notably for divide; it only has i128/i64, which is needed to be able
    to print 128-bit values.

    BTW your u224 type is likely to need 64-bit alignment, so will
    probably occupy 256 bits anyway. So I can't see the point of that if
    you've also already got a 256-bit type.

    That's OK. I mentioned u224 because SHA-2 uses such a size. IOW there's
    a real-world requirement for such a value. Another reason to provide
    integers of arbitrary size - especially if it costs almost nothing.


    Remember support will be hard enough without dealing with odd sizes.

    Why would support be hard?

    In terms of storage I expect I'd round up the size at least to a whole
    number of words and treat incursion into upper bits in the same way as overflow.


    So how would it work even with a simple example of 53 bits? You round up
    the storage to 64 bits, so it might as well be i64. Arithmetic would
    likely also be done using the machine's 64-bit operations.

    But you now also need:

    * To truncate the results to 53 bits, or check that the top 11 bits
    don't indicate overflow (as the processor flags won't tell you)

    * When loading signed 53 bits, presumably as two's complement, your sign
    will be in bit 52, and bits 53 to 63 need to be properly sign-extended
    if that pattern is not already stored in memory.

    * To take extra care with right shifts: you want copies of bit 52
    shifted to bit 51 etc, but the machine's SAR instruction will assume the
    sign bit is bit 63.

    * If you want wraparound overflow, this needs to be specially
    programmed.
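
    As a minimal C sketch of those steps (illustrative; it assumes
    wraparound semantics, the value kept in a 64-bit word, and the usual
    arithmetic right shift on signed types):

        #include <stdint.h>

        #define WIDTH 53    /* sign bit is bit 52; bits 53..63 are spare */

        /* Truncate a 64-bit result to the low 53 bits (wraparound). */
        static uint64_t trunc53(uint64_t x) {
            return x & ((1ULL << WIDTH) - 1);
        }

        /* Unsigned overflow check: the top 11 bits must all be zero. */
        static int overflows53(uint64_t x) {
            return (x >> WIDTH) != 0;
        }

        /* Sign-extend a 53-bit two's-complement value to 64 bits so that
           the machine's 64-bit SAR and comparisons behave correctly. */
        static int64_t sext53(uint64_t x) {
            return (int64_t)(x << (64 - WIDTH)) >> (64 - WIDTH);
        }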

    I feel I'm just scratching the surface here. I have enough trouble just
    with the following 3 categories:

    * Short (8/16/32 bits)
    * Normal (64 bits)
    * Wide (128 bits)

    With signed/unsigned for each type as another factor.

    Most arithmetic will be done on Normal/Wide (64 or 128 bits); operations
    are only done on Short types for in-place arithmetic: A[i]+:=x where A
    is a byte array, for example.

    Imagine if I also had to deal with 9-15, 17-31, 33-62, and 65-127 bits!

    (With my own routines for arbitrary precision, where I control that
    precision, it is in terms of 'limbs' - a limb being one element (a word
    etc) of the number, the ui16 of your example - not in terms of decimal
    or binary digits.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Wed Sep 1 21:54:41 2021
    On 2021-09-01 21:22, Bart wrote:

    [...]

    I feel I'm just scratching the surface here.

    Yep, when you ignore mathematics, it bites you in the butt...

    If you paid respect to it, you would notice that modular numbers never
    overflow. If the result is outside the range 0..K-1, you subtract (or
    add) K until it is in range.

    For integers there is no problem either. As with modular numbers you do
    all computations using a wider machine range. Only when you have to
    assign the result to a variable or pass it to a subprogram do you check
    that it is in range. That is all.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Wed Sep 1 21:50:32 2021
    On 01/09/2021 20:54, Dmitry A. Kazakov wrote:
    On 2021-09-01 21:22, Bart wrote:

    [...]

    I feel I'm just scratching the surface here.

    Yep, when you ignore mathematics, it bites you in the butt...

    If you paid respect to it, you would notice that modular numbers never
    overflow. If the result is outside the range 0..K-1, you subtract (or
    add) K until it is in range.

    For integers there is no problem either. As with modular numbers you do
    all computations using a wider machine range. Only when you have to
    assign the result to a variable or pass it to a subprogram do you check
    that it is in range. That is all.

    It's not that simple. Say you have variables of 8 bits each, but your
    language says that all calculations will be done at 64 bits.

    So what happens when you calculate A << B >> C; do you do everything at
    64 bits and then only truncate to 8 bits when storing in some 8-bit destination?

    Or do you have to do that truncation after every binary operation? In
    which case, you have to bring it back up to a valid 64-bit value for the
    next operation (eg. sign extend a possibly 13-bit result).

    Those decisions on how intermediate values are handled can give
    different results. And can make it slower.

    What about multiplying a 13-bit value by a 17-bit one; what should the intermediate result be truncated to?
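
    As a small C illustration of how the two choices diverge (the values
    here are hypothetical, picked to expose the difference):

        #include <stdio.h>
        #include <stdint.h>

        int main(void) {
            uint8_t a = 0xF0, b = 4, c = 4;

            /* Everything at 64 bits, truncate only on the final store. */
            uint8_t wide = (uint8_t)(((uint64_t)a << b) >> c);

            /* Truncate to 8 bits after every operation instead. */
            uint8_t narrow = (uint8_t)((uint8_t)((uint64_t)a << b) >> c);

            printf("%02X %02X\n", wide, narrow);   /* prints F0 00 */
            return 0;
        }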

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Sep 2 09:30:47 2021
    On 2021-09-01 22:50, Bart wrote:
    On 01/09/2021 20:54, Dmitry A. Kazakov wrote:
    On 2021-09-01 21:22, Bart wrote:

    [...]

    I feel I'm just scratching the surface here.

    Yep, when you ignore mathematics, it bites you in the butt...

    If you paid respect to it, you would notice that modular numbers never
    overflow. If the result is outside the range 0..K-1, you subtract (or
    add) K until it is in range.

    For integers there is no problem either. As with modular numbers you
    do all computations using a wider machine range. Only when you have to
    assign the result to a variable or pass it to a subprogram do you check
    that it is in range. That is all.

    It's not that simple. Say you have variables of 8 bits each, but your language says that all calculations will be done at 64 bits.

    No, it says that the result of an expression is OK if mathematically correct.

    So what happens when you calculate A << B >> C; do you do everything at
    64 bits and then only truncate to 8 bits when storing in some 8-bit destination?

    Shifts are not arithmetic operations, if you meant shifts. They are not
    defined on either modular numbers or integers. E.g. how do you shift a
    mod 7 number? You can only define a numeric equivalent of bit shift for
    2**N modular numbers.

    Or do you have to do that truncation after every binary operation?

    Yes, you implement bit shift as defined, where is the problem?

    In
    which case, you have to bring it back up to a valid 64-bit value for the
    next operation (eg. sign extend a possibly 13-bit result).

    What for?

    What about multiplying a 13-bit value by a 17-bit one; what should the intermediate result be truncated to?

    You cannot, it is a type error.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Thu Sep 2 12:14:23 2021
    On 02/09/2021 08:30, Dmitry A. Kazakov wrote:
    On 2021-09-01 22:50, Bart wrote:
    On 01/09/2021 20:54, Dmitry A. Kazakov wrote:
    On 2021-09-01 21:22, Bart wrote:

    [...]

    I feel I'm just scratching the surface here.

    Yep, when you ignore mathematics, it bites you in the butt...

    If you paid respect to it, you would notice that modular numbers never
    overflow. If the result is outside the range 0..K-1, you subtract (or
    add) K until it is in range.

    For integers there is no problem either. As with modular numbers you
    do all computations using a wider machine range. Only when you have
    to assign the result to a variable or pass it to a subprogram do you
    check that it is in range. That is all.

    It's not that simple. Say you have variables of 8 bits each, but your
    language says that all calculations will be done at 64 bits.

    No, it says that the result of an expression is OK if mathematically correct.

    So what happens when you calculate A << B >> C; do you do everything
    at 64 bits and then only truncate to 8 bits when storing in some 8-bit
    destination?

    Shifts are not arithmetic operations, if you meant shifts.

    If you don't like a<<b>>c, try a*b/c where "/" is integer divide.

    Or assume that a<<b means a*2**b and a>>b means a/2**b.

    Or take as the example any operation resulting in higher order bits that
    should be discarded, affecting results of later ones.

    They are not
    defined on either modular numbers or integers. E.g. how do you shift a
    mod 7 number? You can only define a numeric equivalent of bit shift for
    2**N modular numbers.

    Or do you have to do that truncation after every binary operation?

    Yes, you implement bit shift as defined, where is the problem?

    In which case, you have to bring it back up to a valid 64-bit value
    for the next operation (eg. sign extend a possibly 13-bit result).

    What for?

    You're really not interested in how real machines work, are you?

    If the real machine only has 64-bit operations, and you are working
    with 17-bit values, then you need something meaningful in those top 47 bits.

    If you are working with ranges that aren't an exact number of bits, eg.
    the range 0 to 9999, approx 14 bits, then you still need a full 64-bit
    value to be presented to the processor.

    This is all a lot of mucking about that is needed compared with using
    ranges of values that a processor can directly deal with, such as 0 to
    65535. Or compared with using an i64 range only.

    What about multiplying a 13-bit value by a 17-bit one; what should the
    intermediate result be truncated to?

    You cannot, it is a type error.

    Why? What actually is wrong here:

        record R =
            u64 dummy : (a:13, b:17)
        end
        R x := (u64.max,)      # all ones

        println x.a
        println x.b
        println x.a * x.b

    Output is:

        8191
        131071
        1073602561

    Or a simpler version not using stored types:

        u64 x := x.max

        println x.[0..12]
        println x.[13..29]
        println x.[0..12] * x.[13..29]

    Same output. Here the language extends all values to 64 bits as needed
    (bit fields are unsigned), and intermediate values stay at a full 64 bits.

    If stored into a narrower field, they get truncated. The language can
    choose to do a range check in that case; mine doesn't: it is assumed to
    be an explicit truncation.
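
    For comparison, a rough C equivalent (illustrative; bit-fields with a
    64-bit base type are a common compiler extension rather than something
    the C standard guarantees) behaves the same way, widening the fields
    before the multiply:

        #include <stdio.h>
        #include <stdint.h>

        struct R {
            uint64_t a : 13;    /* all-ones value is 8191 */
            uint64_t b : 17;    /* all-ones value is 131071 */
        };

        int main(void) {
            struct R x = { 0x1FFF, 0x1FFFF };
            uint64_t p = (uint64_t)x.a * x.b;   /* widened, not truncated */
            printf("%u %u %llu\n", (unsigned)x.a, (unsigned)x.b,
                   (unsigned long long)p);      /* 8191 131071 1073602561 */
            return 0;
        }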

    I was just asking what the detailed semantics would be if odd-sized
    variables are being used.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Sep 2 13:44:22 2021
    On 2021-09-02 13:14, Bart wrote:

    Or take as the example any operation resulting in higher order bits that should be discarded, affecting results of later ones.

    Again, the result of an expression is either mathematically correct or
    you get an exception. What is unclear here? Take the formula. Compute it
    on paper using the rules of mathematics. That is your answer.

    In which case, you have to bring it back up to a valid 64-bit value
    for the next operation (eg. sign extend a possibly 13-bit result).

    What for?

    You're really not interested in how real machines work, are you?

    So, you are interested in how Gigabyte PSUs burn? They burn brightly
    with a lot of noise. Quite entertaining. Next?

    If the real machine only has 64-bit operations, and you are working
    with 17-bit values, then you need something meaningful in those top 47
    bits.

    I do not know what a 17-bit value is, but there is absolutely no problem
    in implementing integer and modular arithmetic on any existing machine.

    If you are working with ranges that aren't an exact number of bits, eg.
    the range 0 to 9999, approx 14 bits, then you still need a full 64-bit
    value to be presented to the processor.

    Whatever machine number capable to represent all range.

    This is all a lot of mucking about that is needed compared with using
    ranges of values that a processor can directly deal with, such as 0 to
    65535. Or compared with using an i64 range only.

    Wrong. Constraints need to be checked only upon assignment and parameter
    passing. There is no such thing as a 17-bit value; there is a constraint
    put on a countably infinite set of integer values. Arithmetic
    operations and comparisons are defined on the whole range. No
    constraint-specific operations exist! Anything mathematically sound can
    be done on a wider range. Anything else is rubbish.

    What about multiplying a 13-bit value by a 17-bit one; what should
    the intermediate result be truncated to?

    You cannot, it is a type error.

    Why?

    Because they are two different types, obviously.

    All problems you describe are imaginary. Once the semantics is defined
    by attributing correct types, no ambiguity exists.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Thu Sep 2 13:33:56 2021
    On 01/09/2021 21:50, Bart wrote:

    ...

    It's not that simple. Say you have variables of 8 bits each, but your language says that all calculations will be done at 64 bits.

    So what happens when you calculate A << B >> C; do you do everything at
    64 bits and then only truncate to 8 bits when storing in some 8-bit destination?

    You need rules. Mine include that in the absence of a size override

    A << B

    produces a result of the size of the LHS. Ditto the right shift.


    Or do you have to do that truncation after every binary operation? In
    which case, you have to bring it back up to a valid 64-bit value for the
    next operation (eg. sign extend a possibly 13-bit result).

    Those decisions on how intermediate values are handled can give
    different results. And can make it slower.

    The language having rules should prevent different results.


    What about multiplying a 13-bit value by a 17-bit one; what should the intermediate result be truncated to?


    I have two schemes in mind for that and have not decided, yet, which to
    use. Under scheme 1 the rule is that the narrower value would
    automatically be widened to match the wider one before the multiplication.

    Under scheme 2 the numbers would be multiplied in their native forms. (I
    don't know off hand if that would produce a different result.)

    If the widths are N and M then under either scheme the /normal/ result
    would have width

    max(N, M)

    I don't want to complicate things but for completeness, I do allow for a
    result to have width

    N + M

    and that exceptional width of result would be specified in the syntax.
    The normal width would still be max(N, M) and so in your example the
    result would have 17 bits.

    Other rules would do. The above are just the ones I chose.
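
    (As a quick C check of that exceptional width - illustrative only - the
    widest 13-bit and 17-bit values multiply to a product that needs
    exactly 13 + 17 = 30 bits:

        #include <stdio.h>
        #include <stdint.h>

        int main(void) {
            uint32_t n = (1u << 13) - 1;    /* 8191, widest 13-bit value */
            uint32_t m = (1u << 17) - 1;    /* 131071, widest 17-bit value */
            uint64_t p = (uint64_t)n * m;   /* 1073602561, just under 2**30 */
            printf("%llu\n", (unsigned long long)p);
            return 0;
        }

    so a full product of N-bit and M-bit operands always fits in N + M bits.)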


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Thu Sep 2 14:14:02 2021
    On 02/09/2021 13:21, James Harris wrote:
    On 01/09/2021 20:22, Bart wrote:

    Maybe that's 'dynamic' precision. I see types like

      int 17
      int 81

    as having a width which is arbitrary but fixed.


    It's not necessarily hard, just less efficient,

    Well, compare the HLL programmer having to create routines to handle
    large or unusually sized integers with having the compiler do it.

    Surely the bottom line is that the compiler is best placed to produce
    the most efficient code, especially as it will have knowledge of both
    what's required and the target architecture.

    If I'm looking at struct layout and one of the elements has type ui65,
    then I would find that worrying. What's going on in that layout? If the programmer is specifying a type so exactly, then perhaps they are also concerned about the precise layout!

    Same thing with an array of ui63. Would a million-element array occupy 8,000,000 bytes or 7,875,000?

    I think if such types were specified with a value range, say 0 ..
    2**65-1 or 0..2**63-1 [I should have used smaller examples!], or ones
    not using powers of 2, that would suggest a programmer who doesn't care
    about bit-representation or exact layouts so much, compared with someone
    who writes ui65 or ui63.

    They might be put out if they end up with ui128/ui72, or ui64.

    I suppose what I'm saying is that when you specify an exact bit-width,
    you should get that. Which means you then have to implement that array
    with 63-bit elements.


    Again, with the sign already extended that should not be a problem.

    You need to be sure that, after the previous operation, the sign bit (on
    the msb of the machine word) and the intermediate bits match the sign
    bit of your narrower field.

    * If you want wraparound overflow, this needs to be specially
    programmed.

    I'm not sure what that means. If the number's limit is at a power of 2
    then what I think you might mean by wraparound should happen
    automatically.

    Only if you truncate. If using a u8 type within a u64 register, then 255
    + 1 will yield 256, not 0.

    For many subsequent ops, that doesn't matter. But if the next op shifts
    right by 1 bit, you'll end up with hex 80 (128) rather than 0.

    At least to start with I think I'd try to limit the number of categories
    but in a different way to you, e.g.

      * Integers of a size which the machine is built to deal with
      * Integers which are a whole number (2 or more) of such units
      * Integers which have odd bits

    The ones the machine can deal with - e.g. 8, 16, 32, 64 - would be
    handled by normal routines.

    The ones which needed 2 or more such units could be handled by long arithmetic.

    The others would additionally need tweaking for the extra bits.

    I think that's a reasonable approach - put all the tricky ones to one side!

    And with the middle category, once you've implemented that one, then you
    will probably want to streamline the ops that are only twice natural
    word size.

    Because, if doing a simple op like & (bitwise and) for example, you
    don't really want to run a loop over two elements either inline or in a
    called routine.

    (But maybe your implementation already has specialised branches for
    that, which can be recognised at compile-time.)
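
    As a minimal C sketch of that specialisation (illustrative): the
    twice-word-size case unrolls to two register-width operations, with no
    loop at all:

        #include <stdint.h>

        typedef struct { uint64_t lo, hi; } u128;

        /* 128-bit AND specialised at compile time: two 64-bit ANDs that a
           compiler can keep entirely in registers. */
        static u128 and128(u128 a, u128 b) {
            u128 r = { a.lo & b.lo, a.hi & b.hi };
            return r;
        }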

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Thu Sep 2 13:50:03 2021
    On 02/09/2021 12:44, Dmitry A. Kazakov wrote:
    On 2021-09-02 13:14, Bart wrote:

    Or take as the example any operation resulting in higher order bits
    that should be discarded, affecting results of later ones.

    Again, the result of an expression is either mathematically correct or
    you get an exception. What is unclear here? Take the formula. Compute it
    on paper using the rules of mathematics. That is your answer.

    In which case, you have to bring it back up to a valid 64-bit value
    for the next operation (eg. sign extend a possibly 13-bit result).

    What for?

    You're really not interested in how real machines work, are you?

    So, you are interested in how Gigabyte PSUs burn? They burn brightly
    with a lot of noise. Quite entertaining. Next?

    If the real machine only has 64-bit operations, and you are working
    with 17-bit values, then you need something meaningful in those top 47
    bits.

    I do not know what a 17-bit value is,

    Let's say it's a value in the range representable by a u17 type, that
    is 0 to 131071.

    If you're working with a conventional processor, then the first problem
    you have is that storage is defined in whole multiples of 8-bit bytes.

    The second is that load and store ops are in terms of 8, 16, 32, 64 or sometimes 128 bit units. An auxiliary one is that such ops may have
    alignment needs.

    The third is that the registers available for integer arithmetic are of
    8, 16, 32 and 64 bits.

    A fourth might be that arithmetic operations may not work directly on
    all of those (but on x64 they do).

    Notice how that 17-bit type really doesn't fit naturally into that model.

    So I don't have such sizes as 'first-class' types. Bitfields are
    generally dealt with by extracting them from a more normal type, or
    inserting them into one.

    but there is absolutely no problem
    in implementing integer and modular arithmetic on any existing machine.

    If you are working with ranges that aren't an exact number of bits,
    eg. the range 0 to 9999, approx 14 bits, then you still need a full
    64-bit value to be presented to the processor.

    Whatever machine number capable to represent all range.

    So, 16 bits for that range, or 64-bits if your language semantics say
    that all intermediate calculations must be that size. (Most C
    implementations use 32-bit calculations.)

    If you don't have such rules, then something like this becomes ambiguous:

    byte a := 255
    print a + 1

    Does this display 0, 256, or report a runtime error?


    Why?

    Because they are two different types, obviously.

    All problems you describe are imaginary. Once the semantics is defined
    by attributing correct types, no ambiguity exists.

    I think you're talking nonsense.

    Most people can add a 3-digit decimal number like 721 to a 4-digit one
    like 9485 to yield the 5-digit 10206, without being concerned about them
    being different types!

    They /could/ represent different quantities, but here they are pure numbers.

    Numbers representing different quantities could well have different
    types, which a language could help out with (eg. to stop you adding £721
    and 9485 grams), but in this case, they are simply width-restricted.

    721 could be 3 digits because it's not expected to store more than 999,
    so it saves storage.

    But the language has to say so. James' proposals are not clear on that
    matter.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Thu Sep 2 13:21:59 2021
    On 01/09/2021 20:22, Bart wrote:
    On 01/09/2021 18:13, James Harris wrote:
    On 31/08/2021 19:37, Bart wrote:
    On 31/08/2021 19:03, James Harris wrote:

    ...

       ui16 value[4];

    ...

    And note that it's 4, not 2. Theoretically, the code would work for
    ANY size of integer as long as it fits into a whole number of words.
    Add some overflow handling and signed integers (which would not be so
    easy) and I may have the arbitrary-length arithmetic you keep telling
    me is so difficult to do!

    Arbitrary precision is different from fixed precision with dedicated
    routines or inline code for each width of number.

    Maybe that's 'dynamic' precision. I see types like

    int 17
    int 81

    as having a width which is arbitrary but fixed.


    It's not necessarily hard, just less efficient,

    Well, compare the HLL programmer having to create routines to handle
    large or unusually sized integers with having the compiler do it.

    Surely the bottom line is that the compiler is best placed to produce
    the most efficient code, especially as it will have knowledge of both
    what's required and the target architecture.

    ...


    Remember support will be hard enough without dealing with odd sizes.

    Why would support be hard?

    In terms of storage I expect I'd round up the size at least to a whole
    number of words and treat incursion into upper bits in the same way as
    overflow.


    So how would it work even with a simple example of 53 bits? You round up
    the storage to 64 bits, so it might as well be i64. Arithmetic would
    likely also be done using the machine's 64-bit operations.

    But you now also need:

    * To truncate the results to 53 bits, or check that the top 11 bits
    don't indicate overflow (as the processor flags won't tell you)

    Thinking as I write, instead of 53 say we had 6-bit numbers and our CPU
    could only process 4 bits at a time. Signed ints would be of the
    following form.

    10_0000 = -32
    11_1111 = -1
    00_0000 = 0
    01_1111 = 31

    But in storage or in registers they would have two more bits on the
    left. It might make arithmetic simpler and faster if those bits were the
    same as the sign so the numbers would be

    1110_0000 = -32
    1111_1111 = -1
    0000_0000 = 0
    0001_1111 = 31

    With that format wouldn't additions and subtractions which overflowed
    the designated 6-bit width of the integer also trip the CPU's overflow
    flag?


    * When loading signed 53 bits, presumably as two's complement, your sign
    will be in bit 52, and bits 53 to 63 need to be properly sign-extended
    if that pattern is not already stored in memory.

    As above, that may be best.


    * To take extra care with right shifts: you want copies of bit 52
    shifted to bit 51 etc, but the machine's SAR instruction will assume the
    sign bit is bit 63.

    Again, with the sign already extended that should not be a problem.


    * If you want wraparound overflow, this needs to be specially
    programmed.

    I'm not sure what that means. If the number's limit is at a power of 2
    then what I think you might mean by wraparound should happen
    automatically.


    I feel I'm just scratching the surface here. I have enough trouble just
    with the following 3 categories:

      * Short (8/16/32 bits)
      * Normal (64 bits)
      * Wide (128 bits)

    At least to start with I think I'd try to limit the number of categories
    but in a different way to you, e.g.

    * Integers of a size which the machine is built to deal with
    * Integers which are a whole number (2 or more) of such units
    * Integers which have odd bits

    The ones the machine can deal with - e.g. 8, 16, 32, 64 - would be
    handled by normal routines.

    The ones which needed 2 or more such units could be handled by long
    arithmetic.

    The others would additionally need tweaking for the extra bits.


    With signed/unsigned for each type as another factor.

    Most arithmetic will be done on Normal/Wide (64 or 128 bits); operations
    are only done on Short types for in-place arithmetic: A[i]+:=x where A
    is a byte array, for example.

    Imagine if I also had to deal with 9-15, 17-31, 33-62, and 65-127 bits!

    With the three categories above any size of integer should be usable -
    21-bit, 36-bit, 512-bit, etc.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Thu Sep 2 14:44:26 2021
    On 02/09/2021 14:14, Bart wrote:
    On 02/09/2021 13:21, James Harris wrote:
    On 01/09/2021 20:22, Bart wrote:

    ...

    Surely the bottom line is that the compiler is best placed to produce
    the most efficient code, especially as it will have knowledge of both
    what's required and the target architecture.

    If I'm looking at struct layout and one of the elements has type ui65,
    then I would find that worrying. What's going on in that layout? If the programmer is specifying a type so exactly, then perhaps they are also concerned about the precise layout!

    Same thing with an array of ui63. Would a million-element array occupy 8,000,000 bytes or 7,875,000?

    I think if such types were specified with a value range, say 0 ..
    2**65-1 or 0..2**63-1 [I should have used smaller examples!], or ones
    not using powers of 2, that would suggest a programmer who doesn't care
    about bit-representation or exact layouts so much, compared with someone
    who writes ui65 or ui63.

    They might be put out if they end up with ui128/ui72, or ui64.

    I suppose what I'm saying is that when you specify an exact bit-width,
    you should get that. Which means you then have to implement that array
    with 63-bit elements.

    I'd want ui63 and ui65 to /behave/ as though they had the specified
    number of bits. I wouldn't be too bothered about how they were stored.
    That could be left to the optimiser. If it was optimising for speed it
    would probably align them in cells which were at least a whole number of machine words. If optimising for data space, by contrast, it might abut
    them. But the behaviour would be the same in either case.

    Your point about struct layouts is a good one. For some time I've been
    toying with the idea that the concept of a struct should be broken into
    two:

    1. a 'bag' of fields - stored in any order and with any padding

    2. a layout of storage in exact order and with padding explicit

    I am getting less and less sure that one concept should do both.



    Again, with the sign already extended that should not be a problem.

    You need to be sure that, after the previous operation, the sign bit (on
    the msb of the machine word) and the intermediate bits match the sign
    bit of your narrower field.

    For signed, yes. Just thinking about how to implement that, one option is

        if sign bit is 1
            OR the top bits
        else
            AND NOT the top bits

    but that involves a branch. Better on some CPUs may be

        shift left to move sign bit to MSB
        signed shift right the same number of bits

    Two fast instructions is not bad, and such an operation could be omitted
    where the compiler could be sure that the bits were correct or they
    didn't matter.
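
    Both options as a rough C sketch (illustrative; a 53-bit field held in
    a 64-bit word, and assuming the usual arithmetic right shift on signed
    types):

        #include <stdint.h>

        #define N 53    /* field width; the sign bit is bit N-1 */

        /* Option 1: test the sign bit and patch the top bits (a branch). */
        static uint64_t sext_branch(uint64_t x) {
            uint64_t top = ~0ULL << N;
            if (x & (1ULL << (N - 1)))
                return x | top;     /* negative: set the top bits */
            else
                return x & ~top;    /* non-negative: clear the top bits */
        }

        /* Option 2: shift the sign bit up to the MSB, then arithmetic
           shift right by the same amount - two instructions, no branch. */
        static uint64_t sext_shift(uint64_t x) {
            return (uint64_t)((int64_t)(x << (64 - N)) >> (64 - N));
        }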


    * If you want wraparound overflow, this needs to be specially
    programmed.

    I'm not sure what that means. If the number's limit is at a power of 2
    then what I think you might mean by wraparound should happen
    automatically.

    Only if you truncate. If using a u8 type within a u64 register, then 255
    + 1 will yield 256, not 0.

    With my rules, if a program had

        uint 8 a = 255
        uint 9 b = 255

    then

        a + 1 would be 0
        b + 1 would be 256

    though I am undecided, as yet, as to whether the first should trigger
    an 'unsigned overflow' exception or not. (That may be best decided by
    the programmer with some suitable declaration or syntax.)


    For many subsequent ops, that doesn't matter. But if the next op shifts
    right by 1 bit, you'll end up with hex 80 (128) rather than 0.

    Indeed. In the example,

        (a + 1) >> 1 would be 0
        (b + 1) >> 1 would be hex 80


    At least to start with I think I'd try to limit the number of
    categories but in a different way to you, e.g.

       * Integers of a size which the machine is built to deal with
       * Integers which are a whole number (2 or more) of such units
       * Integers which have odd bits

    The ones the machine can deal with - e.g. 8, 16, 32, 64 - would be
    handled by normal routines.

    The ones which needed 2 or more such units could be handled by long
    arithmetic.

    The others would additionally need tweaking for the extra bits.

    I think that's a reasonable approach - put all the tricky ones to one side!

    Yes, all the tricky ones should be handled by a single set of routines.
    So when you asked about 9..15, 17..31 etc one set of routines should
    deal with them all.


    And with the middle category, once you've implemented that one, then you
    will probably want to streamline the ops that are only twice natural
    word size.

    Because, if doing a simple op like & (bitwise and) for example, you
    don't really want to run a loop over two elements either inline or in a called routine.

    (But maybe your implementation already has specialised branches for
    that, which can be recognised at compile-time.)


    No, on the contrary. I am not at that point, yet, but when I get there I suspect I may generate the required operations and let the optimiser
    decide how to apply them.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Sep 2 16:52:11 2021
    On 2021-09-02 14:50, Bart wrote:
    On 02/09/2021 12:44, Dmitry A. Kazakov wrote:

    I do not know what a 17-bit value is,

    Let's say it's a value in the range representable by a u17 type, that
    is 0 to 131071.

    If you're working with a conventional processor, then the first problem
    you have is that storage is defined in whole multiples of 8-bit bytes.

    Why is that a problem? Define u17.

    The second is that load and store ops are in terms of 8, 16, 32, 64 or sometimes 128 bit units. An auxiliary one is that such ops may have
    alignment needs.

    Why is that a problem again?

    The third is that the registers available for integer arithmetic are of
    8, 16, 32 and 64 bits.

    Ditto.

    A fourth might be that arithmetic operations may not work directly on
    all of those (but on x64 they do).

    And?

    Notice how that 17-bit type really doesn't fit naturally into that model.

    What model? Again, what is the problem with implementing a mod 2**17
    type, if u17 means that?

    So I don't have such sizes as 'first-class' types. Bitfields are
    generally dealt with by extracting them from a more normal type, or
    inserting them into one.

    Bit fields have nothing to do with any integer types.

    Whatever machine number capable to represent all range.

    So, 16 bits for that range, or 64-bits if your language semantics say
    that all intermediate calculations must be that size.

    Nope, it says, I repeat: the result must be either mathematically
    correct or raise an exception.

    (Most C implementations use 32-bit calculations.)

    I don't care what C does.

    If you don't have such rules, then something like this becomes ambiguous:

        byte a := 255
        print a + 1

    Does this display 0, 256, or report a runtime error?

    I don't know what byte is here. You must annotate the exact semantics.
    Provided print takes byte as the argument, then

    1. If byte is an integer range -256..255, the result is an exception.

    2. If byte is mod 256, the result is 0

    3. If byte is a memory unit octet, the result is a type error.

    All problems you describe are imaginary. Once the semantics is defined
    by attributing correct types, no ambiguity exists.

    I think you're talking nonsense.

    Most people can add a 3-digit decimal number like 721 to a 4-digit one
    like 9485 to yield the 5-digit 10206, without being concerned about them
    being different types!

    I do not know what "3-digit" means when you attach it to "decimal
    number." To me it means an integer type with the range of values
    -10**3+1 .. 10**3-1 and a decimal (packed?) machine representation. That
    is definitely different from a 4-digit decimal number. So the type error
    goes if you mix them.

    They /could/ represent different quantities, but here they are pure
    numbers.

    Pure numbers exist only in books on mathematics.

    Numbers representing different quantities could well have different
    types, which a language could help out with (eg. to stop you adding £721
    and 9485 grams), but in this case, they are simply width-restricted.

    I have no idea what "width-restricted" means. Define types. Annotate all
    values with types, then we can proceed.

    721 could be 3 digits because it's not expected to store more than 999,
    so it saves storage.

    721 is a number. Its decimal representation has 3 digits. The rest is
    rubbish.

    But the language has to say so. James' proposals are not clear on that matter.

    Has to say what? Again, describe the type. The type representation in
    the storage is a part of type implementation, unless explicitly stated
    by the programmer like with decimal numbers.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Thu Sep 2 17:01:04 2021
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:
    On 02/09/2021 12:44, Dmitry A. Kazakov wrote:

    I do not know what a 17-bit value is,

    Let's say it's a value in the range representable by a u17 type, that
    is 0 to 131071.

    If you're working with a conventional processor, then the first
    problem you have is that storage is defined in whole multiples of
    8-bit bytes.

    Why is that a problem? Define u17.

    The second is that load and store ops are in terms of 8, 16, 32, 64 or
    sometimes 128 bit units. An auxiliary one is that such ops may have
    alignment needs.

    Why is that a problem again?

    The third is that the registers available for integer arithmetic are
    of 8, 16, 32 and 64 bits.

    Ditto.

    A fourth might be that arithmetic operations may not work directly on
    all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can use a
    u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code then for this fragment:

    u17 a,b,c
    a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    Notice how that 17-bit type really doesn't fit naturally into that model.

    What model? Again, what is the problem with implementing a mod 2**17
    type, if u17 means that?

    So I don't have such sizes as 'first-class' types. Bitfields are
    generally dealt with by extracting them from a more normal type, or
    inserting them into one.

    Bit fields have nothing to do with any integer types.

    It sounds like they have nothing to do with anything, according to you.

    If you don't have such rules, then something like this becomes ambiguous:

        byte a := 255
        print a + 1

    Does this display 0, 256, or report a runtime error?

    I don't know what byte is here. You must annotate the exact semantics.

    Provided print takes byte as the argument, then

    1. If byte is an integer range -256..255, the result is an exception.

    2. If byte is mod 256, the result is 0

    3. If byte is a memory unit octet, the result is a type error.

    In my language, the printed value is 256. Here's why:

    * 'byte' is a u8 storage type occupying one byte of storage that
    can store values from 0 to 255

    * When extracted from memory, it is widened to u64 (64 bits)

    * The '1' is a literal that normally has type i64, but a literal
    combined with an unsigned operand will assume type u64

    * The "+" will add the u64 values 255 and 1 to get a u64 result 256

    Those are /my/ rules, which are probably the least surprising.
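
    Expressed in C, those rules amount to roughly this (illustrative):

        #include <stdio.h>
        #include <stdint.h>

        int main(void) {
            uint8_t a = 255;    /* 'byte': one byte of storage, 0..255 */

            /* Widen to 64 bits on load; the literal joins in as u64, so
               the add happens at full width. */
            uint64_t r = (uint64_t)a + 1;

            printf("%llu\n", (unsigned long long)r);  /* 256, not 0 */
            return 0;
        }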

    Perhaps what's confusing you (assuming you're not being deliberately
    contradictory) is my use of 'type' to distinguish integers with
    different storage widths.

    If I use u8, u16 or u32 instead of the natural u64, it is purely to save
    space and increase efficiency, if I know the stored values will fit into
    that space.


    All problems you describe are imaginary. Once the semantics is
    defined by attributing correct types, no ambiguity exists.

    I think you're talking nonsense.

    Most people can add a 3-digit decimal number like 721 to a 4-digit one
    like 9485 to yield the 5-digit 10206, without being concerned about
    them being different types!

    I do not know what "3-digit" means when you attach it to "decimal
    number."


    It's a number with 3 digits, in a context where you can't have more
    than 3 digits. Imagine you're filling in a paper form, or an entry
    field, which limits how many digits can be entered.

    Clearly you can perform arithmetic on the values in such fields, but you
    can't record a result in a field which is too small.

    Pure numbers exist only in books on mathematics.

    Numbers representing different quantities could well have different
    types, which a language could help out with (eg. to stop you adding
    £721 and 9485 grams), but in this case, they are simply width-restricted.

    I have no idea what "width-restricted" means. Define types. Annotate all values with types,

    No, width here does not define a new type as you like to think of types.

    But I'm tired of arguing.

    Perhaps go and implement some actual LOWER-LEVEL language on an actual
    processor, then we can talk again. Perhaps you will have an idea of
    what a bitfield is for by then.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Thu Sep 2 17:47:02 2021
    On 02/09/2021 17:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:

    ...

    A fourth might be that arithmetic operations may not work directly on
    all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can use a
    u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code then for this fragment:

       u17 a,b,c
       a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    What is supposed to happen on overflow and are there any particular optimisation goals?



    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Sep 2 19:02:28 2021
    On 2021-09-02 18:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:
    On 02/09/2021 12:44, Dmitry A. Kazakov wrote:

    I do not know what a 17-bit value is,

    Let's say it's a value in the range representable by a u17 type,
    that is 0 to 131071.

    If you're working with a conventional processor, then the first
    problem you have is that storage is defined in whole multiples of
    8-bit bytes.

    Why is that a problem? Define u17.

    The second is that load and store ops are in terms of 8, 16, 32, 64
    or sometimes 128 bit units. An auxiliary one is that such ops may
    have alignment needs.

    Why is that a problem again?

    The third is that the registers available for integer arithmetic are
    of 8, 16, 32 and 64 bits.

    Ditto.

    A fourth might be that arithmetic operations may not work directly on
    all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can use a
    u17 type just as easily as a 16-bit or 32-bit type.

    Right.

    Perhaps you'd like to show some actual assembly code then for this fragment:

       u17 a,b,c
       a := b + c

    You do not know how to load registers, make addition, comparison,
    subtraction?

    I'd be particularly interested in how a,b,c are laid out in memory.

    The way the compiler chooses, since there are no representation
    constraints. If the programmer wanted a certain representation the
    language could provide means to specify it. Ada does.

    Bit fields have nothing to do with any integer types.

    It sounds like they have nothing to do with anything, according to you.

    If you hear that everything is integer then, yes, NOT everything is nothing.

    If you don't have such rules, then something like this becomes
    ambiguous:

        byte a := 255
        print a + 1

    Does this display 0, 256, or report a runtime error?

    I don't know what byte is here. You must annotate the exact semantics.

    Provided print takes byte as the argument, then

    1. If byte is an integer range -256..255, the result is an exception.

    2. If byte is mod 256, the result is 0

    3. If byte is a memory unit octet, the result is a type error.

    In my language, the printed value is 256.

    I have no doubt that your language is broken.

    I do not know what "3-digit" means when you attach it to "decimal
    number."

    It's a number with 3 digits,

    Numbers have no digits. Their representations in some numeral system
    may have.

    https://en.wikipedia.org/wiki/Numeral_system

    I have no idea what "width-restricted" means. Define types. Annotate
    all values with types,

    No, width here does not define a new type as you like to think of types.

    Then what are you talking about?

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Thu Sep 2 19:12:38 2021
    On 2021-09-02 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:

    ...

    A fourth might be that arithmetic operations may not work directly
    on all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can use
    a u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code then for this fragment:

        u17 a,b,c
        a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Why do you want it?

    What is supposed to happen on overflow

    What overflow? It is a modular number, and modular numbers never
    overflow. It is up to you to implement the arithmetic correctly using
    appropriate instructions.

    and are there any particular
    optimisation goals?

    When doing modular arithmetic you must minimize checks by proving that
    the intermediates are correct regardless of the arguments. Say you
    decided to implement the arithmetic using 32-bit machine numbers. Then
    with b+c you have nothing to worry about. You load b and c into 32-bit
    registers and sum them. Then you check whether the result is greater
    than 2**17-1; if yes, you subtract 2**17. Difficult?
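
    That recipe as a minimal C sketch (illustrative):

        #include <stdint.h>

        #define K (1UL << 17)    /* the modulus, 2**17 */

        /* Modular add: sum in a wider register, then reduce. Since
           b + c < 2**18, a single conditional subtract is enough. */
        static uint32_t add_u17(uint32_t b, uint32_t c) {
            uint32_t t = b + c;     /* cannot overflow 32 bits */
            if (t >= K) t -= K;     /* bring it back into 0..K-1 */
            return t;
        }

    For a power-of-two modulus like this one, the conditional subtract is
    equivalent to masking with K-1.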

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Thu Sep 2 20:56:27 2021
    On 02/09/2021 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:

    ...

    A fourth might be that arithmetic operations may not work directly
    on all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can use
    a u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code then for this fragment:

        u17 a,b,c
        a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Presumably it is roughly the same as you'd get in C with :

    uint32_t a, b, c;

    a = (b + c) % (1u << 17);

    Types in a high level language are not constructs in assembly. They
    don't have to correspond to matching hardware or assembly-level
    features. Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as the
    processor might store a number, so a "u17" type (assuming that means a 0
    .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer. That's likely to be the same storage as a uint32_t. (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    It is the language semantics that define the types and their
    operations. Then it is up to the compiler - not the high level
    programmer - to figure out how to turn that into efficient assembly.


    What is supposed to happen on overflow and are there any particular optimisation goals?


    There are no overflows on a modular type. (That is why Dmitry is keen
    on being precise in his definitions of types - something I approve of.)
    Optimisation goals are the same as always - the compiler should
    generate object code that gives the observable effects required by the
    source language, assuming that the program follows whatever rules the
    language expects.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Thu Sep 2 21:26:50 2021
    On 02/09/2021 19:56, David Brown wrote:
    On 02/09/2021 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:

    ...

    A fourth might be that arithmetic operations may not work directly
    on all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can use
    a u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code then for this fragment:
        u17 a,b,c
        a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Presumably it is roughly the same as you'd get in C with :

    uint32_t a, b, c;

    a = (b + c) % (1u << 17);

    Types in a high level language are not constructs in assembly.

    With the 17-bit example there were decisions to be made regarding how
    and where standalone 17-bit variables are stored in memory; whether each
    starts on a byte boundary or in the middle, and how exactly they will be
    loaded into registers.

    Intermediate values will need to be looked at too. In my example, for a
    single "+" it probably doesn't matter whether the calculation is done
    at 32 or 64 bits; but in some expressions it will. It also depends on
    the proposed semantics.

    My point had been that the resulting code would be very different
    compared with something like a 16-bit type which is an easy match for
    the hardware.

    This will affect the design of a language that may include such odd types.

    (My view is that standalone types of odd bit-widths are not worth adding
    as independent types. But I support bitfields within a containing type
    of a regular 8/16/32/64-bit width.

    There, loading and storing those fields is simpler. And intermediate calculations are done just like the regular types.

    I think James' view is different...)


    They
    don't have to correspond to matching hardware or assembly-level
    features. Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as a the
    processor might store a number, so a "u17" type (assuming that means a 0
    .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer. That's likely to be the same storage as a uint32_t. (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    What about a uint21 type then? There will always be odd types that don't
    match! Unless a processor has a completely flexible ALU with any
    bitwidth supported.

    It is the language semantics that define the types and their
    operations. Then it is up to the compiler - not the high level
    programmer - to figure out how to turn that into efficient assembly.


    What is supposed to happen on overflow and are there any particular
    optimisation goals?


    There are no overflows on a modular type.

    /He/ introduced modular types. I call them merely unsigned types which
    /can/ overflow, and usually do so by wrapping.

    In any case, I understood modular types could have any range at all, for example with values from 50 to 100 inclusive. Then, ensuring that
    overflowing the limit of 100 wraps back to 50 can be fiddly without
    hardware assistance.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Thu Sep 2 22:45:15 2021
    On 02/09/2021 21:26, Bart wrote:
    On 02/09/2021 19:56, David Brown wrote:
    On 02/09/2021 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:

    ...

    Perhaps you'd like to show some actual assembly code then for this
    fragment:

         u17 a,b,c
         a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Presumably it is roughly the same as you'd get in C with :

        uint32_t a, b, c;

        a = (b + c) % (1u << 17);

    Types in a high level language are not constructs in assembly.

    With the 17-bit example there were decisions to be made regarding how
    and where standalone 17-bit variables are stored in memory; whether each starts on a byte boundary or in the middle, and how exactly they will be loaded into registers.

    If there are no particular optimisation goals and you want addition to
    wrap then here's a simple answer to your query about assembly code. It
    has just one extra instruction compared with the similarly unoptimised
    code for 32-bit.

        mov eax, [b]
        add eax, [c]
        and eax, 0x0001_FFFF
        mov [a], eax

        section .data
        a: dd 0
        b: dd 0
        c: dd 0

    But that instruction could be omitted; see below about range declarations.

    ...

    They
    don't have to correspond to matching hardware or assembly-level
    features.  Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as the
    processor might store a number, so a "u17" type (assuming that means a 0
    .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer.  That's likely to be the same storage as a uint32_t.  (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    What about a uint21 type then? There will always be odd types that don't
    match! Unless a processor has a completely flexible ALU with any
    bitwidth supported.

    I am not sure if you are thinking about it in this way but AISI the
    reason for choosing specific widths is to get repeatable results.
    Compare these declarations

        uint 17 a, b, c
        uint 17.. d, e, f

    The ".." means "or above" so on a 32-bit CPU you could imagine the
    declarations as becoming

        uint 17 a, b, c
        uint 32 d, e, f

    As such,

        a = b + c

    could lead to the assembly code, above, whereas

        d = e + f

    could lead to the same code but without the AND instruction because the
    variables themselves would be 32-bit (the register width), not 17-bit.
    (The range 17.. allows any number of bits from 17 upward, and you could
    imagine that 32 would be chosen as it's the target CPU's register width.)

    IOW the programmer would have the choice of saying: "these variables can
    be any width from X upwards" or "these variables must be exact width W".
    Each option has pros and cons but IMO it's right to let the programmer
    choose what he wants for the job in hand.
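
    As a hedged C sketch of how the two declarations might lower on a
    32-bit target (the function names are invented for illustration; this
    is not any real compiler's output):

    #include <stdint.h>

    /* "uint 17": exact width, so the result must be re-wrapped mod 2^17 */
    uint32_t add_u17(uint32_t b, uint32_t c) {
        return (b + c) & 0x1FFFF;
    }

    /* "uint 17..": at-least-17 bits, widened to the 32-bit register
       width, so no masking instruction is needed */
    uint32_t add_u17_up(uint32_t b, uint32_t c) {
        return b + c;
    }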


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Thu Sep 2 23:44:00 2021
    On 2021-09-02 22:26, Bart wrote:

    With the 17-bit example there were decisions to be made regarding how
    and where standalone 17-bit variables are stored in memory;

    You are confusing storing values in memory with implementation of
    operations on them. You can store the value packed and use 1024-bit
    arithmetic to implement it. These are semantically independent choices.
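
    For instance, here is a rough C illustration of that separation (my
    sketch, not Ada or anyone's compiler output): a 17-bit value stored
    packed at bit 1 of a 32-bit word, but operated on in ordinary 32-bit
    registers:

    #include <stdint.h>

    #define MASK17 0x1FFFFu

    /* Increment a 17-bit field packed at bit 1 of a 32-bit word.
       Storage layout (packed) and operation width (32-bit) are
       independent choices. */
    uint32_t inc_packed17(uint32_t word) {
        uint32_t v = (word >> 1) & MASK17;          /* unpack */
        v = (v + 1) & MASK17;                       /* 32-bit add, wrap mod 2^17 */
        return (word & ~(MASK17 << 1)) | (v << 1);  /* repack */
    }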

    whether each
    starts on a byte boundary or in the middle, and how exactly they will be loaded into registers.

    The best way possible for the target machine unless the programmer tells otherwise.

    The following is a legal Ada program:

       with Ada.Text_IO;
       procedure Test is
          type u17_1 is mod 2**17;                 -- Use defaults
          type u17_2 is mod 2**17 with Size => 17; -- 17-bit, if matters
          type Packed is record
             H : Boolean;
             I : u17_2; -- Here it matters, because the record is packed
             J : Boolean;
          end record with Pack => True;
          X : u17_1 := 123;
          Y : u17_2 := 123;
          Z : Packed := (False, 123, True);
       begin
          Ada.Text_IO.Put_Line ("X'Size=" & Integer'Image (X'Size));
          Ada.Text_IO.Put_Line ("Y'Size=" & Integer'Image (Y'Size));
          Ada.Text_IO.Put_Line ("Z.I'Size=" & Integer'Image (Z.I'Size));
       end Test;

    It will print

       32
       32
       17

    The compiler uses the best possible machine representation of 32 bits,
    except in the case where I instructed it to use exactly 17 bits.

    My point had been that the resulting code would be very different
    compared with something like a 16-bit type which is an easy match for
    the hardware.

    Why should anybody care?

    This will affect the design of a language that may include such odd types.

    Why should it affect anything?

    They
    don't have to correspond to matching hardware or assembly-level
    features.  Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as the
    processor might store a number, so a "u17" type (assuming that means a 0
    .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer.  That's likely to be the same storage as a uint32_t.  (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    What about a uint21 type then? There will always be odd types that
    don't match! Unless a processor has a completely flexible ALU with any
    bit width supported.

    Did you read what David wrote? It starts with: "They don't have to
    correspond to matching hardware or assembly-level features."

    What is supposed to happen on overflow and are there any particular
    optimisation goals?

    There are no overflows on a modular type.

    /He/ introduced modular types. I call them merely unsigned types which
    /can/ overflow, and usually do so by wrapping.

    https://mathworld.wolfram.com/ModularArithmetic.html
    https://en.wikipedia.org/wiki/Modular_arithmetic

    In any case, I understood modular types could have any range at all, for example with values from 50 to 100 inclusive. Then, ensuring that
    overflowing the limit of 100 wraps back to 50 can be fiddly without
    hardware assistance.

    Modulo 101 wraps to zero.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Thu Sep 2 23:08:08 2021
    On 02/09/2021 18:02, Dmitry A. Kazakov wrote:
    On 2021-09-02 18:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:

    And?

    OK. So according to you these types are no problem at all. You can use
    a u17 type just as easily as a 16-bit or 32-bit type.

    Right.

    OK.... But since you haven't shown any real code for a real processor
    based on some detailed proposal of the semantics, I don't believe you.


    If you don't have such rules, then something like this becomes
    ambiguous:

        byte a := 255
        print a + 1

    Does this display 0, 256, or report a runtime error?

    I don't know what byte is here. You must annotate the exact semantics.

    Provided print takes byte as the argument, then

    1. If byte is an integer range -256..255, the result is an exception.

    2. If byte is mod 256, the result is 0.

    3. If byte is a memory unit octet, the result is a type error.

    In my language, the printed value is 256.

    I have no doubt that your language is broken.

    Well I make the rules for it so how can it be broken?

    It uses a primary integer type of i64 so adding 255 and 1 will give you 256.

    Storing an i64 value into a variable which is narrower than needed will truncate it. Loading it from a narrower variable will widen it.
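
    A small C analogue of those rules (C names, not Bart's language;
    assuming the usual fixed-width types):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int64_t wide = 255 + 1;          /* arithmetic done at 64 bits: 256 */
        uint8_t narrow = (uint8_t)wide;  /* storing truncates: 0 */
        int64_t back = narrow;           /* loading widens again: 0 */
        printf("%lld %d %lld\n", (long long)wide, narrow, (long long)back);
        return 0;
    }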


    I do not know what "3-digit" means when you attach it to "decimal
    number."

    It's a number with 3 digits,

    Numbers have no digits.

    Mine do. By 'decimal' I meant a number represented as a sequence of
    digits such as most of the world is familiar with. You see them
    everywhere and many /are/ constrained by width for myriad practical
    reasons. It doesn't stop you doing arithmetic with those values.

    For example, the trip counter on my car might be limited to 3 digits; if
    I reset it daily, nothing stops me summing each day's totals.

    But ... why am I forced to explain this stuff, what are you, 7?

    Since I have to /implement/ languages, and have been doing so for
    decades, I continuously have to make practical decisions on aspects of
    design.

    I guess you don't like the decisions I made; I guess I wouldn't like
    your idea of what a language should be either!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Thu Sep 2 23:48:43 2021
    On 02/09/2021 22:44, Dmitry A. Kazakov wrote:
    On 2021-09-02 22:26, Bart wrote:

    With the 17-bit example there were decisions to be made regarding how
    and where standalone 17-bit variables are stored in memory;

    You are confusing storing values in memory with implementation of
    operations on them. You can store the value packed and use 1024-bit arithmetic to implement it. These are semantically independent choices.

    whether each starts on a byte boundary or in the middle, and how
    exactly they will be loaded into registers.

    The best way possible for the target machine unless the programmer tells otherwise.

    The following is a legal Ada program:

       with Ada.Text_IO;
       procedure Test is
          type u17_1 is mod 2**17;                 -- Use defaults
          type u17_2 is mod 2**17 with Size => 17; -- 17-bit, if matters
          type Packed is record
             H : Boolean;
             I : u17_2; -- Here it matters, because the record is packed
             J : Boolean;
          end record with Pack => True;
          X : u17_1 := 123;
          Y : u17_2 := 123;
          Z : Packed := (False, 123, True);
       begin
          Ada.Text_IO.Put_Line ("X'Size=" & Integer'Image (X'Size));
          Ada.Text_IO.Put_Line ("Y'Size=" & Integer'Image (Y'Size));
          Ada.Text_IO.Put_Line ("Z.I'Size=" & Integer'Image (Z.I'Size));
       end Test;

    It will print

       32
       32
       17

    The compiler uses the best possible machine representation of 32 bits,
    except in the case where I instructed it to use exactly 17 bits.

    It's good that Ada has that much support at the bit level.

    However I have to write my own compilers, which tend not to be optimised
    so constructs that are naturally efficient are favoured. I can't spend
    tens of man-years on such a project (which I wouldn't manage anyway).

    So I stand by my decision not to support down to the bit level within my
    type system**. I instead provide bit-related features at the operations
    level, which will do most things needed and will do them transparently.

    (Except in my dynamic language which has 1, 2 and 4-bit types for use in
    arrays only and as pointer targets.)


    My point had been that the resulting code would be very different
    compared with something like a 16-bit type which is an easy match for
    the hardware.

    Why should anybody care?

    This will affect the design of a language that may include such odd
    types.

    Why should it affect anything?

    Well, do you want to spend months implementing it, or years? If you want
    to complete the project, you might need to trim some features!


    They
    don't have to correspond to matching hardware or assembly-level
    features.  Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as the
    processor might store a number, so a "u17" type (assuming that means
    a 0 .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer.  That's likely to be the same storage as a uint32_t.  (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    What about a uint21 type then? There will always be odd types that
    don't match! Unless a processor has a completely flexible ALU with any
    bit width supported.

    Did you read what David wrote? It starts with: "They don't have to
    correspond to matching hardware or assembly-level features."

    He suggested that a 20-bit register would be a good match for my 17-bit example. So I changed my example to 21 bits.

    What is supposed to happen on overflow and are there any particular
    optimisation goals?

    There are no overflows on a modular type.

    /He/ introduced modular types. I call them merely unsigned types which
    /can/ overflow, and usually do so by wrapping.

    https://mathworld.wolfram.com/ModularArithmetic.html
    https://en.wikipedia.org/wiki/Modular_arithmetic

    In any case, I understood modular types could have any range at all,
    for example with values from 50 to 100 inclusive. Then, ensuring that
    overflowing the limit of 100 wraps back to 50 can be fiddly without
    hardware assistance.

    Modulo 101 wraps to zero.

    OK. But my type was specifically 50 to 100. But let's change it to 10050
    to 10100, and I want 10100+1 to wrap to 10050. Now you additionally have
    to consider whether such a type can be represented within 8 bits (or
    possibly 6), and whether it will need 16 bits.

    My decision? Just use u16! And let user-code take care of any wrapping,
    like it has to for most things.
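
    In C terms, the user-code wrapping is one comparison (a sketch with an
    invented helper name, assuming the 10050..10100 range above):

    #include <stdint.h>

    /* The 51-value range 10050..10100 lives in an ordinary uint16_t;
       the wrap rule is plain user code, not part of the type. */
    uint16_t next_in_range(uint16_t x) {
        return (x >= 10100) ? 10050 : (uint16_t)(x + 1);
    }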

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Fri Sep 3 09:10:43 2021
    On 2021-09-03 00:48, Bart wrote:

    OK. But my type was specifically 50 to 100. But let's change it to 10050
    to 10100, and I want 10100+1 to wrap to 10050. Now you additionally have
    to consider whether such a type can be represented within 8 bits (or
    possibly 6), and whether it will need 16 bits.

    My decision? Just use u16! And let user-code take care of any wrapping,
    like it has to for most things.

    This is a different topic. No language can provide a type algebra
    covering all possible cases. This is a case for an ADT. I know, your
    language does not support ADTs. But in a language like Ada, one would
    design an ADT with the desired properties. The client code would not
    have to care about any of this, because wrapping would be done by the
    ADT operations.

    ADTs and type systems have just one goal, namely code reuse. When you
    say "the client must", you have lost. The provider must. In software
    development we invest in design, because design is done and tested once
    while used thousands of times in the client code.
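
    Sketched in C for contrast (C has no private types, so this only
    approximates an ADT; the names are illustrative): the wrap rule lives
    in the type's operations, and client code never repeats it:

    #include <stdint.h>

    typedef struct { uint16_t v; } Counter;  /* invariant: 10050 <= v <= 10100 */

    Counter counter_make(void) { return (Counter){10050}; }

    /* Add n (n >= 0), wrapping within the 51-value range; clients
       just call this and never see the wrap logic. */
    Counter counter_add(Counter c, int n) {
        c.v = (uint16_t)(10050 + (c.v - 10050 + n) % 51);
        return c;
    }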

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Fri Sep 3 09:23:56 2021
    On 2021-09-03 00:08, Bart wrote:
    On 02/09/2021 18:02, Dmitry A. Kazakov wrote:
    On 2021-09-02 18:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:

    And?

    OK. So according to you these types are no problem at all. You can
    use a u17 type just as easily as a 16-bit or 32-bit type.

    Right.

    OK.... But since you haven't shown any real code for a real processor
    based on some detailed proposal of the semantics, I don't believe you.

       with Ada.Text_IO; use Ada.Text_IO;
       procedure Test is
          type u17_2 is mod 2**17 with Size => 17;
          type Packed is record
             H : Boolean;
             I : u17_2;
             J : Boolean;
          end record with Pack => True;
          Z : Packed := (False, 123, True);
       begin
          Put_Line ("Z.I'Size=" & Integer'Image (Z.I'Size));
          Put_Line ("Increment=" & u17_2'Image (Z.I + 1));
       end Test;

    Prints:

       Z.I'Size= 17
       Increment= 124

    I leave disassembling to you as an exercise.

    If you don't have such rules, then something like this becomes
    ambiguous:

        byte a := 255
        print a + 1

    Does this display 0, 256, or report a runtime error?

    I don't know what byte is here. You must annotate the exact semantics.

    Provided print takes byte as the argument, then

    1. If byte is an integer range -256..255, the result is an exception.

    2. If byte is mod 256, the result is 0.

    3. If byte is a memory unit octet, the result is a type error.

    In my language, the printed value is 256.

    I have no doubt that your language is broken.

    Well I make the rules for it so how can it be broken?

    Broken rules, broken language.

    I do not know what "3-digit" means when you attach it to "decimal
    number."

    It's a number with 3 digits,

    Numbers have no digits.

    Mine do.

    I admire your megalomania, but no, you do not own numbers.

    But ... why am I forced to explain this stuff, what are you, 7?

    Let me guess, you skipped lessons?

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Fri Sep 3 12:34:23 2021
    On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
    On 2021-09-03 00:08, Bart wrote:
    On 02/09/2021 18:02, Dmitry A. Kazakov wrote:
    On 2021-09-02 18:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:

    And?

    OK. So according to you these types are no problem at all. You can
    use a u17 type just as easily as a 16-bit or 32-bit type.

    Right.

    OK.... But since you haven't shown any real code for a real processor
    based on some detailed proposal of the semantics, I don't believe you.

       with Ada.Text_IO; use Ada.Text_IO;
       procedure Test is
          type u17_2 is mod 2**17 with Size => 17;
          type Packed is record
             H : Boolean;
             I : u17_2;
             J : Boolean;
          end record with Pack => True;
          Z : Packed := (False, 123, True);
       begin
          Put_Line ("Z.I'Size=" & Integer'Image (Z.I'Size));
          Put_Line ("Increment=" & u17_2'Image (Z.I + 1));
       end Test;

    Prints:

       Z.I'Size= 17
       Increment= 124

    I leave disassembling to you as an exercise.

    Why me? You're the one with an Ada compiler (but see below). I can
    emulate that record using:

        record Packed =
            int32 cont : (h:1, i:17, j:1)
        end

    If I do Z.i := Z.i + 1, my compiler generates 11 instructions, shown
    below (for a record which starts at a byte boundary).

    With a conventional u32 field, it can be done with one instruction
    (especially if I use ++Z.i), but I generate 2-3 instructions, I think
    because it's faster.

    The debate is not about whether it's possible at all. It is:

    * Should this be done using the type system as in Ada

    * What are the exact rules for laying these out in memory, especially consecutive variables.

    * Does the language strictly treat these as separate types as in Ada,
    which will stop you adding a u17 type to a u18 type, or probably even
    two different 17-bit types? Which makes using the language a nightmare.

    * Ada apparently allows packed arrays of such a type too; the mind
    boggles as to what code would be produced for array indexing or for
    slicing, since each element would be at a different byte and bit alignment.

    * What are the limits for these types? Ada stops at 64 bits. My language
    ought to allow up to 128-bit bitfields, but I only implement up to 64
    because the amount of support needed would be huge, for very little
    benefit. James's proposal allows any size from 1 to 128 and beyond,
    where I guess bit-packing is of less importance.

    My view is that it makes the language too complicated, too hard to
    implement, and too hard for a user to understand what trade-offs are
    being done behind the scenes.

    If you look at my example, there it is clear that the h, i, j fields are
    inside a container field whose behaviour is well understood. In my
    language also, all such fields are just integers with no restriction on
    what you can do with those values. There are just limits on the range
    they can represent.

    You (DAK) may not like that, but so what?


    My code for Z.i:=Z.i+1 (64-bit):
    -----------------------------------------------------
    movsx D0, word32 [Dframe+start.z]
    shr D0, 1
    and D0, 131071
    inc D0
    lea D1, [Dframe+start.z]
    movsx D3, word32 [D1]
    mov D2, -262143
    shl D0, 1
    and D3, D2
    or D3, D0
    mov [D1], A3            # Dregs are 64 bits, Aregs are 32
    -----------------------------------------------------


    Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
    eliminates all the code) (looks like 32-bit):
    -----------------------------------------------------
    mov eax, DWORD PTR [rbp-12] # tmp105, z
    shr eax # _1
    and eax, 131071 # _1,
    add eax, 1 # _3,
    and eax, 131071 # _4,
    and eax, 131071 # tmp107,
    lea edx, [rax+rax] # tmp108,
    mov eax, DWORD PTR [rbp-12] # tmp109,
    and eax, -262143 # tmp110,
    or eax, edx # tmp111, tmp108
    mov DWORD PTR [rbp-12], eax #, tmp111
    -----------------------------------------------------

    If I change Z.I to a u32 field (and do the same for Z.H and Z.J so that
    it is not sandwiched between two 1-bit fields), then the code becomes:

    mov eax, DWORD PTR [rbp-28] # _1, z.F.i
    add eax, 1 # _2,
    mov DWORD PTR [rbp-28], eax # z.F.i, _2


    Well I make the rules for it so how can it be broken?

    Broken rules, broken language.


    So having 255 + 1 = 256 is broken?!

    In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
    store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!

    If I try the same in my language:

        record Packed =
            int32 cont : (x:2, y:2)
        end

        Packed z

        z.x:=2
        z.y:=2

        fprintln "# + # is #", z.x, z.y, z.x+z.y

    It says:

    2 + 2 is 4

    I think I'll stick with mine, thank you very much!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Fri Sep 3 15:05:17 2021
    On 2021-09-03 13:34, Bart wrote:

    The debate is not about whether it's possible at all. It is:

    Then why did you keep asking how to do it?

    So having 255 + 1 = 256 is broken?!

    Yes, when the value 256 does not belong to the type.

    In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
    store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!

    Which is exactly right.

    2 + 2 = 0 (mod 4)

    You did not read the Wikipedia and Wolfram articles I posted links to,
    did you?

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Fri Sep 3 15:19:43 2021
    On 03/09/2021 14:05, Dmitry A. Kazakov wrote:
    On 2021-09-03 13:34, Bart wrote:

    The debate is not about whether it's possible at all. It is:

    Then why did you keep asking how to do it?

    Because you keep saying that these odd types are just as efficient and
    as easy to implement as machine-friendly types. I posted the code that
    you should have posted.


    So having 255 + 1 = 256 is broken?!

    Yes, when the value 256 does not belong to the type.

    In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
    store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!

    Which is exactly right.

      2 + 2 = 0 (mod 4)

    OK, I will get rid of that mod type and use this:

    type u2 is range 0..3;

    Now 2 + 2 = 4 even though I'm adding two u2 types and even though I'm
    using u2'Image on that 4.

    Exactly the same as my language, which you said was broken. Here, run this:

       with Ada.Text_IO; use Ada.Text_IO;
       procedure Test is
          type byte is range 0..255;
          Z : byte := 255;
       begin
          Put_Line (byte'Image (Z+1));
       end Test;

    This displays 256, which you said was wrong for my language:

    byte Z:=255
    println Z+1

    But look closely at the Ada: it is using 'byte' in the print line, yet
    it displays a value that is not a byte value. If I do:

    println byte(Z+1)

    it shows 0!

    I don't think it's my language that's broken....

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Fri Sep 3 17:19:18 2021
    On 03/09/2021 13:34, Bart wrote:
    On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
    On 2021-09-03 00:08, Bart wrote:


    Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
    eliminates all the code) (looks like 32-bit):

    It would be helpful here if Dmitry wrote some Ada code that has
    side-effects, but does not include such big functions as printing, nor
    should it be a complete program.

    For example, if I wanted to show how a C compiler would handle
    incrementing a 32-bit int, I could write:

    #include <stdint.h>

    volatile int32_t vx, vy;

    void foo1(void) {
        int32_t x = vx;
        x++;
        vy = x;
    }

    int32_t foo2(int32_t a) {
        int32_t x = a;
        x++;
        return x;
    }

    Whether "foo1" or "foo2" styles are used is not that important. What
    /is/ important is that you can run the compiler with optimisation
    enabled, since comparing unoptimised code for quality, size or speed is
    utterly meaningless. And the interesting code should not be swamped
    with Put_Line, printf, and the like. And of course variations such as
    "mod 2**17" and "mod 2**17 with Size => 17" need to be in separate
    functions so that they can be compared.

    And ideally, these should be given as godbolt links, such as <https://godbolt.org/z/Pah84b9c9>. That way, we can all look at the
    source code, look at the generated code, and play around with it.




    Well I make the rules for it so how can it be broken?

    Broken rules, broken language.


    So having 255 + 1 = 256 is broken?!

    Yes, if the type in question is "unsigned modulo 2 ** 8". Then the
    correct result for 255 + 1 is 0.

    That's what you expect from code in C such as :

    (uint8_t) ((uint8_t) 255 + (uint8_t) 1)

    or :

    uint8_t a = 255;
    uint8_t b = 1;
    uint8_t c = a + b;


    (It's perhaps worth pointing out that some of C's integer promotion
    rules mean that this kind of code, using casting or assignment, does not
    always work as expected. In particular, if you multiply unsigned types
    that are smaller than int, it is possible to get undefined behaviour
    when those unsigned types are first promoted to signed integers, which
    might overflow.)
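
    For instance, assuming 32-bit int, this is the classic case:

    #include <stdint.h>

    /* a and b are promoted to signed int before the multiply, so
       0xFFFF * 0xFFFF overflows int: undefined behaviour despite
       both operands being unsigned. */
    uint32_t mul16(uint16_t a, uint16_t b) {
        return a * b;
    }

    /* Forcing one operand to uint32_t keeps the multiply unsigned. */
    uint32_t mul16_safe(uint16_t a, uint16_t b) {
        return (uint32_t)a * b;
    }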


    In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
    store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!


    And the Ada code would be correct. 2 + 2 mod 4 is 0.

    If I try the same in my language:

        record Packed =
            int32 cont : (x:2, y:2)
        end

        Packed z

        z.x:=2
        z.y:=2

        fprintln "# + # is #", z.x, z.y, z.x+z.y

    It says:

        2 + 2 is 4

    I think I'll stick with mine, thank you very much!


    Your code is wrong - /if/ you have types that have the same specific
    semantics, such as modulo wrapping.

    You appear to have a kind of integer promotion semantics, not unlike C.
    I don't know the details of how your language works here.

    In C, if you write "a = b + c;", you are saying (assuming integers):

    1. Take the value of b, convert it to an "int" if it is a smaller type.
    Call this "b1", of type B1.
    2. Take the value of c, convert it to an "int" if it is a smaller type.
    Call this "c1", of type C1.
    3. Figure out the lowest common denominator type T that is at least as
    many bits as B1 and C1. If B1 and C1 have the same signedness, T has
    that signedness - if not, T is signed.
    4. Do the addition as type T, giving result "x". If T is signed, ignore
    the possibility of overflow. If T is unsigned, use wrapping semantics.
    5. Convert "x" to the type of "a". If "a" is unsigned, use wrapping
    semantics. If "a" is signed, use whatever semantics the implementation
    says you should use.
    6. Set "a" to that final converted value.


    In Ada, it is quite a lot simpler:

    1. If a, b and c don't have compatible types, it's a compiler error.
    2. Evaluate "b + c" according to the semantics of their type. That
    might mean modulo wrapping, or throwing an exception on overflow -
    whatever the programmer has asked for when giving the types.
    3. Assign the result to "a".


    I'm guessing that your language is basically like C, except that I
    expect you to have wrapping semantics for signed types, and your "int"
    is 64-bit on 64-bit systems. Is that right?


    While I am not a big fan of Ada in general - it's far too wordy for my
    tastes - I think its arithmetic semantics are clearer, more consistent,
    and more flexible. I'd rather C did not have integer promotion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Fri Sep 3 17:17:30 2021
    On 2021-09-03 16:19, Bart wrote:
    On 03/09/2021 14:05, Dmitry A. Kazakov wrote:
    On 2021-09-03 13:34, Bart wrote:

    The debate is not about whether it's possible at all. It is:

    They why were you keeping asking how to do it?

    Because you keep saying that these odd types are just as efficient and
    as easy to implement as machine-friendly types.

    Nobody said anything about efficiency, except that the compiler's
    implementation will always be more efficient than the programmer's.

    So having 255 + 1 = 256 is broken?!

    Yes when the value 256 does not belong to the type.

    In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
    store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!

    Which is exactly right.

       2 + 2 = 0 (mod 4)

    OK, I will get rid of that mod type and use this:

       type u2 is range 0..3;

    Now 2 + 2 = 4 even though I'm adding two u2 types and even though I'm
    using u2'Image on that 4.

    Exactly the same as my language, which you said was broken. Here, run this:

     with Ada.Text_IO; use Ada.Text_IO;
     procedure Test is
         type byte is range 0..255;
         Z: byte := 255;
     begin
        Put_Line (byte'Image (Z+1));
     end Test;

    This displays 256,

    This is correct because the attribute Image is defined on byte'Base,
    and byte'Base is a different [sub]type.

    Try this instead:

       with Ada.Text_IO; use Ada.Text_IO;
       procedure Test is
          type byte is range 0..255;
          Z : byte := 255;
       begin
          Z := Z + 1;
          Put_Line (byte'Image (Z));
       end Test;

    See, you get an error, because Z is 0..255.

    Now try this:

       with Ada.Text_IO; use Ada.Text_IO;
       procedure Test is
          type Byte is range 0..255;
          Z : Byte := 255;
       begin
          Put_Line (Integer'Image (Byte'Base'Size));
       end Test;

    You will get 16, because the compiler selected a 16-bit integer as the
    base type for byte. It must do this because the language requires that
    the base type of an integer type include a range symmetric around zero.

    Put_Line (byte'Image (Z+1));

    Means, fully attributed:

    Put_Line (byte'base'Image (byte'base(Z) + byte'base'(1)));

    As I said before, all computations are made in the machine type
    byte'Base. So Z+1 computes. Checks fire only on parameter passing and
    assignment. In this case there are no checks because byte'Image is
    defined on byte'Base.

    You could force the check in this small modification:

       with Ada.Text_IO; use Ada.Text_IO;
       procedure Test is
          type Byte is range 0..255;
          Z : Byte := 255;
       begin
          Put_Line (Byte'Image (Byte'(Z+1)));
       end Test;

    Here I required the expression Z+1 to be of the type byte. Now this
    will fail as expected.

    As I always say to you and to James, specify the types, the rest will
    clarify itself.

    which you said was wrong for my language:

       byte Z:=255
       println Z+1

    I said, *if* println is defined on byte. Is it? In the case of Ada it is
    *not*.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Fri Sep 3 18:30:42 2021
    On 03/09/2021 16:19, David Brown wrote:
    On 03/09/2021 13:34, Bart wrote:
    On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
    On 2021-09-03 00:08, Bart wrote:


    Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
    eliminates all the code) (looks like 32-bit):

    It would be helpful here if Dmitry wrote some Ada code that has
    side-effects, but does not include such big functions as printing, nor
    should it be a complete program.

    For example, if I wanted to show how a C compiler would handle
    incrementing a 32-bit int, I could write:

    #include <stdint.h>

    volatile int32_t vx, vy;

    void foo1(void) {
        int32_t x = vx;
        x++;
        vy = x;
    }

    int32_t foo2(int32_t a) {
        int32_t x = a;
        x++;
        return x;
    }

    Whether "foo1" or "foo2" styles are used is not that important. What
    /is/ important is that you can run the compiler with optimisation
    enabled, since comparing unoptimised code for quality, size or speed is utterly meaningless.

    Extracting and inserting bitfields will always have extra bit-twiddling.
    And there will be extra rules as to how they will be laid out.

    So having 255 + 1 = 256 is broken?!

    Yes, if the type in question is "unsigned modulo 2 ** 8". Then the
    correct result for 255 + 1 is 0.

    The type of what? "+" is defined in my language between two matching
    types from this set (pointers not shown):

    i64 u64 r32 r64 i128 u128

    Operands of other types will be converted to suitable types as needed,
    but the result will be one of these as well.

    (In-place "+", ie. augmented assignment, has different rules.)

    I think I'll stick with mine, thank you very much!


    Your code is wrong - /if/ you have types that have the same specific semantics, such as modulo wrapping.

    Yes it has modulo wrapping for unsigned types. But that happens at this
    point:

    18446744073709551615 + 1

    Those 8-bit or 2-bit values are promoted to u64, and both 255 and 3 are
    a long way below that threshold.

    In Ada, it is quite a lot simpler:

    1. If a, b and c don't have compatible types, it's a compiler error.

    I use this approach for more elaborate types, such as user-defined types.

    Other languages make it more complicated with classes and inheritance
    and overloading of multi-parameter functions. Then it makes simple
    numeric promotions seem like child's play!

    When I'm working with just numbers, then I don't need restrictions:

    x := 1e1'000'000L
    y := 3

    println x
    println x + y.[0]

    (Dynamic code.) Here I'm adding x, a 1 million-digit number, to the
    value of the least significant bit of y. The output is:

    1.e1000000
    1000 .... 0001

    No fuss. Ada on the other hand would probably have kittens.


    I'm guessing that your language is basically like C, except that I
    expect you to have wrapping semantics for signed types, and your "int"
    is 64-bit on 64-bit systems. Is that right?

    Yes. In the past I didn't have promotions. So if A and B were byte u8
    types, then arithmetic was done at 8 bits, and 255 + 1 would give 0.

    This was useful on smaller devices where 8-bit add was more efficient
    than 16 bits or more.

    It's not so useful now, in an 'open' expression when the result is not
    put back into a byte type, but used more generally. And it gives
    surprising results with:

    A + 1

    when A is 255 for example.

    (Also, I got the impression that not all processors support arithmetic
    on narrower types.)

    While I am not a big fan of Ada in general - it's far too wordy for my
    tastes - I think its arithmetic semantics are clearer, more consistent,
    and more flexible.

    Ada is just too heavy-going and much too strict for my taste. Its type
    system would mean stopping to do battle with the language every five
    minutes, and creating ever more elaborate workarounds when you've tied
    yourself up in knots.

    I had enough trouble with my 'byte' and 'char' types. The casts needed
    to convert between ref byte and ref char started to poison everything.
    In the end I just allowed implicit conversions between them.
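
    The C equivalent of that friction, for what it's worth (an illustrative
    helper, not my actual code):

    #include <stdio.h>
    #include <string.h>

    /* char* and unsigned char* are distinct pointer types in C too,
       so a cast is needed at every call site - and it spreads. */
    static unsigned checksum(const unsigned char *p, size_t n) {
        unsigned sum = 0;
        while (n--)
            sum += *p++;
        return sum;
    }

    int main(void) {
        const char *msg = "hello";
        printf("%u\n", checksum((const unsigned char *)msg, strlen(msg)));
        return 0;
    }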

    Maybe it will suit your work (why don't you use it anyway?). But it
    doesn't suit my stuff.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Sat Sep 4 13:04:45 2021
    On 03/09/2021 19:30, Bart wrote:
    On 03/09/2021 16:19, David Brown wrote:
    On 03/09/2021 13:34, Bart wrote:
    On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
    On 2021-09-03 00:08, Bart wrote:


    Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
    eliminates all the code) (looks like 32-bit):

    It would be helpful here if Dmitry wrote some Ada code that has
    side-effects, but does not include such big functions as printing, nor
    should it be a complete program.

    For example, if I wanted to show how a C compiler would handle
    incrementing a 32-bit int, I could write:

    #include <stdint.h>

    volatile int32_t vx, vy;

    void foo1(void) {
         int32_t x = vx;
         x++;
         vy = x;
    }

    int32_t foo2(int32_t a) {
         int32_t x = a;
         x++;
         return x;
    }

    Whether "foo1" or "foo2" styles are used is not that important.  What
    /is/ important is that you can run the compiler with optimisation
    enabled, since comparing unoptimised code for quality, size or speed is
    utterly meaningless.

    Extracting and inserting bitfields will always have extra bit-twiddling.

    Usually that is the case. But these 17-bit modulo integers are not
    quite the same as bitfields, because they would not normally be packed.
    So you don't need anything extra to extract them, and you don't always
    need masks or modulo instructions when storing them. For example, after bit-wise logical operations, there is no masking needed, or if the
    compiler knows enough about the actual ranges to be sure there can be no wrapping.
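
    In C terms (a sketch, assuming both values are already below 2^17):

    #include <stdint.h>

    #define MASK17 0x1FFFFu

    /* OR cannot set a bit above bit 16 if neither operand has one,
       so no re-masking is needed. */
    uint32_t or17(uint32_t x, uint32_t y)  { return x | y; }

    /* Addition can carry out of bit 16, so it still needs the AND. */
    uint32_t add17(uint32_t x, uint32_t y) { return (x + y) & MASK17; }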

    And there will be extra rules as to how they will be laid out.


    Yes, you need rules for that.

    So having 255 + 1 = 256 is broken?!

    Yes, if the type in question is "unsigned modulo 2 ** 8".  Then the
    correct result for 255 + 1 is 0.

    The type of what? "+" is defined in my language between two matching
    types from this set (pointers not shown):

      i64 u64 r32 r64 i128 u128


    That is basically like C. Ada is different. (As is C++, when you use
    your own types.) In Ada, if you take two uint8_t (or whatever they are
    called in Ada) variables and add them, you are doing "uint8_t addition",
    not promoted integer addition.

    Operands of other types will be converted to suitable types as needed,
    but the result will be one of these as well.

    (In-place "+", ie. augmented assignment, has different rules.)

    I think I'll stick with mine, thank you very much!


    Your code is wrong - /if/ you have types that have the same specific
    semantics, such as modulo wrapping.

    Yes it has modulo wrapping for unsigned types. But that happens at this point:

      18446744073709551615 + 1

    Those 8-bit or 2-bit values are promoted to u64, and both 255 and 3 are
    a long way below that threshold.


    Different languages, different rules.

    I don't think it is appropriate to say that one way is "right" and the
    other is "broken" - both are "broken" when viewed with the other set of
    rules.

    In Ada, it is quite a lot simpler:

    1. If a, b and c don't have compatible types, it's a compiler error.

    I use this approach for more elaborate types, such as user-defined types.

    Other languages make it more complicated with classes and inheritance
    and overloading of multi-parameter functions. Then it makes simple
    numeric promotions seem like child's play!

    When I'm working with just numbers, then I don't need restrictions:

        x := 1e1'000'000L
        y := 3

        println x
        println x + y.[0]

    (Dynamic code.) Here I'm adding x, a 1 million-digit number, to the
    value of the least significant bit of y. The output is:

        1.e1000000
        1000 .... 0001

    No fuss. Ada on the other hand would probably have kittens.


    Ada is designed to be strict - you get exactly what you ask for, but you
    have to ask exactly for what you want. The idea is that it should be
    very difficult to write (and compile) incorrect code, at the cost of
    making it a bit more difficult to write correct code. Your language is
    at the other end of the scale, where you make it as easy as possible to
    write correct code, at the cost of making it easier to write incorrect
    code. There is room in the world for different languages along that scale.


    I'm guessing that your language is basically like C, except that I
    expect you to have wrapping semantics for signed types, and your "int"
    is 64-bit on 64-bit systems.  Is that right?

    Yes. In the past I didn't have promotions. So if A and B were byte u8
    types, then arithmetic was done at 8 bits, and 255 + 1 would give 0.

    This was useful on smaller devices where 8-bit add was more efficient
    than 16 bits or more.

    It's not so useful now, in an 'open' expression when the result is not
    put back into a byte type, but used more generally. And it gives
    surprising results with:

      A + 1

    when A is 255 for example.

    That will depend on how you view the type of literals like "1".


    (Also, I got the impression that not all processors support arithmetic
    on narrower types.)

    The implementation here is of secondary importance. The vital part is
    how the semantics of the language are defined - without that part being
    clear, there is no point in the language. It is, of course, useful to
    consider typical targets when you define the features of the language.
    But if it is to be a portable high-level language, then the semantics
    are the prime concern.


    While I am not a big fan of Ada in general - it's far too wordy for my
    tastes - I think its arithmetic semantics are clearer, more consistent,
    and more flexible.

    Ada is just too heavy-going and much too strict for my taste. Its type
    system would mean stopping to do battle with the language every five
    minutes, and creating ever more elaborate workarounds when you've tied yourself up in knots.


    As I say, there's room for different types of language. Trying to force aspects of one style of language onto a language of different style is
    going to be trouble. You don't use a forgiving, easy-to write language
    like Python when you are making a car engine controller or medical
    equipment. You don't use a strict, hard-to-write language like Ada when
    making a little utility program on a PC. There is also room for
    different kinds of programmers - someone who feels "workarounds" are appropriate should not be doing the kind of programming where Ada is a
    typical language choice, whereas someone who insists that every function written has full documentation and unit tests and is reviewed by at
    least two independent groups is not going to work on a task where time-to-market and development costs are more important than quality.

    I had enough trouble with my 'byte' and 'char' types. The casts needed
    to convert between ref byte and ref char started to poison everything.
    In the end I just allowed implicit conversions between them.

    Maybe it will suit your work (why don't you use it anyway?). But it
    doesn't suit my stuff.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Sat Sep 4 13:35:38 2021
    On 04/09/2021 12:04, David Brown wrote:
    On 03/09/2021 19:30, Bart wrote:

    Yes it has modulo wrapping for unsigned types. But that happens at this
    point:

      18446744073709551615 + 1

    Those 8-bit or 2-bit values are promoted to u64, and both 255 and 3 are
    a long way below that threshold.


    Different languages, different rules.

    I don't think it is appropriate to say that one way is "right" and the
    other is "broken" - both are "broken" when viewed with the other set of rules.

    I just have a different way of treating numeric types. So i64 is a
    signed integer type, and i8 i16 i32 are just narrower, storage versions
    of the /same type/.

    The same as when you fill in a value in your tax return in a space with,
    say, 6 boxes, to allow quantities up to 999,999. Inserting or extracting
    values doesn't mean a change of type or meaning. It was just more
    practical than providing an infinitely long entry field, especially on a
    paper form.

    Bitfields of any integer value are the same, although I treat them as
    subsets of u64. (Or u128, except I will probably never get around to
    supporting bitfields over 64 bits.)

    (There are also bit-slices, in my other language, but these are not
    integers at all; they're arrays, even if they have exactly 32 elements,
    but they could be a billion bits wide. They'd need conversion into a
    numeric type.)

    When I'm working with just numbers, then I don't need restrictions:

        x := 1e1'000'000L
        y := 3

        println x
        println x + y.[0]

    No fuss. Ada on the other hand would probably have kittens.


    Ada is designed to be strict - you get exactly what you ask for, but you
    have to ask exactly for what you want. The idea is that it should be
    very difficult to write (and compile) incorrect code, at the cost of
    making it a bit more difficult to write correct code. Your language is
    at the other end of the scale, where you make it as easy as possible to
    write correct code, at the cost of making it easier to write incorrect
    code.

    For the static language, yes. For the dynamic one, in which I wrote that example (it was not worthwhile to add arbitrary precision into the
    other), it actually does quite a lot of runtime checks, while still
    letting you do lots of naughty or underhand things if you want:

        function peek(addr, t=byte)=
            return makeref(addr,t)^
        end

        print peek(0x400'000):"h"

    (This shows '4d', which is the first byte of the Windows PE/exe file
    format, in this case corresponding to the loaded image of the
    interpreter program, which in Windows are always loaded at virtual
    address 0x400000.)

    It's not so useful now, in an 'open' expression when the result is not
    put back into a byte type, but used more generally. And it gives
    surprising results with:

      A + 1

    when A is 255 for example.

    That will depend on how you view the type of literals like "1".

    Exactly. There can be a surprising amount of confusion about the types
    of literals.

    Because of the way my language treats mixed signed/unsigned arithmetic
    (as signed operations), and because literals 0 to 2**63-1 are signed,
    then this can give unexpected results here:

    u64 A = 0xC000'0000'0000'0000

    if A > 1 then ...

    A is treated as signed, so a negative value in this case: it will give
    the opposite behaviour to what is expected (A <= 1). So I had to
    introduce some special rules for certain operators such as relative
    compares:

        A > B         Not allowed for mixed types (cast required)
        A > 1         When A is unsigned then so is the literal

    (Yeah, I got bitten by this.... But it shows I can recognise a potential
    source of bugs and can do something about it)

    As I say, there's room for different types of language.

    Not according to DAK. There could only be one language (ideally a lot
    better than Ada, but Ada will do if the ideal doesn't exist); it doesn't
    matter how big or complex or slow to compile or inefficient it is; and
    it can only employ the latest concepts in distributed computing (or
    whatever he was on about) in every application, even if it doesn't need it.


    You don't use a forgiving, easy-to write language
    like Python when you are making a car engine controller or medical
    equipment.

    Apparently it's used a lot in all sorts of 'enterprise' areas like
    finance, or apps like Instagram.

    But I don't like it (mostly the language, but also 'Pythonistas').

    You don't use a strict, hard-to-write language like Ada when
    making a little utility program on a PC.

    See above!

    There is also room for
    different kinds of programmers - someone who feels "workarounds" are appropriate should not be doing the kind of programming where Ada is a typical language choice, whereas someone who insists that every function written has full documentation and unit tests and is reviewed by at
    least two independent groups is not going to work on a task where time-to-market and development costs are more important than quality.

    Agreed (for a change).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Sat Sep 4 15:39:39 2021
    On 04/09/2021 14:35, Bart wrote:
    On 04/09/2021 12:04, David Brown wrote:
    On 03/09/2021 19:30, Bart wrote:

    Ada is designed to be strict - you get exactly what you ask for, but you
    have to ask exactly for what you want.  The idea is that it should be
    very difficult to write (and compile) incorrect code, at the cost of
    making it a bit more difficult to write correct code.  Your language is
    at the other end of the scale, where you make it as easy as possible to
    write correct code, at the cost of making it easier to write incorrect
    code.

    For the static language, yes. For the dynamic one, in which I wrote that example (it was not worthwhile to add arbitrary precision into the
    other), it actually does quite a lot of runtime checks, while still
    letting you do lots of naughty or underhand things if you want:


    Strict and safe programming languages aim to catch problems at
    compile-time, not just run-time. They usually have lots of run-time
    checks available too (with choices for balance between debug aids and
    run-time costs). The old joke about Ada programming is if you can get
    it to compile, it is ready to ship.


    It's not so useful now, in an 'open' expression when the result is not
    put back into a byte type, but used more generally. And it gives
    surprising results with:

       A + 1

    when A is 255 for example.

    That will depend on how you view the type of literals like "1".

    Exactly. There can be a surprising amount of confusion about the types
    of literals.

    Because of the way my language treats mixed signed/unsigned arithmetic
    (as signed operations), and because literals 0 to 2**63-1 are signed,
    then this can give unexpected results here:

        u64 A = 0xC000'0000'0000'0000

        if A > 1 then ...

    A is treated as signed, so a negative value in this case: it will give
    the opposite behaviour to what is expected (A <= 1). So I had to
    introduce some special rules for certain operators such as relative
    compares:

       A > B         Not allowed for mixed types (cast required)
       A > 1         When A is unsigned then so is the literal

    (Yeah, I got bitten by this.... But it shows I can recognise a potential source of bugs and can do something about it)

    Ada also solves this kind of problem by not allowing comparisons between different types. (I don't know how it handles literals - that's beyond
    my rather limited knowledge of the language.)

    C, as you know, has its integer promotion rules - and since "A" is
    unsigned 64-bit (probably therefore "unsigned long long int"), "1"
    starts off as a plain "int" but gets promoted to "unsigned long long
    int". The comparison is unsigned.

    It does not really matter if a language says unsigned types sometimes
    get treated as signed, or signed types sometimes get treated as unsigned
    - both are going to get things wrong at times. The only promotions that
    make sense are ones that guarantee to preserve all possible values -
    promotions that change signedness should therefore often also change size.

    I find a reasonable compromise in C (and C++) using gcc with
    "-Wsign-compare". If the mixed-sign comparison could give the wrong
    answer, you get a warning - if the comparison is safe (like "A > 1"),
    there is no warning.
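
    For example (C, compiled with gcc -Wsign-compare or -Wextra):

    #include <stdio.h>

    int main(void) {
        unsigned int u = 0;
        int s = -1;
        /* gcc warns here: s is converted to unsigned, so -1 becomes
           UINT_MAX and the comparison is true. */
        if (u < s)
            printf("surprising: 0 < -1 after conversion\n");
        /* No warning for u > 1: the literal 1 converts safely. */
        if (u > 1)
            printf("not printed\n");
        return 0;
    }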

    I can appreciate your intentions about having the type of the literal
    change according to its use, but to me that would not be an advantage -
    it's too easy to be inconsistent, or for the rules to be unclear. I
    like that in C and C++, the type of an expression and the way it is
    evaluated is determined solely by the contents of the expression, not
    how it is used in a wider expression. That applies to literals too.


    As I say, there's room for different types of language.

    Not according to DAK. There could only be one language (ideally a lot
    better than Ada, but Ada will do if the ideal doesn't exist); it doesn't matter how big or complex or slow to compile or inefficient it is; and
    it can only employ the latest concepts in distributed computing (or
    whatever he was on about) in every application, even if it doesn't need it.


    You don't use a forgiving, easy-to write language
    like Python when you are making a car engine controller or medical
    equipment.

    Apparently it's used a lot in all sorts of 'enterprise' areas like
    finance, or apps like Instagram.


    For some applications, "tested correct" is appropriate - and Python is
    fine for that. For others, "designed correct" is required, and then
    Python quickly becomes less feasible. And for a few, "proven correct"
    is required - then you need something like SPARK (a subset of Ada). If
    you tried to write Instagram for mobiles in SPARK, you'd never be
    finished - the world would have moved on to holographic implants instead
    of mobiles before your proofs were complete.

    But I don't like it (mostly the language, but also 'Pythonistas').

    You don't use a strict, hard-to-write language like Ada when
    making a little utility program on a PC.

    See above!

    There is also room for
    different kinds of programmers - someone who feels "workarounds" are
    appropriate should not be doing the kind of programming where Ada is a
    typical language choice, whereas someone who insists that every function
    written has full documentation and unit tests and is reviewed by at
    least two independent groups is not going to work on a task where
    time-to-market and development costs are more important than quality.

    Agreed (for a change).


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sat Sep 4 17:16:17 2021
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons between different types. (I don't know how it handles literals - that's beyond
    my rather limited knowledge of the language.)

    When operations can be overloaded in the result type, that simplifies
    a lot. Literals are semantically overloaded parameterless functions:
    the literal 1 has an Integer overload, an Unsigned_16 one, a
    Long_Integer one, a My_Custom_Integer one, etc.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 10:54:21 2021
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons between
    different types.  (I don't know how it handles literals - that's beyond
    my rather limited knowledge of the language.)

    When operations can be overloaded in the result type, that simplifies
    a lot. Literals are semantically overloaded parameterless functions:
    the literal 1 has an Integer overload, an Unsigned_16 one, a
    Long_Integer one, a My_Custom_Integer one, etc.


    I'm not very keen on overloading in the result type - it feels to me
    that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but with completely different effects.

    I must admit I haven't tried result type overloading (I've only played
    very briefly with Ada). But I'm sceptical.


    (It's fine for the compiler to optimise expression evaluation depending
    on the return type, but that's another matter. That affects the
    efficiency of the code, but not the semantics or the results.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sun Sep 5 11:39:30 2021
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons between
    different types.  (I don't know how it handles literals - that's beyond
    my rather limited knowledge of the language.)

    When operations can be overloaded in the result type, that simplifies
    a lot. Literals are semantically overloaded parameterless functions:
    the literal 1 has an Integer overload, an Unsigned_16 one, a
    Long_Integer one, a My_Custom_Integer one, etc.

    I'm not very keen on overloading in the result type - it feels to me
    that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but with completely different effects.

    Why:

       declare
          X : T;
          Y : S;
       begin
          Foo (X);
          Foo (Y);

    is OK, but

       declare
          X : T := Create;
          Y : S := Create;
       begin

    is not?

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    Same with string and character literals. You just write:

       S1 : String           := "abc"; -- Latin-1
       S2 : Wide_String      := "abc"; -- UCS-2
       S3 : Wide_Wide_String := "abc"; -- UCS-4
    begin
       if S2(3) = 'c' then -- Here 'c' is resolved to 16-bit character
          ...

    I must admit I haven't tried result type overloading (I've only played
    very briefly with Ada). But I'm sceptical.

    It is an arbitrary limitation made by lazy compiler writers (we know
    some (:-)). Without result overloading they can resolve all types
    strictly bottom-up.

    BTW, there is a similar case with overriding. In C++ only the first
    [hidden] argument supports overriding. In Ada terms it is a controlled argument. In Ada any argument and/or the result can be controlled. So
    you could dispatch on the result.
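
A minimal sketch of what dispatching on the result looks like (the
Shapes, Circle and Make names here are invented for illustration, and
this assumes Ada 2005 or later):

    with Ada.Text_IO; use Ada.Text_IO;

    procedure Demo is
       package Shapes is
          type Shape is tagged null record;
          function Make return Shape;        -- controlling result
          type Circle is new Shape with null record;
          overriding function Make return Circle;
       end Shapes;

       package body Shapes is
          function Make return Shape is
          begin
             Put_Line ("Shape's Make");
             return (null record);
          end Make;

          function Make return Circle is
          begin
             Put_Line ("Circle's Make");
             return (Shape'(null record) with null record);
          end Make;
       end Shapes;

       use Shapes;

       X : Shape'Class := Circle'(Make);  -- statically bound: Circle's Make
    begin
       X := Make;  -- tag indeterminate: dispatches on X's tag, Circle's Make
    end Demo;

The second call has no controlling operand at all, so it is the tag of
the target X that selects which Make runs.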

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 11:08:06 2021
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons
    between
different types.  (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)

    When operations can be overloaded in the result type that simplifies a
    lot. Literals are semantically overloaded parameterless functions. 1
    Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
    etc.


    I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

       declare
          X : T;
          Y : S;
       begin
          Foo (X);
          Foo (Y);

    is OK, but

       declare
          X : T := Create;
          Y : S := Create;
       begin

    is not?

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    That is merely a consequence of most C implementations being capped at a
    32-bit type. The language allows a wider int.

    I can write this:

    u64 a:=0
    u128 b:=18446744073709551615

    println a + 1
    println b + 1

Output is:

    1
    18446744073709551616

    Promotion rules will widen the literal when used in a binary op.

    However in other cases it won't; here it displays zero:

    println 18446744073709551615 + 1

    In this case, I need to use casts, eg:

    println 18446744073709551615 + u128(1)

    In Ada however it doesn't compile.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Sun Sep 5 11:52:11 2021
    On 05/09/2021 11:08, Bart wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    In this case, I need to use casts, eg:

        println 18446744073709551615 + u128(1)

    In Ada however it doesn't compile.

    Ada can't display that literal anyway. I can try something like this,
    but only for i64.max not u64.max:

    Put_Line (Long_Long_Integer'Image (9223372036854775807));

    This of course is not ugly at all.

    Isn't it wonderful to just do 'print x', whether x is a named constant, variable or literal, and not worry about what type it happens to be?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 12:43:50 2021
    On 2021-09-05 12:08, Bart wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons
    between
    different types.  (I don't know how it handles literals - that's
    beyond
    my rather limited knowledge of the language.)

When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
    Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1
    My_Custom_Integer etc.


    I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

        declare
           X : T;
           Y : S;
        begin
           Foo (X);
           Foo (Y);

    is OK, but

        declare
           X : T := Create;
           Y : S := Create;
        begin

    is not?

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    That is merely a consequence of most C implementations being capped at a 32-bit type. The language allows a wider int.

    I can write this:

        u64 a:=0
        u128 b:=18446744073709551615

        println a + 1
        println b + 1

Output is:

        1
        18446744073709551616

    Promotion rules will widen the literal when used in a binary op.

    However in other cases it won't; here it displays zero:

        println 18446744073709551615 + 1

    In this case, I need to use casts, eg:

        println 18446744073709551615 + u128(1)

    In Ada however it doesn't compile.

    Of course it does:

    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
    begin
   Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
    end Test;

    Prints:

    18446744073709551616

    No casts needed.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 12:00:38 2021
    On 05/09/2021 11:43, Dmitry A. Kazakov wrote:
    On 2021-09-05 12:08, Bart wrote:

    In Ada however it doesn't compile.

    Of course it does:

    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
    begin
       Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
    end Test;

    Prints:

     18446744073709551616

    No casts needed.

    OK, my last post just crossed yours. I'd figured this out. Except my
    Gnat implementation doesn't support Long_Long_Long (neither does rextester.com).

    But even if it did, really? Apart from being uglier than 1ULL, what sort
    of type is that? It might as well be Very_Very_Long_Integer!

    I no longer use relative type names, but the last ones I used were
    'dint' and 'dword', with the 'd' standing for 'double', in this case
    exactly double the width of whatever int and word were.

    BTW it looks to me like you're having to use an i128 type in order to
    represent a value that fits into u64.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 13:11:06 2021
    On 2021-09-05 13:00, Bart wrote:
    On 05/09/2021 11:43, Dmitry A. Kazakov wrote:
    On 2021-09-05 12:08, Bart wrote:

    In Ada however it doesn't compile.

    Of course it does:

    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
    begin
    Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
end Test;

    Prints:

      18446744073709551616

    No casts needed.

    OK, my last post just crossed yours. I'd figured this out. Except my
    Gnat implementation doesn't support Long_Long_Long (neither does rextester.com).

    Download GNAT Community Edition 2021

    https://www.adacore.com/download

    But even if it did, really? Apart from being uglier than 1ULL, what sort
    of type is that?

    Universal_Integer

    BTW it looks to me like you're having to use an i128 type in order to represent a value that fits into u64.

    No.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 13:03:48 2021
    On 2021-09-05 12:52, Bart wrote:
    On 05/09/2021 11:08, Bart wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    In this case, I need to use casts, eg:

         println 18446744073709551615 + u128(1)

    In Ada however it doesn't compile.

    Ada can't display that literal anyway. I can try something like this,
    but only for i64.max not u64.max:

        Put_Line (Long_Long_Integer'Image (9223372036854775807));

    This of course is not ugly at all.

    Isn't it wonderful to just do 'print x', whether x is a named constant, variable or literal, and not worry about what type it happens to be?

    with Ada.Long_Long_Long_Integer_Text_IO;
    use Ada.Long_Long_Long_Integer_Text_IO;

    procedure Test is
   X : Long_Long_Long_Integer := 123;
   Y : constant := 456;
begin
   Put (18446744073709551615 + 1); -- Expression
   Put (18446744073709551616);     -- Literal
   Put (X);                        -- Variable
   Put (Y);                        -- Named constant
    end Test;

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 12:16:51 2021
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in order to
    represent a value that fits into u64.

    No.


    So what type is Long_Long_Integer, and how does it differ from Long_Integer?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 14:18:58 2021
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in order to
    represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

    type Long_Integer is range -2**31..2**31-1;
    type Long_Long_Integer is range -2**63..2**63-1;
    type Long_Long_Long_Integer is range -2**128..2**128-1;

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 14:19:56 2021
    On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in order
    to represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

       type Long_Integer is range -2**31..2**31-1;
       type Long_Long_Integer is range -2**63..2**63-1;
       type Long_Long_Long_Integer is range -2**128..2**128-1;

    Correction:

    type Long_Long_Long_Integer is range -2**127..2**127-1;

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 14:59:02 2021
    On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
    On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in order
    to represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

        type Long_Integer is range -2**31..2**31-1;
        type Long_Long_Integer is range -2**63..2**63-1;
        type Long_Long_Long_Integer is range -2**128..2**128-1;

    Correction:

         type Long_Long_Long_Integer is range -2**127..2**127-1;


    So, I was right, it's using a 128-bit type.


    But that 2**127-1 is interesting, assuming this is actual code and not a representation of a built-in type.

    For that to yield 170141183460469231731687303715884105727, the language
    needs to allow open, unencumbered literals, types and expressions (a bit
    like mine!).

    Presumably that is allowed as arguments to 'range', but elsewhere it's
    harder work.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 17:10:01 2021
    On 2021-09-05 15:59, Bart wrote:
    On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
    On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in order >>>>>> to represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

        type Long_Integer is range -2**31..2**31-1;
        type Long_Long_Integer is range -2**63..2**63-1;
        type Long_Long_Long_Integer is range -2**128..2**128-1;

    Correction:

          type Long_Long_Long_Integer is range -2**127..2**127-1;


    So, I was right, it's using a 128-bit type.

    But that 2**127-1 is interesting, assuming this is actual code and not a representation of a built-in type.

    There is no such distinction in Ada. I could write

    type My_Integer is range -2**127..2**127-1;

    with the same effect. This is the power of a properly designed language.
    Here is the complete program:

    with Ada.Text_IO;
    procedure Test is
   type My_Integer is range -2**127..2**127-1;
   package IO is new Ada.Text_IO.Integer_IO (My_Integer);
   use IO;
begin
   Put (18446744073709551615 + 1);
    end Test;

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 18:13:58 2021
    On 2021-09-05 17:37, Bart wrote:
    On 05/09/2021 16:10, Dmitry A. Kazakov wrote:
    On 2021-09-05 15:59, Bart wrote:
    On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
    On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in
    order to represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

        type Long_Integer is range -2**31..2**31-1;
        type Long_Long_Integer is range -2**63..2**63-1;
        type Long_Long_Long_Integer is range -2**128..2**128-1;

    Correction:

          type Long_Long_Long_Integer is range -2**127..2**127-1;


    So, I was right, it's using a 128-bit type.

    But that 2**127-1 is interesting, assuming this is actual code and
    not a representation of a built-in type.

    There is no such distinction in Ada. I could write

        type My_Integer is range -2**127..2**127-1;

    with the same effect. This is the power of a properly designed
    language. Here is the complete program:

    with Ada.Text_IO;
    procedure Test is
        type My_Integer is range -2**127..2**127-1;
        package IO is new Ada.Text_IO.Integer_IO (My_Integer);
        use IO;
    begin
        Put (18446744073709551615 + 1);
    end Test;

    My point was,

    Point or question?

    could you write 2**127 in any other context than a range
    specifier,

    Silly question:

    X : constant := 2**127;

    without having to pedantically define everything about the
    types of the literals involved as well as result types?

Anyway, in this case it is not a range specifier, it is a definition of an
integer type.

    All numeric types, all string types come with literals, naturally.

   type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D');
   type Roman_Number is array (Positive range <>) of Roman_Digit;
   Nine : Roman_Number := "IX";

That is why there are no built-in types in Ada. There are only some
types predefined for convenience.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 16:37:07 2021
    On 05/09/2021 16:10, Dmitry A. Kazakov wrote:
    On 2021-09-05 15:59, Bart wrote:
    On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
    On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

    BTW it looks to me like you're having to use an i128 type in
    order to represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

        type Long_Integer is range -2**31..2**31-1;
        type Long_Long_Integer is range -2**63..2**63-1;
        type Long_Long_Long_Integer is range -2**128..2**128-1;

    Correction:

          type Long_Long_Long_Integer is range -2**127..2**127-1;


    So, I was right, it's using a 128-bit type.

    But that 2**127-1 is interesting, assuming this is actual code and not
    a representation of a built-in type.

    There is no such distinction in Ada. I could write

       type My_Integer is range -2**127..2**127-1;

    with the same effect. This is the power of a properly designed language.
    Here is the complete program:

    with Ada.Text_IO;
    procedure Test is
       type My_Integer is range -2**127..2**127-1;
       package IO is new Ada.Text_IO.Integer_IO (My_Integer);
       use IO;
    begin
       Put (18446744073709551615 + 1);
    end Test;


    My point was, could you write 2**127 in any other context than a range specifier, without having to pedantically define everything about the
    types of the literals involved as well as result types?

    Or is 'range' special? Although my implementation still limits integers
    to i64.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 17:45:26 2021
    On 05/09/2021 17:13, Dmitry A. Kazakov wrote:
    On 2021-09-05 17:37, Bart wrote:
    On 05/09/2021 16:10, Dmitry A. Kazakov wrote:
    On 2021-09-05 15:59, Bart wrote:
    On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
    On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
    On 2021-09-05 13:16, Bart wrote:
    On 05/09/2021 12:11, Dmitry A. Kazakov wrote:

BTW it looks to me like you're having to use an i128 type in
order to represent a value that fits into u64.

    No.

    So what type is Long_Long_Integer, and how does it differ from
    Long_Integer?

    Defined in GNAT Ada package Standard:

        type Long_Integer is range -2**31..2**31-1;
        type Long_Long_Integer is range -2**63..2**63-1;
        type Long_Long_Long_Integer is range -2**128..2**128-1;

    Correction:

          type Long_Long_Long_Integer is range -2**127..2**127-1;


    So, I was right, it's using a 128-bit type.

    But that 2**127-1 is interesting, assuming this is actual code and
    not a representation of a built-in type.

    There is no such distinction in Ada. I could write

        type My_Integer is range -2**127..2**127-1;

    with the same effect. This is the power of a properly designed
    language. Here is the complete program:

    with Ada.Text_IO;
    procedure Test is
        type My_Integer is range -2**127..2**127-1;
        package IO is new Ada.Text_IO.Integer_IO (My_Integer);
        use IO;
    begin
        Put (18446744073709551615 + 1);
    end Test;

    My point was,

    Point or question?

    could you write 2**127 in any other context than a range specifier,

    Silly question:

       X : constant := 2**127;

    Well, the rules keep changing! Here you managed to define X with an i128
    value without needing to also define a long_long_long_integer type. So
    what's the type of X?

    without having to pedantically define everything about the types of
    the literals involved as well as result types?

Anyway, in this case it is not a range specifier, it is a definition of an
integer type.

    All numeric types, all string types come with literals, naturally.

       type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D');
       type Roman_Number is array (Positive range <>) of Roman_Digit;
       Nine : Roman_Number := "IX";

That is why there are no built-in types in Ada. There are only some
types predefined for convenience.

Presumably you need to allow built-in literals such as "2", "127", and
operations such as "**". Which suggests they are not just for
convenience, but a necessity.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 18:44:52 2021
    On 05/09/2021 11:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons
    between
different types.  (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)

    When operations can be overloaded in the result type that simplifies a
    lot. Literals are semantically overloaded parameterless functions. 1
    Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
    etc.


    I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

       declare
          X : T;
          Y : S;
       begin
          Foo (X);
          Foo (Y);

    is OK, but

       declare
          X : T := Create;
          Y : S := Create;
       begin

    is not?

    (Forgive me if I've misunderstood the Ada syntax here.)

    This is creating new instances of a type - that's not the same as
    overloading on the return type as a general feature. Certainly if you
    first have overloading on return type as a feature of a language, you
    can use that for initialisation or object creation. But it is not
    necessary.


    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?


    Oh, yes - I fully agree these literals are ugly, and don't use them
    unless they are absolutely necessary. I haven't seen any need, other
    than occasionally "1u". If I have specific need of making a literal
    into a given type, I'll either assign it to a const (or constexpr) of
    that type, or use an explicit cast.

    Same with string and character literals. You just write:

       S1 : String           := "abc"; -- Latin-1
       S2 : Wide_String      := "abc"; -- UCS-2
       S3 : Wide_Wide_String := "abc"; -- UCS-4
    begin
   if S2(3) = 'c' then -- Here 'c' is resolved to 16-bit character
      ...


    In C++, string literals with different character types have different
    types, and are written differently. So you can write:

    auto s0 = "abc"; // native char (latin, utf-8, whatever)
    auto s1 = u8"abc"; // utf-8
    auto s2 = u"abc"; // utf-16
    auto s3 = U"abc"; // utf-32


    The fact that literals have a well-defined and fixed type also means you
    can use them with overloads:

    write_string(u8"abc"); // calls write_string(const char8_t*)
    write_string(u"abc"); // calls write_string(const char16_t*)
    write_string(U"abc"); // calls write_string(const char32_t*)


    I must admit I haven't tried result type overloading (I've only played
    very briefly with Ada).  But I'm sceptical.

It is an arbitrary limitation made by lazy compiler writers (we know some
(:-)). Without result overloading they can resolve all types strictly bottom-up.

    C++ has never been a choice for lazy compiler writers!

    I am not a compiler writer - I am a compiler user, and you haven't
    convinced me there are advantages to having result overloading.

    C++ does give you result type overloading, if you want it, via classes
    and conversion operators :

    class A {
     int x_;
public :
     A(int x) : x_ (x) {}
     operator int8_t() { return x_ + 100; }
     operator int16_t() { return x_ + 200; }
     operator int32_t() { return x_ + 400; }
    };

    int8_t b1 = A(5);
    int16_t b2 = A(5);
    int32_t b3 = A(5);

    Now b1 is 105, b2 is 205, and b3 is 405.

    As far as I can tell, this is a bit more awkward and requires more
    explicit coding than in Ada - that seems fine to me.


    BTW, there is a similar case with overriding. In C++ only the first
    [hidden] argument supports overriding. In Ada terms it is a controlled argument. In Ada any argument and/or the result can be controlled. So
    you could dispatch on the result.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 18:56:39 2021
    On 05/09/2021 11:43, Dmitry A. Kazakov wrote:
    On 2021-09-05 12:08, Bart wrote:
    However in other cases it won't; here it displays zero:
         println 18446744073709551615 + 1
    In this case, I need to use casts, eg:
         println 18446744073709551615 + u128(1)

    [Presumably you need to compile/interpret and run this code?]

    [...]
    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
    begin
       Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
    end Test;
    Prints:
     18446744073709551616
    No casts needed.

    ~$ a68g -p "LONG 18446744073709551615 + 1"
    +18446744073709551616

    No separate compilation/execution needed. You guys seem to like
    writing excessive code. There is a reason why some languages are
    more productive than others!

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Goodban

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sun Sep 5 20:02:25 2021
    On 2021-09-05 18:44, David Brown wrote:
    On 05/09/2021 11:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons
    between
different types.  (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)

When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
    Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
    etc.


    I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

       declare
          X : T;
          Y : S;
       begin
          Foo (X);
          Foo (Y);

    is OK, but

       declare
          X : T := Create;
          Y : S := Create;
       begin

    is not?

    (Forgive me if I've misunderstood the Ada syntax here.)

    This is creating new instances of a type - that's not the same as
    overloading on the return type as a general feature. Certainly if you
    first have overloading on return type as a feature of a language, you
    can use that for initialisation or object creation. But it is not
    necessary.

Overloading is not necessary, only convenient. The argument is that
there is no logical reason to allow it in arguments but not in results.

    C++ does give you result type overloading, if you want it, via classes
    and conversion operators :

    class A {
     int x_;
public :
     A(int x) : x_ (x) {}
     operator int8_t() { return x_ + 100; }
     operator int16_t() { return x_ + 200; }
     operator int32_t() { return x_ + 400; }
    };

    int8_t b1 = A(5);
    int16_t b2 = A(5);
    int32_t b3 = A(5);

    This is a slightly different thing.

    But I always admired C++ mechanism of creating ad-hoc subtypes like
    above, in effect:

    A
    |
    int8_t

    If extended it could become a very powerful thing, e.g. to create a
    common sub/supertype for two unrelated types via implicit type conversions.

    A B
    | ^
    V |
    Ad-hoc parent/child

    And then per user-provided implicit conversions:

    A -> ad-hoc parent -> B

    One could call f(B) with A.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 19:22:52 2021
    On 2021-09-05 18:45, Bart wrote:
    On 05/09/2021 17:13, Dmitry A. Kazakov wrote:
    On 2021-09-05 17:37, Bart wrote:

    could you write 2**127 in any other context than a range specifier,

    Silly question:

        X : constant := 2**127;

    Well, the rules keep changing!

    You just do not understand them because you cannot think out of the box
    of your language. There are many ways to skin the cat.

    Here you managed to define X with an i128
    value without needing to also define a long_long_long_integer type. So
    what's the type of X?

    Universal_Integer

    without having to pedantically define everything about the types of
    the literals involved as well as result types?

Anyway, in this case it is not a range specifier, it is a definition of an
integer type.

    All numeric types, all string types come with literals, naturally.

        type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D');
        type Roman_Number is array (Positive range <>) of Roman_Digit;
        Nine : Roman_Number := "IX";

That is why there are no built-in types in Ada. There are only some
types predefined for convenience.

Presumably you need to allow built-in literals such as "2", "127", and
operations such as "**". Which suggests they are not just for
convenience, but a necessity.

Literals, operators, tests, aggregates are meta constructs. Types are
    not. Ada types are produced by meta operations of types algebra:

    type ... is range ...
    type ... is mod ...
    type ... is digits ...
    type ... is delta ...
    type ... is record ...
    type ... is array ...

    etc.
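
Each of those meta operations yields a brand-new, incompatible type. A
small sketch (all names invented) with one type of each kind:

    procedure Algebra is
       type Celsius is range -273 .. 5_000;            -- signed integer type
       type Byte    is mod 2**8;                       -- modular, wraps at 256
       type Real    is digits 12;                      -- floating point
       type Volt    is delta 0.125 range 0.0 .. 255.0; -- fixed point
       type Point   is record
          X, Y : Celsius;
       end record;
       type Track   is array (1 .. 10) of Point;
       P : Point := (X => 0, Y => 100);
       T : Track := (others => P);
    begin
       null;  -- Celsius and Integer are distinct types: no implicit mixing
    end Algebra;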

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Sun Sep 5 21:03:46 2021
    On 05/09/2021 18:56, Andy Walker wrote:
    On 05/09/2021 11:43, Dmitry A. Kazakov wrote:
    On 2021-09-05 12:08, Bart wrote:
    However in other cases it won't; here it displays zero:
         println 18446744073709551615 + 1
    In this case, I need to use casts, eg:
         println 18446744073709551615 + u128(1)

        [Presumably you need to compile/interpret and run this code?]

    That's usually how it's done.

    [...]
    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
    begin
        Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1)); >> end Test;
    Prints:
      18446744073709551616
    No casts needed.

      ~$ a68g -p "LONG 18446744073709551615 + 1"
                     +18446744073709551616

    No separate compilation/execution needed.  You guys seem to like
    writing excessive code.  There is a reason why some languages are
    more productive than others!


    You had to use a special option "-p". If I spend 10 minutes adding a
    similar option, then I can do the same:

C:\qx>qq -p:"print 2L**512-1"
13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084095

    Furthermore, I can use double quotes which A68G had some trouble with:

    C:\qx>qq -p:"print ""hello"""
    hello

    This is actually not unusual; gcc can take input from the console ("-"
    means input is from stdin, but needs -xc to specify language):

    C:\qx>gcc -xc -
    #include <stdio.h>
    int main(void) {puts("Hi there");}
    ^Z

    C:\qx>a
    Hi there

tcc can do that, and run the program as well.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 20:36:33 2021
    On 05/09/2021 18:22, Dmitry A. Kazakov wrote:
    On 2021-09-05 18:45, Bart wrote:
    On 05/09/2021 17:13, Dmitry A. Kazakov wrote:
    On 2021-09-05 17:37, Bart wrote:

    could you write 2**127 in any other context than a range specifier,

    Silly question:

        X : constant := 2**127;

    Well, the rules keep changing!

    You just do not understand them because you cannot think out of the box
    of your language. There are many ways to skin the cat.

    I got the impression there was only one way, and that cat was called Ada.

    Here you managed to define X with an i128 value without needing to
    also define a long_long_long_integer type. So what's the type of X?

    Universal_Integer

    I tried to create some variables of type 'universal_integer', but it
    said that was undefined.

    So perhaps lurking within Ada's highly restrictive type system, there is
    an open, unrestricted one trying to get out.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 22:10:12 2021
    On 2021-09-05 21:36, Bart wrote:
    On 05/09/2021 18:22, Dmitry A. Kazakov wrote:

    Here you managed to define X with an i128 value without needing to
    also define a long_long_long_integer type. So what's the type of X?

    Universal_Integer

    I tried to create some variables of type 'universal_integer', but it
    said that was undefined.

No object can have a universal type, and a named constant is not an object.
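
So the same named constant can initialise objects of quite different
types, as long as the value fits. A tiny sketch (names invented):

    procedure Universal_Demo is
       Big : constant := 2**40;          -- universal_integer, not an object
       A   : Long_Long_Integer := Big;   -- legal: 2**40 fits
       type Small is range 0 .. 100;
       --  B : Small := Big;             -- illegal: out of range, rejected
    begin
       null;
    end Universal_Demo;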

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Sun Sep 5 21:19:10 2021
    On 05/09/2021 21:10, Dmitry A. Kazakov wrote:
    On 2021-09-05 21:36, Bart wrote:
    On 05/09/2021 18:22, Dmitry A. Kazakov wrote:

    Here you managed to define X with an i128 value without needing to
    also define a long_long_long_integer type. So what's the type of X?

    Universal_Integer

    I tried to create some variables of type 'universal_integer', but it
    said that was undefined.

    No object can have a universal type. Named constant is no object.


    So how can I print that X defined as 2**127?

Remember when I said that I can just do 'print x'? Here:

    const x = 2**63-1
    print x

    Displays 9223372036854775807. (I don't support const reduction for i128
    types in the compiler. But then my Ada compiler doesn't support i128 at
    all.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Sun Sep 5 23:16:27 2021
    On 2021-09-05 22:19, Bart wrote:
    On 05/09/2021 21:10, Dmitry A. Kazakov wrote:
    On 2021-09-05 21:36, Bart wrote:
    On 05/09/2021 18:22, Dmitry A. Kazakov wrote:

    Here you managed to define X with an i128 value without needing to
    also define a long_long_long_integer type. So what's the type of X?

    Universal_Integer

    I tried to create some variables of type 'universal_integer', but it
    said that was undefined.

    No object can have a universal type. Named constant is no object.

    So how can I print that X defined as 2**127?

You need any numeric type that has X as a value, so that you could have
a proper object. E.g. here I create such a type and then use its Image
attribute:

    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
   X : constant := 2**127;
   type Large is mod 2**128;
begin
   Put_Line (Large'Image (X));
    end Test;

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Mon Sep 6 00:17:33 2021
    On 05/09/2021 21:03, Bart wrote:
    [I wrote:]
         [Presumably you need to compile/interpret and run this code?]
    That's usually how it's done.

    I suppose some groups still see a need for smileys?

    [...]
    You had to use a special option "-p".

    Not "special"; a documented feature of A68G.

    If I spend 10 minutes adding a
    similar option, then I can do the same:

    Well, yes; in a few minutes, thee or me could knock up a shell
    script for any sensible language as a wrapper to compile and execute a
    program in that language. More to the point is that your language and
    A68G can have programs of one line where Ada seems to need several plus
    an intimate knowledge of the language to do quite trivial tasks. I
take Dmitry's [and the Ada community's] point that Ada programs are
    hard to persuade to compile, but once you've done that, your program
    darn well /will/ work; but Ada isn't the only language with that
    property!

C:\qx>qq -p:"print 2L**512-1"
13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084095

    ~$ a68g -precision 150 -p "LONG LONG 2 ^ 512 - 1"
+13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084095

    [The default precision is insufficient, so this is slightly harder.]

    Furthermore, I can use double quotes which A68G had some trouble with:

    ???

     C:\qx>qq -p:"print ""hello"""
     hello

    ~$ a68g -p '"hello"'
    hello

    [or

    ~$ a68g -p '"""hello"""'
    "hello"

    if you really want more quotes] -- it isn't A68G having trouble with
    them, but the shell interpreting them before they get to A68G.

    This is actually not unusual; gcc can take input from the console [...].

    Yeah, sure. I run an A68G script every night from "cron" which
    takes as parameter a 20-odd line program that tidies up my music pages
    and generates the "composer of the day", as below. I actually write
    quite few "real" programs these days; I can do most things with shell
    scripts, but COTD needs to do some calculations. I could have written
    a C script of the sort you describe, but A68G was /much/ easier -- and
    shorter. [Other languages are available.]

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Goodban

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Sep 6 10:10:10 2021
    On 05/09/2021 12:08, Bart wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons
    between
    different types.  (I don't know how it handles literals - that's
    beyond
    my rather limited knowledge of the language.)

When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
    Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1
    My_Custom_Integer etc.


    I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

        declare
           X : T;
           Y : S;
        begin
           Foo (X);
           Foo (Y);

    is OK, but

        declare
           X : T := Create;
           Y : S := Create;
        begin

    is not?

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    That is merely a consequence of most C implementations being capped at a 32-bit type.

    No, it is not.

    You rarely have any need to use the suffixes on literals. It typically
    only matters if you are using the literal in an expression where its
    natural type would overflow. So you might write 2 ^ 50 as "1ull << 50".
    But you can happily write "uint64_t x = 12345678901234567890;"

    (Few C implementations have 128-bit fundamental types, and thus no
    128-bit literals, but those are not something you'd need in real code
    anyway.)

    I don't think C's literal handling is perfect, but it works fine in
practice. Its base handling is not as flexible as Ada's, but on the
    other hand it is not nearly as clumsy and ugly. (But octal constants in
    C are not clumsy and ugly enough.)
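
For comparison, Ada's based literals allow any base from 2 to 16, with
underscores and exponents (a small sketch; the names are invented):

    procedure Based_Literals is
       Mask : constant := 2#1111_0000#;   -- binary: 240
       Perm : constant := 8#755#;         -- octal: 493
       Addr : constant := 16#DEAD_BEEF#;  -- hexadecimal: 3_735_928_559
       Kilo : constant := 10#1#E3;        -- exponent works in any base: 1000
    begin
       null;
    end Based_Literals;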

    The language allows a wider int.

    I can write this:

        u64 a:=0
        u128 b:=18446744073709551615

        println a + 1
        println b + 1

Output is:

        1
        18446744073709551616

    Promotion rules will widen the literal when used in a binary op.

    However in other cases it won't; here it displays zero:

        println 18446744073709551615 + 1


    You complained earlier about how terrible it was for "a + 1" to give 0
    when "a" is an 8-bit unsigned modulo type containing 255. Yet here your language does exactly the same thing, just with 64-bit types. You are
    doing the same thing, just drawing your arbitrary lines in different
    places - unlike Ada which is consistent and clear. (And yes, I know C
    has an equally arbitrary line and that its size is
    implementation-dependent.)


    In this case, I need to use casts, eg:

        println 18446744073709551615 + u128(1)

    In Ada however it doesn't compile.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Mon Sep 6 10:20:10 2021
    On 05/09/2021 20:02, Dmitry A. Kazakov wrote:
    On 2021-09-05 18:44, David Brown wrote:
    On 05/09/2021 11:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons
    between
different types.  (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
    beyond
    my rather limited knowledge of the language.)

    When operations can be overloaded in the result type that simplifies a >>>>> lot. Literals are semantically overloaded parameterless functions. 1 >>>>> Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer >>>>> etc.


    I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

        declare
           X : T;
           Y : S;
        begin
           Foo (X);
           Foo (Y);

    is OK, but

        declare
           X : T := Create;
           Y : S := Create;
        begin

    is not?

    (Forgive me if I've misunderstood the Ada syntax here.)

    This is creating new instances of a type - that's not the same as
    overloading on the return type as a general feature.  Certainly if you
    first have overloading on return type as a feature of a language, you
    can use that for initialisation or object creation.  But it is not
    necessary.

    Overloading is not necessary, only convenient. The argument is that
there is no logical reason to allow it in arguments but not in results.

    C++ does give you result type overloading, if you want it, via classes
    and conversion operators :

    class A {
         int x_;
    public :
         A(int x) : x_ (x) {}
         operator int8_t() { return x_ + 100; }
         operator int16_t() { return x_ + 200; }
         operator int32_t() { return x_ + 400; }
    };

    int8_t b1 = A(5);
    int16_t b2 = A(5);
    int32_t b3 = A(5);

    This is a slightly different thing.

    But I always admired C++ mechanism of creating ad-hoc subtypes like
    above, in effect:

        A
        |
      int8_t

    If extended it could become a very powerful thing, e.g. to create a
    common sub/supertype for two unrelated types via implicit type conversions.

            A     B
            |     ^
            V     |
      Ad-hoc parent/child

    And then per user-provided implicit conversions:

       A -> ad-hoc parent -> B

    One could call f(B) with A.


    I think there is plenty of scope for Ada and C++ to learn from each
other, and copy a few ideas and tricks both ways. There are always things
    that can be done better, or easier, or safer in one language than the other.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Sep 6 11:03:13 2021
    On 06/09/2021 09:10, David Brown wrote:
    On 05/09/2021 12:08, Bart wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

    That is merely a consequence of most C implementations being capped at a
    32-bit type.

    No, it is not.

    You rarely have any need to use the suffixes on literals. It typically
    only matters if you are using the literal in an expression where its
    natural type would overflow.

    You need to use it when literals in the range of approx 0..2**32 are
    used within expressions with 64-bit results. Such as common ones like 1
    or 2.

    That's because int is capped at 32 bits on every C I've ever used. A
    64-bit int would mean literals of 0..2**31-1 or 0..2**32-1 having i64 or
    u64 types.


    However in other cases it won't; here it displays zero:

        println 18446744073709551615 + 1


    You complained earlier about how terrible it was for "a + 1" to give 0
    when "a" is an 8-bit unsigned modulo type containing 255. Yet here your language does exactly the same thing, just with 64-bit types.

    Yes, because it hits the upper limit of the machine's integer type;
    there is no way of representing the next value (without the incredibly
    wasteful technique of making all 64-bit operations have 128-bit results,
    and then you'd complain that u128.max + 1 gives 0 too.).

    There is no artificial limit like 255 or 65535. If you're calculating a
hash expression from an 8-bit character value, you don't want a
    modulo-256 result just because characters are stored as only 8 bits.

    Didn't you say elsewhere that values requiring between 33 and 64 bits to represent were incredibly rare? You don't want ordinary expressions to
    overflow when those results are a long way from 2**64.


    You are
    doing the same thing, just drawing your arbitrary lines in different
    places - unlike Ada which is consistent and clear.

    One thing I've learned is that Ada is anything but that! In some
    contexts, the ordinary rules go out the window: literals can have any magnitude, results can have any magnitude (apart from being under 2**64
    on my machine, and 2**128 on DAKs), ALL literals and intermediate and
    final results have the same universal integer type.

    But you can't use such a type and such freedoms in any useful contexts.

    (And yes, I know C
    has an equally arbitrary line and that its size is
    implementation-dependent.)

C's arbitrary line hasn't adapted during the 32 to 64-bit migration as
it did with 16 to 32 bits.

    Which among other things, means limiting multi-character constants to
    'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
    up to 16 is 128).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Mon Sep 6 12:32:08 2021
    On 2021-09-06 12:03, Bart wrote:
    On 06/09/2021 09:10, David Brown wrote:

    You complained earlier about how terrible it was for "a + 1" to give 0
    when "a" is an 8-bit unsigned modulo type containing 255.  Yet here your
    language does exactly the same thing, just with 64-bit types.

    Yes, because it hits the upper limit of the machine's integer type;
    there is no way of representing the next value (without the incredibly wasteful technique of making all 64-bit operations have 128-bit results,
    and then you'd complain that u128.max + 1 gives 0 too.).

    There is nothing terrible about that. This is legal Ada

    with Ada.Text_IO; use Ada.Text_IO;
    procedure Test is
    begin
   Put_Line (Integer'Image (2**200 / 2**198));
    end Test;

    It prints 4.

    Once you understand types, you will see advantages of not having
    everything one size.

    ALL literals and intermediate and
    final results have the same universal integer type.

    You've got everything wrong.

    1. Constant expressions have a universal type.

    2. Literals have a specific type. You see that if you try to use a
    literal out of range.

    3. Intermediates have the base type (= the machine type used to model
    the user type)

The model is just the same. You have some implementation type to deal
with expressions, in order to work around range constraints that are,
from the mathematical point of view, arbitrary restrictions on the
type's values.

    The range constraints are checked *only* when the intermediate attempts
    to become a legal type value.

    For named constants, the "machine" type is the universal type, which
    lifts all constraints if you have enough memory available for the compiler.

    For an object, the machine type is the type the compiler thinks is best.
This is why the rule says: either correct result or else error. Error
    may happen when the calculation with the selected model type fails. It
    could be a false negative on a 32-bit machine, but fine on a 64-bit
machine. The Ada standard allows false positives in order to provide room
    for optimizations, e.g. selecting a shorter machine type.

    But you can't use such a type and such freedoms in any useful contexts.

    Why, you of course can. T'Base is a proper type. Named constants do not
    require type specification.

    You cannot have objects of universal type because of the rules listed above.
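
For instance, a sketch of using T'Base for the intermediates (the Score
and Average names are invented):

    procedure Base_Demo is
       type Score is range 0 .. 100;
       function Average (A, B : Score) return Score is
          Sum : Score'Base := A + B;  -- intermediate, no range check yet
       begin
          return Sum / 2;             -- checked only here, on becoming a Score
       end Average;
       R : Score;
    begin
       R := Average (60, 62);  -- Sum = 122 exceeds Score's range, but that
    end Base_Demo;             -- is fine in Score'Base; the result 61 is legal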

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Sep 6 13:14:41 2021
    On 06/09/2021 12:03, Bart wrote:
    On 06/09/2021 09:10, David Brown wrote:
    On 05/09/2021 12:08, Bart wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:

    Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?

That is merely a consequence of most C implementations being capped at a
32-bit type.

    No, it is not.

    You rarely have any need to use the suffixes on literals.  It typically
    only matters if you are using the literal in an expression where its
    natural type would overflow.

    You need to use it when literals in the range of approx 0..2**32 are
    used within expressions with 64-bit results. Such as common ones like 1
    or 2.


    Yes, exactly - that is just what I said, with the numbers filled in.

    That's because int is capped at 32 bits on every C I've ever used. A
    64-bit int would mean literals of 0..2**31-1 or 0..2**32-1 having i64 or
    u64 types.


    However in other cases it won't; here it displays zero:

         println 18446744073709551615 + 1


    You complained earlier about how terrible it was for "a + 1" to give 0
    when "a" is an 8-bit unsigned modulo type containing 255.  Yet here your
    language does exactly the same thing, just with 64-bit types.

    Yes, because it hits the upper limit of the machine's integer type;
    there is no way of representing the next value (without the incredibly wasteful technique of making all 64-bit operations have 128-bit results,
    and then you'd complain that u128.max + 1 gives 0 too.).

    There is no artificial limit like 255 or 65535. If you're calculating a
    hash expresson from an 8-bit character value, you don't want a
    modulo-256 result just because characters are stored as only 8 bits.

    If you are calculating with character values, you are doing things wrong
    to start with. If you are calculating with modulo types such as
    uint8_t, then yes, you /do/ want the result to be modulo 2⁸ because that
    is precisely how modulo types work. These limits are not "artificial",
    they are precisely the limits your code asks for when you use these types.


    Didn't you say elsewhere that values requiring between 33 and 64 bits to represent were incredibly rare? You don't want ordinary expressions to overflow when those results are a long way from 2**64.


    Yes, I did. I don't disagree with the convenience of 64-bit numbers on
    the kinds of systems you target. I am merely pointing out the hypocrisy
    of your disagreeing about Ada wrapping on clearly defined and explicit
    limits while your language wraps on an implicit and arbitrary limit.



     You are
    doing the same thing, just drawing your arbitrary lines in different
    places - unlike Ada which is consistent and clear.

    One thing I've learned is that Ada is anything but that! In some
    contexts, the ordinary rules go out the window: literals can have any magnitude, results can have any magnitude (apart from being under 2**64
    on my machine, and 2**128 on DAKs), ALL literals and intermediate and
    final results have the same universal integer type.


    You need to understand the rules of the language - and it is not a small language. (And I have learned a lot more about them from Dmitry in this group.) But the rules are followed consistently.

    But you can't use such a type and such freedoms in any useful contexts.

    (And yes, I know C
    has an equally arbitrary line and that its size is
    implementation-dependent.)

C's arbitrary line hasn't adapted during the 32 to 64-bit migration as
it did with 16 to 32 bits.


    The choice of size to use for "int" has its pros and cons. 64-bit
    reduces the risk of overflow from "very rarely a risk" to "extremely
    rarely a risk". But the cost is more precious L0 cache space, and
    slower code for some operations (especially on earlier 64-bit systems).

    Which among other things, means limiting multi-character constants to
    'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
    up to 16 is 128).


    I have never seen - or even heard of - a sensible use of multi-character constants.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Sep 6 15:26:41 2021
    On 06/09/2021 12:14, David Brown wrote:
    On 06/09/2021 12:03, Bart wrote:

C's arbitrary line hasn't adapted during the 32 to 64-bit migration as
it did with 16 to 32 bits.


    The choice of size to use for "int" has its pros and cons. 64-bit
    reduces the risk of overflow from "very rarely a risk" to "extremely
    rarely a risk". But the cost is more precious L0 cache space, and
    slower code for some operations (especially on earlier 64-bit systems).

    You can apply the same argument to 16 vs 32 bits.

    But my use of a default 64 bits applies to in-register calculations;
    registers are 64 bits anyway, and stack slots are 64 bits too.

So passing or returning 64 bit values is not a real overhead. Using
    individual 'int' variables on the stack frame is not much of one either
    (and many locals will reside in registers anyway).

So memory issues only come with arrays and structs, and there you
    can choose to use narrower types.

    But bear in mind also that pointers will be 64 bits anyway, which also
    put pressure on memory.


    Which among other things, means limiting multi-character constants to
    'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
    up to 16 is 128).


    I have never seen - or even heard of - a sensible use of multi-character constants.


    Here's example from my code:

    case a.value
    when 'PROC' then ...
    when 'MODULE' then ...
    when 'CMDNAME' then ...
    ...

    case target
    when 'X64' then ...

    println target:"m" # might display X64

    It's using a 64-bit integer instead of a string, which can be compared
    more easily. You can use it as a quick and dirty enum value too.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Sep 6 23:08:55 2021
    On 06/09/2021 16:26, Bart wrote:
    On 06/09/2021 12:14, David Brown wrote:
    On 06/09/2021 12:03, Bart wrote:

    C's arbitrary line hasn't adapted in the 32- to 64-bit migration as
    it did in the 16- to 32-bit one.


    The choice of size to use for "int" has its pros and cons.  64-bit
    reduces the risk of overflow from "very rarely a risk" to "extremely
    rarely a risk".  But the cost is more precious L0 cache space, and
    slower code for some operations (especially on earlier 64-bit systems).

    You can apply the same argument to 16 vs 32 bits.

    You can indeed. But while with 32-bit, overflow is "very rarely a
    risk", with 16-bit overflow is "often a risk". It is quite clear that
    32-bit int gives a lot of benefits over 16-bit int - and equally clear
    that the benefits of going to 64-bit are far more marginal. (I'm not
    saying 64-bit int is definitely a bad choice overall, merely that it is
    not necessarily a good choice.)


    But my use of a default 64 bits applies to in-register calculations; registers are 64 bits anyway, and stack slots are 64 bits too.

    So passing or returning 64-bit values is not a real overhead. Using individual 'int' variables on the stack frame is not much of one either
    (and many locals will reside in registers anyway).

    So memory issues only come with arrays and structs, and there you
    can choose to use narrower types.

    But bear in mind also that pointers will be 64 bits anyway, which also
    puts pressure on memory.


    Yes. Linux on x86-64 systems supports a third option for sizes, in
    addition to x86-64 and x86-32. It has "x32", which runs in 64-bit mode
    with access to all the shiny big registers and 64-bit instructions, but
    uses 32-bit pointers. For a lot of code it gives faster programs than
    either normal 32-bit or 64-bit modes.


    Which, among other things, means limiting multi-character constants to
    'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
    up to 16 is 128).


    I have never seen - or even heard of - a sensible use of multi-character
    constants.


    Here's an example from my code:

        case a.value
        when 'PROC' then ...
        when 'MODULE' then ...
        when 'CMDNAME' then ...
        ...

        case target
        when 'X64' then ...

        println target:"m"   # might display X64

    It's using a 64-bit integer instead of a string, which can be compared
    more easily. You can use it as a quick and dirty enum value too.

    I prefer to use proper enums - there is no advantage to making them
    dirty, as it certainly won't be faster and you lose all the static
    checking benefits of real enumerations (with a good compiler or linter).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Sep 6 23:25:40 2021
    On 06/09/2021 22:08, David Brown wrote:
    On 06/09/2021 16:26, Bart wrote:
    On 06/09/2021 12:14, David Brown wrote:

    I have never seen - or even heard of - a sensible use of multi-character constants.


    Here's an example from my code:

        case a.value
        when 'PROC' then ...
        when 'MODULE' then ...
        when 'CMDNAME' then ...
        ...

        case target
        when 'X64' then ...

        println target:"m"   # might display X64

    It's using a 64-bit integer instead of a string, which can be compared
    more easily. You can use it as a quick and dirty enum value too.

    I prefer to use proper enums - there is no advantage to making them
    dirty, as it certainly won't be faster and you lose all the static
    checking benefits of real enumerations (with a good compiler or linter).

    My first example had those values come from outside the program, where
    sharing enums is harder.

    Same when the strings are some sort of input. You can do this stuff with
    single characters like 'I' to act as codes, so why not multi-characters?

    As for efficiency, I knocked up this loop for short string compares. I
    had to take care that it wasn't dominated by the random routine:

    static []ichar names=("one","two","three","four")
    int a:=0, b:=0, c:=0, d:=0, k
    ichar s
 
    k:=1
    for i to 100 million do
        if i iand 255 = 0 then k := mrandomrange(1,4) fi
        s := names[k]
        if strcmp(s,"one")=0 then ++a
        elsif strcmp(s,"two")=0 then ++b
        elsif strcmp(s,"three")=0 then ++c
        elsif strcmp(s,"four")=0 then ++d
        fi
    od

    println =a,=b,=c,=d

    This took 2.2 seconds. (In optimised C, the equivalent was 1.7 seconds,
    which may have inlined strcmp; I don't know.)

    The equivalent code using multi-char ints:

    static []int names=('one','two','three','four')
    int a:=0, b:=0, c:=0, d:=0, s, k:=1
 
    for i to 100 million do
        if i iand 255 = 0 then k := mrandomrange(1,4) fi
        s := names[k]
        case s
        when 'one' then ++a
        when 'two' then ++b
        when 'three' then ++c
        when 'four' then ++d
        esac
    od

    took 0.5 seconds.
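 
    (A rough C analogue of the two classifiers, under the same packing
    idea - key8() here is a hypothetical helper, not a standard function:
 
        #include <stdint.h>
        #include <string.h>
 
        static uint64_t key8(const char *s) {   /* pack <= 8 chars */
            uint64_t v = 0;
            size_t n = strlen(s);
            memcpy(&v, s, n < 8 ? n : 8);
            return v;
        }
 
        int classify_str(const char *s) {       /* chain of strcmp calls */
            if (strcmp(s, "one") == 0)   return 1;
            if (strcmp(s, "two") == 0)   return 2;
            if (strcmp(s, "three") == 0) return 3;
            if (strcmp(s, "four") == 0)  return 4;
            return 0;
        }
 
        int classify_key(uint64_t k) {          /* one 64-bit compare each */
            if (k == key8("one"))   return 1;
            if (k == key8("two"))   return 2;
            if (k == key8("three")) return 3;
            if (k == key8("four"))  return 4;
            return 0;
        }
 
    The second form does one integer compare per candidate, which is
    where the difference in the timings above comes from.)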

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Thu Sep 9 11:57:59 2021
    On 29/08/2021 22:10, Bart wrote:
    On 29/08/2021 21:17, James Harris wrote:
    On 29/08/2021 20:17, Bart wrote:
    On 29/08/2021 20:02, James Harris wrote:

    ...

    On the topic we were discussing: if you have a C parser, why not use
    it (along with a C preprocessor) to extract the info you need - such
    as type definitions and structure offsets - from each target
    environment's C header files? That's what I thought you wanted to do
    before.



    Did you read the rest of my post? A few lines down I linked to a file
    which was generated by my compiler. But it cannot do a completely
    automatic translation.

    Yes, I did. I looked at the two files you linked and even the
    definition of the macros which did not convert.

    But translating your own sources is a new topic and was NOT what we
    were talking about.

    Well, you asked why I didn't use my parser to extract that info, but
    how do you think that second file got generated!

    I suggested using your parser to /extract/ data, not translate header
    files. And I suggested using it on the C headers of target environments,
    not on your own headers.

    I tend to avoid saying what I would do, as each person's goals are
    different, but as you keep thinking of something other than what I am
    saying, I'll have to in this case, so as to be as clear as I can.
    If I wanted to get data types and struct layouts for a target
    environment and the only machine-readable description of that
    environment was in C headers I'd

    1. create a .c file with the required #includes
    2. run it through the target's preprocessor
    3. parse the output to extract the data I needed
    4. store the extracted data in a configuration file for the target
    5. use the configuration file to set up my own types and structures for
    the target environment.

    Further, since I may not even have access to a given target environment,
    if the above process was unable to parse anything it needed to I'd have
    the parser produce a report of what it could not handle for sending back
    to me so I could update the parser or take remedial steps.
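 
    To make steps 1 and 2 concrete - a minimal sketch, with hypothetical
    file names and assuming a POSIX-style cc driver on the target - the
    probe file need only contain the #includes:
 
        /* probe.c - step 1: just pull in the headers of interest */
        #include <sys/types.h>
        #include <sys/stat.h>
 
    and steps 2 to 4 reduce to something like
 
        cc -E probe.c > probe.i         # step 2: expand on the target
        bart_parse probe.i env.conf     # steps 3-4: extract and store
 
    with bart_parse being the parsing program described below.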

    At the end of the day, I thought you were lamenting that there's no
    master config file and all info is in C headers. The above steps are
    intended to remedy that and create the master config file I thought you
    wanted.

    ...

    That was the point: other environments do NOT have configuration files
    but they often DO have C headers. To determine the configuration for
    those environments you need to get the info from the C headers. And
    that's best done by

       Cpreprocessor < header | bart_parse env.conf

    where bart_parse is your program which parses the output from the C
    preprocessor and updates env.conf with the required info.

    For this purpose (creating C system header files to go with your own C compiler), you need to end up with an actual set of header files.

    Which existing compilers do you look at for information? Ones like gcc
    have incredibly elaborate headers, full of compiler-specific built-ins
    and attributes and predefined macros.

    Even if a compiler has a hundred versions of stat.h and they depend on a thousand other files, it does not matter. All you'd need to do, AISI, is
    run the preprocessor in the correct environment. It will process the
    many headers and produce _one_ copy of the info you need.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Thu Sep 9 12:44:27 2021
    On 29/08/2021 14:11, Dmitry A. Kazakov wrote:
    On 2021-08-29 14:39, James Harris wrote:

    ...

    That would be fine. You could have an OO HLL store

       length, pointer to dispatch table, offset

    You do not need the length, because it can be deduced from the tag. And
    a tag is more universal than a pointer to the table. Pointers do not
    work with multiple dispatch, for example.

    One could have a type such as String where objects had their own
    lengths; each length would naturally be stored with the object.


    That's probably a better way to explain it (i.e. in terms of
    polymorphism).

    One issue is representation. Each OO language (e.g. Ada and C++) could
    represent such an object differently. The OS would not necessarily
    support the OO layouts of any particular language. And non-OO
    languages would also need access.

    If the OS is built with OOP in mind, it would make the type tags a part of
    its calling convention.

    That may not be practical. Different languages - and even different
    compilers - can implement OOP with different memory layouts. But there
    could be compiler-provided shims which translate between the OOP
    structures of compiled code and the OS.


    It is just the history of UNIX and DOS/Windows that they grew out of
    hobby projects. It need not be this way. E.g. in VMS all languages used
    the same convention. You could call any function from any language
    straight away.

    Do you have a link which shows how that was implemented? I can only find overviews ATM.


    Say you had programs written in Ada, C and C++ all accessing the same
    open file. A change in file position or max offset made by one of them
    would need to be propagated to the other two. The C code could update
    the system structure directly. How would Ada and C++ keep their
    objects in sync?

    Class-wide is only the interface. It will dispatch to the FS
    implementation which will use a specific type.

      C++ unsigned
         |
         V
      API (<unsigned-tag>, unsigned-value)
         |
         | dispatch to NTFS
         V
      NTFS uint64_t (unsigned-value)  converts to the native type
         |
         V
      API (<uint64_t-tag>, uint64_t-value)
         |
         V
      Ada Integer (uint64_t-value)  converts to the desired type


    I am not sure what that diagram is meant to show. (Should the arrows be bidirectional, for example?) But if it is saying how all layers can use
    a uint64 then that's not surprising.

    Alternatively, if you are saying that each language implementation would
    have its own API/ABI which would allow that language to /model/ OS
    objects but translate them to and from OS structures ... then I would
    agree.

    Incidentally, perhaps your diagram shows the value of having a tag or
    pointer separate from an object's data - something mentioned in the
    "small tuples" thread.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Thu Sep 9 13:43:01 2021
    On 09/09/2021 11:57, James Harris wrote:
    On 29/08/2021 22:10, Bart wrote:

    Well, you asked why I didn't use my parser to extract that info, but
    how do you think that second file got generated!

    I suggested using your parser to /extract/ data, not translate header
    files. And I suggested using it on the C headers of target environments,
    not on your own headers.

    I tend to avoid saying what I would do, as each person's goals are
    different, but as you keep thinking of something other than what I am
    saying, I'll have to in this case, so as to be as clear as I can.
    If I wanted to get data types and struct layouts for a target
    environment and the only machine-readable description of that
    environment was in C headers I'd

    1. create a .c file with the required #includes
    2. run it through the target's preprocessor
    3. parse the output to extract the data I needed
    4. store the extracted data in a configuration file for the target
    5. use the configuration file to set up my own types and structures for
    the target environment.

    Further, since I may not even have access to a given target environment,
    if the above process was unable to parse anything it needed to I'd have
    the parser produce a report of what it could not handle for sending back
    to me so I could update the parser or take remedial steps.

    At the end of the day, I thought you were lamenting that there's no
    master config file and all info is in C headers. The above steps are
    intended to remedy that and create the master config file I thought you wanted.

    In general, the process is non-trivial, even if you have a C compiler
    that can successfully process the headers (which can itself be
    problematic, as they can have extra dependencies).

    For example, at some point, something is defined in terms of C
    executable code, and not declarations. Now you are having to translate
    chunks of program code.

    It is also an unreasonable degree of effort. I think most people would
    rather buy fish in a supermarket than have to lease a North Sea
    trawler to find it themselves!




    ...

    That was the point: other environments do NOT have configuration
    files but they often DO have C headers. To determine the
    configuration for those environments you need to get the info from
    the C headers. And that's best done by

       Cpreprocessor < header | bart_parse env.conf

    where bart_parse is your program which parses the output from the C
    preprocessor and updates env.conf with the required info.

    For this purpose (creating C system header files to go with your own C
    compiler), you need to end up with an actual set of header files.

    Which existing compilers do you look at for information? Ones like gcc
    have incredibly elaborate headers, full of compiler-specific built-ins
    and attributes and predefined macros.

    Even if a compiler has a hundred versions of stat.h and they depend on a thousand other files, it does not matter. All you'd need to do, AISI, is
    run the preprocessor in the correct environment. It will process the
    many headers and produce _one_ copy of the info you need.


    It's funny but, if your starting point is a .h file AND a .dll binary,
    that exact process will have already been gone through in order to
    generate that specific binary.

    So why wasn't it also possible to have it generate a less
    language-specific set of information which, like the DLL itself, would
    also be self-contained in one file?

    (I do exactly this; while the file I generate is specific to my
    language, it would be straightforward to generate a more universal
    format, or even a simplified 'struct' subset of C, but as one .h with ALL
    the info needed.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Thu Sep 9 16:14:43 2021
    On 2021-09-09 13:44, James Harris wrote:
    On 29/08/2021 14:11, Dmitry A. Kazakov wrote:

    If the OS is built with OOP in mind, it would make the type tags a part of
    its calling convention.

    That may not be practical. Different languages - and even different
    compilers - can implement OOP with different memory layouts.

    Irrelevant. The OS defines its API, conventions and representations.
    Whatever else a language does, the OS does not care.

    It is just the history of UNIX and DOS/Windows that they grew out of
    hobby projects. It need not be this way. E.g. in VMS all languages
    used the same convention. You could call any function from any
    language straight away.

    Do you have a link which shows how that was implemented? I can only find overviews ATM.

    That would not help you, because the VAX-11 and Alpha architectures are unfortunately dead.

    Say you had programs written in Ada, C and C++ all accessing the same
    open file. A change in file position or max offset made by one of
    them would need to be propagated to the other two. The C code could
    update the system structure directly. How would Ada and C++ keep
    their objects in sync?

    Class-wide is only the interface. It will dispatch to the FS
    implementation which will use a specific type.

       C++ unsigned
          |
          V
       API (<unsigned-tag>, unsigned-value)
          |
          | dispatch to NTFS
          V
       NTFS uint64_t (unsigned-value)  converts to the native type
          |
          V
       API (<uint64_t-tag>, uint64_t-value)
          |
          V
       Ada Integer (uint64_t-value)  converts to the desired type

    I am not sure what that diagram is meant to show. (Should the arrows be bidirectional, for example?)

    Nope, you said C++ writes a value and Ada reads it.

    Alternatively, if you are saying that each language implementation would
    have its own API/ABI which would allow that language to /model/ OS
    objects but translate them to and from OS structures ...  then I would agree.

    Nope, the convention is exactly the same.

    Incidentally, perhaps your diagram shows the value of having a tag or
    pointer separate from an object's data - something mentioned in the
    "small tuples" thread.

    Right, because unsigned has no room to store the tag.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Thu Sep 9 18:00:52 2021
    On 09/09/2021 15:14, Dmitry A. Kazakov wrote:
    On 2021-09-09 13:44, James Harris wrote:
    On 29/08/2021 14:11, Dmitry A. Kazakov wrote:

    If the OS is built with OOP in mind, it would make the type tags a part
    of its calling convention.

    That may not be practical. Different languages - and even different
    compilers - can implement OOP with different memory layouts.

    Irrelevant. The OS defines its API, conventions and representations.
    Whatever else a language does, the OS does not care.

    Which would appear to be the point I was making to you.


    It is just the history of UNIX and DOS/Windows that they grew out of
    hobby projects. It need not be this way. E.g. in VMS all languages
    used the same convention. You could call any function from any
    language straight away.

    Do you have a link which shows how that was implemented? I can only
    find overviews ATM.

    That would not help you, because the VAX-11 and Alpha architectures are unfortunately dead.

    No worries. But just because a system is no longer in use does not mean
    that it contained only bad ideas.


    Say you had programs written in Ada, C and C++ all accessing the
    same open file. A change in file position or max offset made by one
    of them would need to be propagated to the other two. The C code
    could update the system structure directly. How would Ada and C++
    keep their objects in sync?

    Class-wide is only the interface. It will dispatch to the FS
    implementation which will use a specific type.

       C++ unsigned
          |
          V
       API (<unsigned-tag>, unsigned-value)
          |
          | dispatch to NTFS
          V
       NTFS uint64_t (unsigned-value)  converts to the native type
          |
          V
       API (<uint64_t-tag>, uint64_t-value)
          |
          V
       Ada Integer (uint64_t-value)  converts to the desired type

    I am not sure what that diagram is meant to show. (Should the arrows
    be bidirectional, for example?)

    Nope, you said C++ writes a value and Ada reads it.

    Nope, I said they both access the file.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Fri Sep 10 08:51:16 2021
    On 2021-09-09 19:00, James Harris wrote:
    On 09/09/2021 15:14, Dmitry A. Kazakov wrote:
    On 2021-09-09 13:44, James Harris wrote:
    On 29/08/2021 14:11, Dmitry A. Kazakov wrote:

    It is just the history of UNIX and DOS/Windows that they grew out of
    hobby projects. It need not be this way. E.g. in VMS all
    languages used the same convention. You could call any function
    from any language straight away.

    Do you have a link which shows how that was implemented? I can only
    find overviews ATM.

    That would not help you, because the VAX-11 and Alpha architectures are
    unfortunately dead.

    No worries. But just because a system is no longer in use does not mean
    that it contained only bad ideas.

    The opposite, systems continue to exist only because of bad ideas. It is
    a negative selection. Examples: Windows, Unix.

    BTW, Windows NT kernel ripped off some stuff from VMS, but Bill in his
    infinite wisdom managed to hide it so deep that the bright facade of
    Windows being MS-DOS remained untarnished.

    http://digitronics.com/OpenVMS/Documentation/OpenVMS%20Documentation/ovms_73_call_stand.pdf

    Say you had programs written in Ada, C and C++ all accessing the
    same open file. A change in file position or max offset made by one
    of them would need to be propagated to the other two. The C code
    could update the system structure directly. How would Ada and C++
    keep their objects in sync?

    Class-wide is only the interface. It will dispatch to the FS
    implementation which will use a specific type.

       C++ unsigned
          |
          V
       API (<unsigned-tag>, unsigned-value)
          |
          | dispatch to NTFS
          V
       NTFS uint64_t (unsigned-value)  converts to the native type
          |
          V
       API (<uint64_t-tag>, uint64_t-value)
          |
          V
       Ada Integer (uint64_t-value)  converts to the desired type

    I am not sure what that diagram is meant to show. (Should the arrows
    be bidirectional, for example?)

    Nope, you said C++ writes a value and Ada reads it.

    Nope, I said they both access the file.

    The file size: C++ sets it, Ada reads it back.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Sat Sep 25 19:38:55 2021
    On 02/09/2021 18:12, Dmitry A. Kazakov wrote:
    On 2021-09-02 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:

    ...

    A fourth might be that arithmetic operations may not work directly
    on all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can
    use a u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code, then, for this fragment:
        u17 a,b,c
        a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Why do you want it?

    I didn't say I wanted it. I said I'd be interested to see it as I
    suspected you'd not answer that bit. I was right, wasn't I!


    What is supposed to happen on overflow

    What overflow? These are modular numbers; they never overflow. It is up
    to you to implement arithmetic correctly using appropriate instructions.

    My thinking was that if Bart is setting the problem he is at liberty to
    define the parameters thereof. He did not need to follow your preferred
    concept of unsigned numbers.

    In fact, I asked him what he meant because using modular arithmetic
    looked too easy.


    and are there any particular optimisation goals?

    When doing modular arithmetic you must minimize checks by proving that
    the intermediates are correct regardless of the arguments. Say, you decided
    to implement arithmetic using 32-bit machine numbers. Then with b+c you
    have nothing to worry about. You load b and c into 32-bit registers, you
    sum them. Then you verify whether the result is greater than 2**17-1; if yes,
    you subtract 2**17. Difficult?


    Again, that looked too easy. I suspected Bart had some other criteria in
    mind.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Sat Sep 25 20:49:59 2021
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining mathematics
    is irrational.

    He did not need to follow your preferred
    concept of unsigned numbers.

    Different categories of numbers exist independently of any preferences.
    He must say what he means in commonly accepted terms. That will clarify
    and determine everything.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Sat Sep 25 20:33:39 2021
    On 02/09/2021 19:56, David Brown wrote:
    On 02/09/2021 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:

    ...

    Perhaps you'd like to show some actual assembly code, then, for this fragment:
        u17 a,b,c
        a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Presumably it is roughly the same as you'd get in C with:

    uint32_t a, b, c;

    a = (b + c) % (1u << 17);

    Yes, could be.


    Types in a high level language are not constructs in assembly. They
    don't have to correspond to matching hardware or assembly-level
    features. Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as the
    processor might store a number, so a "u17" type (assuming that means a 0
    .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer. That's likely to be the same storage as a uint32_t. (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    That's weird! Which processor is it?


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Sun Sep 26 01:09:53 2021
    On 25/09/2021 19:38, James Harris wrote:
    On 02/09/2021 18:12, Dmitry A. Kazakov wrote:
    On 2021-09-02 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:
    On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
    On 2021-09-02 14:50, Bart wrote:

    ...

    A fourth might be that arithmetic operations may not work directly on all of those (but on x64 they do).

    And?

    OK. So according to you these types are no problem at all. You can
    use a u17 type just as easily as a 16-bit or 32-bit type.

    Perhaps you'd like to show some actual assembly code, then, for this
    fragment:

        u17 a,b,c
        a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Why do you want it?

    I didn't say I wanted it. I said I'd be interested to see it as I
    suspected you'd not answer that bit. I was right, wasn't I!


    Apparently the next C version will have bit-specific integers of
    arbitrary widths, both in the 1-64 range and well beyond:

    http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf

    Although it will probably be optional, and largely targeted at
    specialised hardware with support for such types.


    What is supposed to happen on overflow

    What overflow? These are modular numbers; they never overflow. It is
    up to you to implement arithmetic correctly using appropriate
    instructions.

    My thinking was that if Bart is setting the problem he is at liberty to define the parameters thereof. He did not need to follow your preferred concept of unsigned numbers.

    In fact, I asked him what he meant because using modular arithmetic
    looked too easy.


    and are there any particular optimisation goals?

    When doing modular arithmetic you must minimize checks by proving that
    the intermediates are correct regardless of the arguments. Say, you
    decided to implement arithmetic using 32-bit machine numbers. Then
    with b+c you have nothing to worry about. You load b and c into 32-bit
    registers, you sum them. Then you verify whether the result is greater
    than 2**17-1; if yes, you subtract 2**17. Difficult?


    Again, that looked too easy. I suspected Bart had some other criteria in mind.

    Talking about it is always going to be easier than actually doing it.

    The difficulty is more with loading such values, especially via
    pointers and arrays, and storing them again. Compare 17-, 43- or
    1743-bit 'Bitfield' types with the 8-, 16- and 32-bit 'Short' types,
    which already give me enough trouble:

    (a) Short types can use a regular pointer type; Bitfield types need to
    point inside a byte or word as the field may not start at bit 0 (I
    assume the width is a fixed part of the type)

    (b) Short types of 16/32 bits will generally be stored properly aligned. Bitfields may cross byte/word boundaries.

    (c) Short types can be loaded in one memory read, and stored in one
    memory write, without also loading or disturbing adjacent bits or bytes. Bitfields may need several reads or writes; ones over 64 bits need
    specialist handling.

    (d) Sign-extension instructions already exist for 8/16/32-bit types.


    (I do do some bitfield ops, but I get past these problems by:

    (a) I don't have pointers to bitfields

    (b) They are contained within an 8/16/32/64-bit regular word which is
    aligned ...

    (c) ... and it is that word that is loaded and stored as a whole after extracting and re-inserting the bitfield

    (d) My bitfields are unsigned, and the extraction process adds the
    zeros anyway

    (e) (bonus one) Once in a register, I no longer need to care what the
    bitfield /type/ was; it doesn't have its own type. It's widened to 64
    bits, operated on as 64 bits, and truncated before storing.

    If I've said all this before, then never mind!)
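 
    In C terms, (b)-(d) amount to something like the following - a
    minimal sketch with hypothetical helper names, assuming a fixed
    width of 1..63 bits held inside one aligned 64-bit word:
 
        #include <stdint.h>
 
        /* Load the containing word elsewhere, then extract the field;
           the masking zero-extends it, as per (d). */
        static uint64_t field_get(uint64_t word, int pos, int width) {
            return (word >> pos) & ((1ull << width) - 1);
        }
 
        /* Re-insert the field; the caller stores the whole word back,
           as per (c). */
        static uint64_t field_set(uint64_t word, int pos, int width,
                                  uint64_t v) {
            uint64_t mask = ((1ull << width) - 1) << pos;
            return (word & ~mask) | ((v << pos) & mask);
        }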

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Sun Sep 26 13:50:17 2021
    On 25/09/2021 21:33, James Harris wrote:
    On 02/09/2021 19:56, David Brown wrote:
    On 02/09/2021 18:47, James Harris wrote:
    On 02/09/2021 17:01, Bart wrote:

    ...

    Perhaps you'd like to show some actual assembly code, then, for this
    fragment:

         u17 a,b,c
         a := b + c

    I'd be particularly interested in how a,b,c are laid out in memory.

    I'd be interested to see Dmitry's assembly code - but I suspect he'll
    not answer that part.

    Presumably it is roughly the same as you'd get in C with:

        uint32_t a, b, c;

        a = (b + c) % (1u << 17);

    Yes, could be.


    Types in a high level language are not constructs in assembly.  They
    don't have to correspond to matching hardware or assembly-level
    features.  Just as a "bool" or an enumerated type in C is going to be
    stored in a register or memory in exactly the same way as the
    processor might store a number, so a "u17" type (assuming that means a 0
    .. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
    integer.  That's likely to be the same storage as a uint32_t.  (Though
    one processor I use has 20-bit registers, which would be more efficient
    than using two of its 16-bit registers.)

    That's weird! Which processor is it?


    The msp430X microcontroller family is primarily 16-bit (and has a 16-bit
    ALU), but supports 20-bit addresses and has 20-bit registers.

    There are templated cores - primarily for DSP - where the sizes of the
    registers and datapaths are specified when the hardware design is
    generated. There are processors that have primarily 16-bit or 32-bit registers, but have extra wide ones at 20-bit or 40-bit for MAC
    operations. Lots of possibilities.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Fri Oct 1 18:54:53 2021
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at liberty
    to define the parameters thereof.

    No, he must stay within the rational framework. Redefining mathematics
    is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    Besides, computing is related to but not the same as mathematics.


    He did not need to follow your preferred concept of unsigned numbers.

    Different categories of numbers exist independently of any preferences.
    He must say what he means in commonly accepted terms. That will clarify
    and determine everything.


    Hence my request for specification.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Fri Oct 1 20:54:49 2021
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at liberty
    to define the parameters thereof.

    No, he must stay within the rational framework. Redefining mathematics
    is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he should
    have said so. Using the term "unsigned" is misguided because it is C's nomenclature. There is nothing natural in C... (:-))

    1. There is the set N of natural numbers which is a subset of the set of integer numbers Z. There is nothing special about N that could
    distinguish it from any other subrange of Z, like 100..1000. Implement
    ranges and be done with that.

    2. There are modular numbers of modulo K=1, 2, ...

    Implementation of natural numbers using modular machine arithmetic for
    the quite idiotic purpose of squeezing one more bit of representation is possible, but expensive. This is why reasonable languages do not bother
    with that mess. They just provide modular numbers of modulo 2**K using
    machine instructions. It is as simple and efficient as shoe polish.

    All this is especially bizarre because he is ready to drop all small
    numeric types and go full 64-bit on 16-bit microcontrollers. Come on!

    Besides, computing is related to but not the same as mathematics.

    Computing deploys exactly the same mathematics. There is only one.

    He did not need to follow your preferred concept of unsigned numbers.

    Different categories of numbers exist independently of any
    preferences. He must say what he means in commonly accepted terms.
    That will clarify and determine everything.

    Hence my request for specification.

    Good. See above.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Fri Oct 1 21:19:06 2021
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at liberty
    to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he should
    have said so. Using the term "unsigned" is misguided because it is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    1. There is the set N of natural numbers which is a subset of the set of integer numbers Z. There is nothing special about N that could
    distinguish it from any other subrange of Z, like 100..1000. Implement
    ranges and be done with that.

    2. There are modular numbers of modulo K=1, 2, ...

    Implementation of natural numbers using modular machine arithmetic for
    the quite idiotic purpose of squeezing one more bit of representation is possible, but expensive.

    That one extra bit is important simply because a lot of the things I
    want to represent, which are related to hardware or languages or file
    formats etc, happen to lie in that extra range, such as 128..255.

    It can mean being able to do more with N bits rather than the next size
    up which is 2N bits.

    So you will commonly find that image and display formats represent the intensity of a colour with a value from 0 to 255 - 8 bits unsigned.

    Below is one of the records to do with the PE file format. Notice every
    field is u8, u16 or u32, except Imagedir, which consists of two u32 fields.

    This stuff is everywhere; are we supposed to just ignore it?

    Note that within expressions, a single i64 type can represent the ranges
    of all narrower types including u8 u16 u32; only u64 is outside its
    range, and only for values of 2**63 and above.

    This is why it's just a useful default integer type.
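 
    In C terms the point is simply that int64_t covers every value of
    uint8_t, uint16_t and uint32_t exactly - a minimal sketch:
 
        #include <stdint.h>
 
        /* every u8/u16/u32 value fits exactly in an i64 ... */
        int64_t i = 4294967295;   /* max u32 value: representable */
        /* ... but u64 values of 2**63 and above do not fit. */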

    This is why reasonable languages do not bother
    with that mess. They just provide modular numbers of modulo 2**K using machine instructions. It is as simple and efficient as shoe polish.

    All this is especially bizarre because he is ready to drop all small
    numeric types and go full 64-bit on 16-bit microcontrollers. Come on!

    Where did I say that? These are the 'int' sizes I've used on all
    processors I've written compilers for, for my languages:

    PDP10 36 bits (36-bit word size; not my language for this one)
    Z80 16 bits (8-bit word size)
    8086 16 bits (16-bit word size, also 80386 in 16-bit mode)
    80386 etc 32 bits (32-bit word size in 32-bit mode)
    x64 64 bits (64-bit word size)

    Do you think that, if I adapted my language for a smaller device, I
    would insist on a 64-bit int size?

    ---------------------

    record optionalheader =
        word16   magic
        byte     majorlv
        byte     minorlv
        word32   codesize
        word32   idatasize
        word32   zdatasize
        word32   entrypoint
        word32   codebase
        word64   imagebase
        word32   sectionalignment
        word32   filealignment
        word16   majorosv
        word16   minorosv
        word16   majorimagev
        word16   minorimagev
        word16   majorssv
        word16   minorssv
        word32   win32version
        word32   imagesize
        word32   headerssize
        word32   checksum
        word16   subsystem
        word16   dllcharacteristics
        word64   stackreserve
        word64   stackcommit
        word64   heapreserve
        word64   heapcommit
        word32   loaderflags
        word32   rvadims
        imagedir exporttable
        imagedir importtable
        ....
    end

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Sun Oct 3 08:00:58 2021
    On 01/10/2021 21:19, Bart wrote:
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at
    liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he
    should have said so. Using the term "unsigned" is misguided because it
    is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    Be careful not to accept Dmitry's berating! Programming is not
    mathematics: it's both a superset and a subset thereof. Where we use
    small integers, mathematics and programming are consistent, and that
    leads to false expectations of wider harmony; but that's misleading
    when values approach the limits of a certain type. Your use of
    "unsigned" is better than Dmitry's "natural" for that reason. Natural
    integers are a theoretical concept useful in mathematics. By contrast,
    unsigned integers - because they are familiar from their use in C, and
    especially if we can assume a certain representation and, thus,
    behaviour - are an engineering concept better suited to programming.
    As a consequence, IMO they can help towards a clear specification.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Sun Oct 3 20:06:01 2021
    On 03/10/2021 09:00, James Harris wrote:
    On 01/10/2021 21:19, Bart wrote:
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at
    liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he
    should have said so. Using the term "unsigned" is misguided because
    it is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    Be careful not to accept Dmitry's berating! Programming is not
    mathematics: it's both a superset and a subset thereof. Where we use
    small integers, mathematics and programming are consistent, and that
    leads to false expectations of wider harmony; but that's misleading
    when values approach the limits of a certain type. Your use of
    "unsigned" is better than Dmitry's "natural" for that reason. Natural
    integers are a theoretical concept useful in mathematics. By contrast,
    unsigned integers - because they are familiar from their use in C, and
    especially if we can assume a certain representation and, thus,
    behaviour - are an engineering concept better suited to programming.
    As a consequence, IMO they can help towards a clear specification.


    I'm sorry, but that is just wrong. The only explanation I have is that
    you think "mathematics" means "arithmetic I learned in primary school".
    The way integer types work in programming is all defined
    mathematically. It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers. It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics. Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    Obviously some of the behaviours and characteristics of the integer
    types in any real programming language will differ from those of
    everyday numbers (mostly due to size limits), but that does not mean
    they are not defined mathematically.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Sun Oct 3 19:27:13 2021
    On 03/10/2021 19:06, David Brown wrote:
    On 03/10/2021 09:00, James Harris wrote:
    On 01/10/2021 21:19, Bart wrote:
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at
    liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he
    should have said so. Using the term "unsigned" is misguided because
    it is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    Be careful not to accept Dmitry's berating! Programming is not
    mathematics: it's both a superset and a subset thereof. Where we use
    small integers, mathematics and programming are consistent, and that
    leads to false expectations of wider harmony; but that's misleading
    when values approach the limits of a certain type. Your use of
    "unsigned" is better than Dmitry's "natural" for that reason. Natural
    integers are a theoretical concept useful in mathematics. By contrast,
    unsigned integers - because they are familiar from their use in C, and
    especially if we can assume a certain representation and, thus,
    behaviour - are an engineering concept better suited to programming.
    As a consequence, IMO they can help towards a clear specification.


    I'm sorry, but that is just wrong. The only explanation I have is that
    you think "mathematics" means "arithmetic I learned in primary school".

    And what's wrong with that? AFAICS arithmetic is exactly what a
    processor is designed to do.

    It's no different from doing sums with an abacus which uses a fixed
    number of digits.

    How do signed, unsigned, wrapping, overflow, saturating etc work with an abacus?

    The way integer types work in programming is all defined
    mathematically.

    How they work in a processor depends on the ALU design. Most I've looked
    at work the same way. Departing significantly from that in a language
    can be expensive as you'd have to emulate the behaviour.

    It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers. It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics. Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    Personally I'd prefer it if mathematics kept its nose out of things -
    especially programming language design, with its type theory and
    lambda calculus and all the rest.

    You only end up with exotic FP languages that only professors of
    computer science can understand.

    Obviously some of the behaviours and characteristics of the integer
    types in any real programming language will differ from those of
    everyday numbers (mostly due to size limits), but that does not mean
    they are not defined mathematically.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Sun Oct 3 21:20:04 2021
    On 03/10/2021 19:06, David Brown wrote:
    On 03/10/2021 09:00, James Harris wrote:
    On 01/10/2021 21:19, Bart wrote:
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at
    liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he
    should have said so. Using the term "unsigned" is misguided because
    it is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    Be careful not to accept Dmitry's berating! Programming is not
    mathematics: it's both a superset and a subset thereof. Where we use
    small integers, mathematics and programming are consistent, and that
    leads to false expectations of wider harmony; but that's misleading
    when values approach the limits of a certain type. Your use of
    "unsigned" is better than Dmitry's "natural" for that reason. Natural
    integers are a theoretical concept useful in mathematics. By contrast,
    unsigned integers - because they are familiar from their use in C, and
    especially if we can assume a certain representation and, thus,
    behaviour - are an engineering concept better suited to programming.
    As a consequence, IMO they can help towards a clear specification.


    I'm sorry, but that is just wrong. The only explanation I have is that
    you think "mathematics" means "arithmetic I learned in primary school".

    No need to apologise! Disagreement is the grist of discussion!


    The way integer types work in programming is all defined
    mathematically. It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers. It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics. Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    I have to disagree. For sure, one can express computer arithmetic in
    mathematical terms, but because of the limits imposed by fixed
    representations there will be many caveats which are not normally
    present in mathematics. Two cases:

    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on the implementation.

    Essentially, you and I appear to have a difference over what one should
    see as 'mathematics' but I don't think we disagree over substance.

    Unfortunately, many programming tutorials encourage programmers to
    simply assume that any value is 'large enough' and so will behave
    according to the normal rules of mathematics. But as everyone here
    knows, that is not always the case, as was shown in the example I saw
    discussed recently of

    255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.
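 
    (A minimal C illustration, for the record:
 
        #include <stdio.h>
        #include <stdint.h>
 
        int main(void) {
            uint8_t a = 255;
            a = a + 1;                    /* 256 converted back to 8 bits: wraps to 0 */
            printf("%u\n", (unsigned)a);  /* prints 0 */
 
            int b = 255 + 1;              /* plain int arithmetic: no wrap here */
            printf("%d\n", b);            /* prints 256 */
            return 0;
        }
 
    The same "255 + 1" gives two different answers, decided by the types
    involved.)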


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Oct 4 00:05:15 2021
    On 03/10/2021 20:27, Bart wrote:
    On 03/10/2021 19:06, David Brown wrote:
    On 03/10/2021 09:00, James Harris wrote:
    On 01/10/2021 21:19, Bart wrote:
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at
    liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he
    should have said so. Using the term "unsigned" is misguided because
    it is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    Be careful not to accept Dmitry's berating! Programming is not
    mathematics: it's both a superset and a subset thereof. Where we use
    small integers, mathematics and programming are consistent, and that
    leads to false expectations of wider harmony; but that's misleading
    when values approach the limits of a certain type. Your use of
    "unsigned" is better than Dmitry's "natural" for that reason. Natural
    integers are a theoretical concept useful in mathematics. By contrast,
    unsigned integers - because they are familiar from their use in C, and
    especially if we can assume a certain representation and, thus,
    behaviour - are an engineering concept better suited to programming.
    As a consequence, IMO they can help towards a clear specification.


    I'm sorry, but that is just wrong.  The only explanation I have is that
    you think "mathematics" means "arithmetic I learned in primary school".

    And what's wrong with that? AFAICS arithmetic is exactly what a
    processor is designed to do.

    Processors are designed to do many things. Exactly duplicating standard mathematical integers is not one of those things. Being usable to model
    a limited version of those integers - following somewhat different
    mathematical rules and definitions - /is/ one of those things.


    It's no different from doing sums with an abacus which uses a fixed
    number of digits.

    How do signed, unsigned, wrapping, overflow, saturating etc work with an abacus?

      The way integer types work in programming is all defined
    mathematically.

    How they work in a processor depends on the ALU design. Most I've looked
    at work the same way. Departing significantly from that in a language
    can be expensive as you'd have to emulate the behaviour.


    Requiring additional features and semantics can be expensive. Requiring
    fewer can be cheaper.

    It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers.  It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics.  Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    Personally I'd prefer it if mathematics kept its nose out of things -
    especially programming language design, with its type theory and
    lambda calculus and all the rest.


    Yes, I know you are happy with a duct-tape-and-string solution - if it
    looks okay, and tests okay, it's okay by you. Many of us prefer a more
    solid theoretical and mathematical foundation to what we do - we'd
    rather /know/ it is okay, in addition to testing that it is okay (since
    we all get our sums wrong occasionally).

    You only end up with exotic FP languages that only professors of
    computer science can understand.


    No. The point of having mathematicians, computer scientists, and other
    more theoretical people involved is so that you have the right base for
    what you are doing. /Then/ you can let more practical-minded (but
    perhaps more problem-focused, or user-focused) people build on it.

    This is done across the industry. Some programming tasks are /hard/.
    For example, if you want to get locks, synchronisation primitives, and multi-tasking systems right, it is /hard/. You can't just play around
    at it and then give it a whirl to see if it works. You need to
    understand the theory, you need to understand the subtleties, you need
    to be able to /prove/ it is correct. Otherwise you'll end up with
    something that looks okay, tests okay, is probably simpler and more
    efficient than a solid solution - but it will sometimes break and you'll
    have no idea when, where or why.


    Obviously some of the behaviours and characteristics of the integer
    types in any real programming language will differ from those of
    everyday numbers (mostly due to size limits), but that does not mean
    they are not defined mathematically.



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Mon Oct 4 00:06:24 2021
    On 2021-10-03 22:20, James Harris wrote:
    On 03/10/2021 19:06, David Brown wrote:

      The way integer types work in programming is all defined
    mathematically.  It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers.  It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics.  Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    I have to disagree. For sure, one can express computer arithmetic in mathematical terms but because of the limits imposed by fixed
    representations there will be many caveats which are not normally
    present in mathematics.

    You took it upside down. Any computer arithmetic is an approximation
    (model) of some mathematical structure. There is simply nothing else and
    cannot be anything else. Just because in a very improbable case that
    something really new is discovered, it is studied using the mathematical apparatus, not by buggy code.

    As a rule of thumb take that you will never find anything new, only your
    bugs and errors.

    Note the word *model*. Where the model becomes inadequate certain
    well-defined actions are fired at compile time (desirable) or as a last
    resort at run time.

    But as everyone here
    knows, that is not always the case, as was shown in the example I saw discussed recently of

      255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.

    It is *always* decided by the model. The engineer selects the best
    fitting model for the case at hand using various criteria of choice (performance, resource limitations, lack of time and qualification,
    economical viability, maintainability etc).

    The model is normally adequate for all inputs considered legal. Illegal
    inputs are detected and processed, e.g. by handling exceptions. They are especially a subject of risk estimation.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Mon Oct 4 00:14:37 2021
    On 03/10/2021 22:20, James Harris wrote:
    On 03/10/2021 19:06, David Brown wrote:
    On 03/10/2021 09:00, James Harris wrote:
    On 01/10/2021 21:19, Bart wrote:
    On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
    On 2021-10-01 19:54, James Harris wrote:
    On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
    On 2021-09-25 20:38, James Harris wrote:

    My thinking was that if Bart is setting the problem he is at
    liberty to define the parameters thereof.

    No, he must stay within the rational framework. Redefining
    mathematics is irrational.

    AIUI Bart specified unsigned numbers, not modular arithmetic.

    There is no such thing. He might mean "natural" numbers, then he
    should have said so. Using the term "unsigned" is misguided because
    it is C's nomenclature. There is nothing natural in C... (:-))

    Fortunately I'm not a mathematician so can get on with things without
    getting bogged down with such stuff.

    Be careful not to accept Dmitry's berating! Programming is not
    mathematics: it's both a superset and a subset thereof. Where we use
    small integers mathematics and programming are consistent and that leads
    to false expectations of wider harmony but that's misleading when values
    approach the limits of a certain type. Your use of "unsigned" is better
    than Dmitry's "natural" for that reason. Natural integers are a
    theoretical concept useful in mathematics. By contrast, unsigned
    integers, because they are familiar from their use in C and especially
    if we can assume a certain representation and, thus, behaviour, are an
    engineering concept better suited to programming. As a consequence IMO
    they can help towards a clear specification.


    I'm sorry, but that is just wrong.  The only explanation I have is that
    you think "mathematics" means "arithmetic I learned in primary school".

    No need to apologise! Disagreement is the grist of discussion!


      The way integer types work in programming is all defined
    mathematically.  It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers.  It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics.  Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    I have to disagree. For sure, one can express computer arithmetic in mathematical terms but because of the limits imposed by fixed
    representations there will be many caveats which are not normally
    present in mathematics. Two cases:

    Of course they are present in mathematics. They are merely not present
    in simple integer arithmetic. But that does not mean they are not
    defined and described mathematically.


    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed integers) or closed operations (for C-style unsigned integers), rather
    than infinite sets, but it is all mathematics.

    Mathematics on standard integers doesn't define 1/0. Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations. It doesn't mean it is not mathematical.

    Essentially, you and I appear to have a difference over what one should
    see as 'mathematics' but I don't think we disagree over substance.


    Yes. But then, I am mathematically trained, and know a lot more about
    what it means than most people, who usually think of school-level sums
    and possibly weird stuff involving letters instead of numbers.

    Unfortunately, many programming tutorials encourage programmers to
    simply assume that any value is 'large enough' and so will behave
    according to the normal rules of mathematics. But as everyone here
    knows, that is not always the case, as was shown in the example I saw discussed recently of

      255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.


    Engineering is about applying the mathematical (and perhaps physical,
    chemical, etc.) laws to practical situations. An engineer who does not understand that there is a mathematical basis for what they do is in the
    wrong profession. (I certainly don't mean that they should understand
    the mathematics involved - but they should understand that there /is/ mathematics involved, and that the mathematics is what justifies the
    rules and calculations they apply.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Oct 4 00:58:45 2021
    On 03/10/2021 23:05, David Brown wrote:
    On 03/10/2021 20:27, Bart wrote:

    Processors are designed to do many things. Exactly duplicating standard mathematical integers is not one of those things. Being usable to model
    a limited version of those integers - following somewhat different mathematical rules and definitions - /is/ one of those things.

    No, it's just arithmetic with a limited number of digits. And usually in binary.

    It's engineering, not maths. But of course you can apply maths to anything.

    Personally I'd prefer if mathematics kept its nose out of things.
    Especially in programming language design with its type theory and
    lambda calculus and all the rest.


    Yes, I know you are happy with a duct-tape and string solution -

    Look at the recent thread on clc where I compare my 'tabledata' feature
    for defining parallel data sets, with C's X-macros to do the same job.

    Which one was more analogous to using duct-tape and string?!

    if it
    looks okay, and tests okay, it's okay by you. Many of us prefer a more
    solid theoretical and mathematical foundation to what we do - we'd
    rather /know/ it is okay, in addition to testing that it is okay (since
    we all get our sums wrong occasionally).

    As I said, it's engineering. And also, for programming languages,
    aesthetic design.

    It's not that easy devising languages that are simple, clear and easy to
    reason about. Far better than pages of arcane symbols.

    (Look at Knuth's MIX language. Being an academic doesn't mean you're a
    whizz at language design.)

    No. The point of having mathematicians, computer scientists, and other
    more theoretical people involved is so that you have the right base for
    what you are doing. /Then/ you can let more practical-minded (but
    perhaps more problem-focused, or user-focused) people build on it.

    Fine, let the mathematicians come up with the formulae and algorithms,
    but stay out of how I want to design /my/ language.

    Most of the maths I did at school (pure and applied maths) did come in
    very useful for my work, so I appreciate some of it.

    But not when I'm brow-beaten with it by the likes of DAK.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Mon Oct 4 09:39:53 2021
    On 03/10/2021 23:14, David Brown wrote:
    On 03/10/2021 22:20, James Harris wrote:

    ...

    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the
    non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed integers) or closed operations (for C-style unsigned integers), rather
    than infinite sets, but it is all mathematics.

    OK, then how would you define integer computing's

    A - B

    in terms of mathematics?


    Mathematics on standard integers doesn't define 1/0. Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations. It doesn't mean it is not mathematical.

    OK, then how would you define integer computing's

    A / B

    in terms of mathematics?

    No need to reply but I'd suggest to you that because of the limits of
    computer fixed representation both of those are much more complex than
    just 'mathematics'!



    Essentially, you and I appear to have a difference over what one should
    see as 'mathematics' but I don't think we disagree over substance.


    Yes. But then, I am mathematically trained, and know a lot more about
    what it means than most people, who usually think of school-level sums
    and possibly weird stuff involving letters instead of numbers.

    That's intriguing. What do you mean by "weird stuff involving letters
    instead of numbers"?


    Unfortunately, many programming tutorials encourage programmers to
    simply assume that any value is 'large enough' and so will behave
    according to the normal rules of mathematics. But as everyone here
    knows, that is not always the case, as was shown in the example I saw
    discussed recently of

      255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.


    Engineering is about applying the mathematical (and perhaps physical, chemical, etc.) laws to practical situations. An engineer who does not understand that there is a mathematical basis for what they do is in the wrong profession. (I certainly don't mean that they should understand
    the mathematics involved - but they should understand that there /is/ mathematics involved, and that the mathematics is what justifies the
    rules and calculations they apply.)


    Well, I would say that engineering includes being aware of and
    accommodating limits - including those limits where simple mathematics
    breaks down and no longer applies. YMMV.
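
    As a concrete C illustration of such a limit (the 255 + 1 case from
    above; the function name here is invented for the sketch):

        #include <stdint.h>

        uint8_t next(uint8_t x) {
            /* For x == 255 the mathematical sum is 256, but conversion
               back to uint8_t reduces it modulo 256, so the result is 0. */
            return (uint8_t)(x + 1);
        }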


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Mon Oct 4 09:29:35 2021
    On 03/10/2021 23:06, Dmitry A. Kazakov wrote:
    On 2021-10-03 22:20, James Harris wrote:
    On 03/10/2021 19:06, David Brown wrote:

      The way integer types work in programming is all defined
    mathematically.  It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers.  It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics.  Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    I have to disagree. For sure, one can express computer arithmetic in
    mathematical terms but because of the limits imposed by fixed
    representations there will be many caveats which are not normally
    present in mathematics.

    You took it upside down. Any computer arithmetic is an approximation
    (model) of some mathematical structure. There is simply nothing else and cannot be anything else. Just because in a very improbable case that something really new is discovered, it is studied using the mathematical apparatus, not by buggy code.

    As a rule of thumb take that you will never find anything new, only your
    bugs and errors.

    Note the word *model*. Where the model becomes inadequate certain well-defined actions are fired at compile time (desirable) or as a last resort at run time.

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations. For those you'd need an analogue computer (or
    possibly a quantum one?). In a digital computer all results on integer
    operands are precisely defined. They may naturally vary between
    implementations but a language can override that and define consistent
    results.


    But as everyone here knows, that is not always the case, as was shown
    in the example I saw discussed recently of

       255 + 1

    what that results in is decided by what I would call 'engineering',
    and not by the normal rules of mathematics.

    It is *always* decided by the model. The engineer selects the best
    fitting model for the case at hand using various criteria of choice (performance, resource limitations, lack of time and qualification, economical viability, maintainability etc).

    The model is normally adequate for all inputs considered legal. Illegal inputs are detected and processed, e.g. by handling exceptions. They are especially a subject of risk estimation.


    Yes, it is a model. And that model can be rather complex. It does not
    simply implement arithmetic on natural numbers.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Mon Oct 4 09:42:39 2021
    On 04/09/2021 13:35, Bart wrote:

    ...

    I just have a different way of treating numeric types. So i64 is a
    signed integer type, and i8 i16 i32 are just narrower, storage versions
    of the /same type/.

    That's an intriguing comment. Dmitry and I once had a good argument
    about what constitutes a type.

    Would you accept that i8, i16 etc are different concrete types even if
    they are the same abstract type?


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Oct 4 11:50:25 2021
    On 04/10/2021 01:58, Bart wrote:
    On 03/10/2021 23:05, David Brown wrote:
    On 03/10/2021 20:27, Bart wrote:

    Processors are designed to do many things.  Exactly duplicating standard
    mathematical integers is not one of those things.  Being usable to model
    a limited version of those integers - following somewhat different
    mathematical rules and definitions - /is/ one of those things.

    No, it's just arithmetic with a limited number of digits. And usually in binary.

    It's engineering, not maths. But of course you can apply maths to anything.

    Engineering /is/ applied maths!


    Personally I'd prefer if mathematics kept its nose out of things.
    Especially in programming language design with its type theory and
    lambda calculus and all the rest.


    Yes, I know you are happy with a duct-tape and string solution -

    Look at the recent thread on clc where I compare my 'tabledata' feature
    for defining parallel data sets, with C's X-macros to do the same job.

    Which one was more analogous to using duct-tape and string?!


    I'm not sure - I'd go for your poor (probably intentionally) use of
    X-macros. Next in line would be your more sophisticated method of
    hiding the mess within your tools so that the source language is a bit
    neater. (That's a good thing.) Better, however, is to hide the messy
    macros and then get a solution that is neater and more flexible than
    either of your methods.

    <https://www.codeproject.com/Articles/1118009/A-Smart-Enum-library-in-C-using-X-macros>
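
    For reference, the X-macro pattern under discussion looks roughly like
    this in miniature (a sketch, not the library behind that link):

        /* The single source of truth: one X() entry per item. */
        #define COLOUR_LIST \
            X(Red,   0xFF0000) \
            X(Green, 0x00FF00) \
            X(Blue,  0x0000FF)

        /* Expand it once to generate the enum... */
        #define X(name, rgb) Colour_##name,
        enum Colour { COLOUR_LIST Colour_Count };
        #undef X

        /* ...and again to generate a parallel table of values. */
        #define X(name, rgb) rgb,
        static const unsigned colour_rgb[] = { COLOUR_LIST };
        #undef X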


     if it
    looks okay, and tests okay, it's okay by you.  Many of us prefer a more
    solid theoretical and mathematical foundation to what we do - we'd
    rather /know/ it is okay, in addition to testing that it is okay (since
    we all get our sums wrong occasionally).

    As I said, it's engineering. And also, for programming languages,
    aesthetic design.

    It's not that easy devising languages that are simple, clear and easy to reason about. Far better than pages of arcane symbols.

    (Look at Knuth's MIX language. Being an academic doesn't mean you're a
    whizz at language design.)

    Of course being an academic is no guarantee for making a good useable
    language. (And I agree that MIX is terrible. Even for its time, it was
    not a good design IMHO, as it had too many details that were irrelevant
    for its use. The replacement MMIX is significantly better, but still
    overly complex. I think he'd have been better using Algol in his books.)

    Making a /good/ programming language is not an easy task. It is not a
    task for an academic, or an engineer, or a user, or a programmer. It is
    a task for collaboration amongst many people with different viewpoints.
    You need someone with an academic, theoretical and mathematical
    background to get it /right/. You need engineering type people to make
    it practical and usable. You need people who are experienced in
    implementing languages and tools, people who use them, people who
    understand the problem domain. You need people who are good at writing technical documentation, and people who are good at testing awkward
    special cases and thinking through all the possible corner cases. Some
    of these "people" can, of course, be combined. But without enough
    people involved at the right times, the best you can get are small
    languages, niche DSL's, prototypes, hobby languages. These can be
    interesting, occasionally useful, and they can grow to something more.


    No.  The point of having mathematicians, computer scientists, and other
    more theoretical people involved is so that you have the right base for
    what you are doing.  /Then/ you can let more practical-minded (but
    perhaps more problem-focused, or user-focused) people build on it.

    Fine, let the mathematicians come up with the formulae and algorithms,
    but stay out of how I want to design /my/ language.

    Most of the maths I did at school (pure and applied maths) did come in
    very useful for my work, so I appreciate some of it.

    But not when I'm brow-beaten with it by the likes of DAK.

    Dmitry works with programs that /must/ work correctly. I don't know
    exactly what he does, but you don't often use Ada for code where it is acceptable to have a program crash with an error message and you just
    restart it. One clue is that if you are writing for Windows (or DOS)
    systems, then you are almost certainly in a different category with
    regards to the balance between "quick and cheap to write, looks like it
    works well enough" and "the development price doesn't matter because
    mistakes cost lives".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Mon Oct 4 11:15:58 2021
    On 2021-10-04 10:29, James Harris wrote:
    On 03/10/2021 23:06, Dmitry A. Kazakov wrote:
    On 2021-10-03 22:20, James Harris wrote:
    On 03/10/2021 19:06, David Brown wrote:

      The way integer types work in programming is all defined
    mathematically.  It doesn't matter whether you are looking at
    implementations in terms of bits and representations, or the higher
    level usage of the integers.  It doesn't matter if you are talking
    signed, unsigned, wrapping, overflowing, saturating, or anything else -
    it's all mathematics.  Even the concept of undefined behaviour is
    solidly embedded in mathematics.

    I have to disagree. For sure, one can express computer arithmetic in
    mathematical terms but because of the limits imposed by fixed
    representations there will be many caveats which are not normally
    present in mathematics.

    You took it upside down. Any computer arithmetic is an approximation
    (model) of some mathematical structure. There is simply nothing else
    and cannot be anything else. Just because in a very improbable case
    that something really new is discovered, it is studied using the
    mathematical apparatus, not by buggy code.

    As a rule of thumb take that you will never find anything new, only
    your bugs and errors.

    Note the word *model*. Where the model becomes inadequate certain
    well-defined actions are fired at compile time (desirable) or as a
    last resort at run time.

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point model
    with exact addition and multiplication and there is the floating-point
    model with all operations inexact and even non-associative.

    In a digital computer all results on integer
    operands are precisely defined.

    Precisely defined /= exact. For example, integer division is inexact, it rounds, yet it is precisely defined, e.g. rounding towards zero.

    There are other ways to model integer numbers. It depends on the goal.
    E.g. integer interval arithmetic where all operations including division
    are accurate. Interval division is accurate, but imprecise: 1/2 = [0,1].
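
    A minimal C sketch of that interval idea (names invented, positive
    operands only):

        typedef struct { int lo, hi; } ival;

        /* Quotient interval for positive-integer intervals: the true
           rational quotient always lies inside the result, so the point
           division 1/2 comes out as [0, 1]. */
        static ival ival_div(ival a, ival b) {
            ival r;
            r.lo = a.lo / b.hi;               /* floor of smallest quotient  */
            r.hi = (a.hi + b.lo - 1) / b.lo;  /* ceiling of largest quotient */
            return r;
        }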

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Mon Oct 4 10:38:55 2021
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point model
    with exact addition and multiplication and there is the floating-point
    model with all operations inexact and even non-associative.

    Non-associativity can also apply to integers. Consider

    A + B + C

    Even in such a simple expression if overflow of intermediate results is
    to be detected then (A + B) + C is not A + (B + C).
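
    A minimal C sketch of that point (checked_add is an invented helper;
    int8_t keeps the overflow easy to see):

        #include <stdint.h>
        #include <stdio.h>

        /* Returns 1 and stores the sum if it fits in int8_t, else 0. */
        static int checked_add(int8_t a, int8_t b, int8_t *out) {
            int sum = a + b;   /* int is wide enough for any int8_t pair */
            if (sum < INT8_MIN || sum > INT8_MAX) return 0;
            *out = (int8_t)sum;
            return 1;
        }

        int main(void) {
            int8_t a = 100, b = 100, c = -100, t;
            /* (a + b) + c: the intermediate 200 overflows int8_t... */
            printf("(a+b)+c ok? %d\n", checked_add(a, b, &t) && checked_add(t, c, &t));
            /* ...but in a + (b + c) every intermediate stays in range. */
            printf("a+(b+c) ok? %d\n", checked_add(b, c, &t) && checked_add(a, t, &t));
            return 0;
        }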

    I'll not go into the rest as I think we've discussed it to within an
    inch of its life.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Mon Oct 4 12:26:16 2021
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point
    model with exact addition and multiplication and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

      A + B + C

    Even in such a simple expression if overflow of intermediate results is
    to be detected then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer
    arithmetic is associative.

    [ Equivalent modification of computing algorithms to keep the model
    adequate is yet another issue, and also mathematics. ]

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Mon Oct 4 11:57:16 2021
    On 04/10/2021 09:42, James Harris wrote:
    On 04/09/2021 13:35, Bart wrote:

    ...

    I just have a different way of treating numeric types. So i64 is a
    signed integer type, and i8 i16 i32 are just narrower, storage
    versions of the /same type/.

    That's an intriguing comment. Dmitry and I once had a good argument
    about what constitutes a type.

    Would you accept that i8, i16 etc are different concrete types even if
    they are the same abstract type?


    There are clearly differences in how a compiler needs to implement them.
    So in one of mine, it will have these enums:

    ti8 ti16 ti32 ti64

    Here it looks like i64 is just one of those four types. However i64 is
    much more dominant as it is the primary promotion type, most operations
    are defined as i64 etc. All special behaviour that needs to be gleaned
    from reading the compiler code.

    But look at how it might be defined in the language:

    int       primary integer type
    int:32    same type of which only 32 bits are stored
    int:8     int type but only the least significant 8 bits are stored

    Here now it does look like a single type, with an optional attribute
    giving some info about how much is stored. A bit like the number of
    boxes on that paper form I gave in my example.
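
    A minimal C sketch of that storage-only narrowing (names invented;
    the arithmetic type stays 64-bit, only the stored field shrinks):

        #include <stdint.h>

        /* "int:8": storing keeps only the least significant 8 bits. */
        static void store8(int8_t *slot, int64_t v) {
            *slot = (int8_t)v;   /* keeps the low 8 bits on common targets */
        }

        /* Loading widens back to the one real integer type,
           sign-extending to 64 bits. */
        static int64_t load8(const int8_t *slot) {
            return *slot;
        }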

    A language could make 'int' be unbounded, and some languages do that,
    but in this low-level one, int itself has an upper limit on how many
    bits are stored, which is 64, and there is also a limit on how many bits
    it will calculate with, again 64 bits.

    (Conceivably it could do arbitrary calculations where inputs are limited
    to 64 bits, but intermediates could be any size.)


    Look now inside another compiler, for my dynamic language. That includes
    these enums:


    tint, tword, treal, tdecimal

    Now 'int' is clearly just one type. Elsewhere in those enums however are
    also these Packed types:

    tpi8 tpi16 tpi32 tpi64

    It's the same matter of storage attributes, but handled differently.

    This language deals only with 'int' (which is 64 bits) and interacts
    with those other 3 types. It never deals directly with tpi32 etc, which
    control layout in packed structs and arrays, and which are ALWAYS
    converted to and from the official int type before doing any work.

    The language could have chosen to have 'int' unbounded so that it blends seamlessly into the arbitrary precision type (decimal here), like Python
    does. But I always wanted it to be efficient, and predictable as to
    result type (I'd had type inference in mind). I also wanted similar
    semantics to the static language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Mon Oct 4 20:19:55 2021
    On 04/10/2021 10:39, James Harris wrote:
    On 03/10/2021 23:14, David Brown wrote:
    On 03/10/2021 22:20, James Harris wrote:

    ...

    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the
    non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed
    integers) or closed operations (for C-style unsigned integers), rather
    than infinite sets, but it is all mathematics.

    OK, then how would you define integer computing's

      A - B

    in terms of mathematics?


    Mathematics on standard integers doesn't define 1/0.  Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations.  It doesn't mean it is not mathematical.

    OK, then how would you define integer computing's

      A / B

    in terms of mathematics?

    No need to reply but I'd suggest to you that because of the limits of computer fixed representation both of those are much more complex than
    just 'mathematics'!


    I'd agree that they are somewhat complicated by the limited sizes of
    fixed-size types - but they are still just "mathematics".

    How about:

    1. For a given fixed size of computer integer, "A - B" is defined as the
    result of normal mathematical integer subtraction as long as that result
    fits within the type.

    (That's basically how C defines it, if you stick to "int" or ignore the promotion stuff.)

    or

    2. "A - B" is defined as the result of normal mathematical integer
    subtraction, reduced modulo 2^n as necessary to fit in the range of the
    type.

    (That's how "gcc -fwrapv" defines it.)

    or

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    or

    4. "A - B" is defined as either the result of normal integer subtraction
    if that fits within the range of the type, or an exception condition
    otherwise.

    These are all perfectly reasonable mathematical definitions for
    subtraction. Note that partial functions, such as in definition 1, are
    quite standard in mathematics.
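
    For what it's worth, definitions 2 to 4 each fit in a few lines of C
    (helper names invented; 32-bit types assumed):

        #include <stdint.h>

        /* Definition 2: wrap modulo 2^32, as with "gcc -fwrapv". */
        int32_t sub_wrap(int32_t a, int32_t b) {
            /* unsigned subtraction wraps; converting back is
               two's-complement on all mainstream compilers */
            return (int32_t)((uint32_t)a - (uint32_t)b);
        }

        /* Definition 3: saturate at the limits of the type. */
        int32_t sub_sat(int32_t a, int32_t b) {
            int64_t r = (int64_t)a - b;
            if (r < INT32_MIN) return INT32_MIN;
            if (r > INT32_MAX) return INT32_MAX;
            return (int32_t)r;
        }

        /* Definition 4: report an exception condition on overflow. */
        int sub_checked(int32_t a, int32_t b, int32_t *out) {
            int64_t r = (int64_t)a - b;
            if (r < INT32_MIN || r > INT32_MAX) return 0;
            *out = (int32_t)r;
            return 1;
        }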



    Essentially, you and I appear to have a difference over what one should
    see as 'mathematics' but I don't think we disagree over substance.


    Yes.  But then, I am mathematically trained, and know a lot more about
    what it means than most people, who usually think of school-level sums
    and possibly weird stuff involving letters instead of numbers.

    That's intriguing. What do you mean by "weird stuff involving letters
    instead of numbers"?


    That is how many people think of algebra. A lot of people will tell you
    they are fine with basic arithmetic, but get lost once there's an "x" in
    there. (I am not implying that you, or anyone else here, is at that
    limited level!)


    Unfortunately, many programming tutorials encourage programmers to
    simply assume that any value is 'large enough' and so will behave
    according to the normal rules of mathematics. But as everyone here
    knows, that is not always the case, as was shown in the example I saw
    discussed recently of

       255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.


    Engineering is about applying the mathematical (and perhaps physical,
    chemical, etc.) laws to practical situations.  An engineer who does not
    understand that there is a mathematical basis for what they do is in the
    wrong profession.  (I certainly don't mean that they should understand
    the mathematics involved - but they should understand that there /is/
    mathematics involved, and that the mathematics is what justifies the
    rules and calculations they apply.)


    Well, I would say that engineering includes being aware of and
    accommodating limits - including those limits where simple mathematics
    breaks down and no longer applies. YMMV.


    Mathematics doesn't break down. That's the point.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Oct 4 20:55:49 2021
    On 04/10/2021 19:19, David Brown wrote:
    On 04/10/2021 10:39, James Harris wrote:
    On 03/10/2021 23:14, David Brown wrote:
    On 03/10/2021 22:20, James Harris wrote:

    ...

    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the
    non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed
    integers) or closed operations (for C-style unsigned integers), rather
    than infinite sets, but it is all mathematics.

    OK, then how would you define integer computing's

      A - B

    in terms of mathematics?


    Mathematics on standard integers doesn't define 1/0.  Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations.  It doesn't mean it is not mathematical.

    OK, then how would you define integer computing's

      A / B

    in terms of mathematics?

    No need to reply but I'd suggest to you that because of the limits of
    computer fixed representation both of those are much more complex than
    just 'mathematics'!


    I'd agree that they are somewhat complicated by the limited sizes of fixed-size types - but they are still just "mathematics".

    How about:

    1. For a given fixed size of computer integer, "A - B" is defined as the result of normal mathematical integer subtraction as long as that result
    fits within the type.

    (That's basically how C defines it, if you stick to "int" or ignore the promotion stuff.)

    or

    2. "A - B" is defined as the result of normal mathematical integer subtraction, reduced modulo 2^n as necessary to fit in the range of the
    type.

    (That's how "gcc -fwrapv" defines it.)

    or

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    or

    4. "A - B" is defined as either the result of normal integer subtraction
    if that fits within the range of the type, or an exception condition otherwise.

    These are all perfectly reasonable mathematical definitions for
    subtraction. Note that partial functions, such as in definition 1, are
    quite standard in mathematics.

    There are other ways of defining these operations.

    With a word size of 8 bits, and operands A, B and results are of an u8
    type, you can just enumerate all possible results of A + B. Maybe use a function F(A,B) or a table [0..255,0..255]u8 T.

    The results will generally correspond to doing that operation on a real
    ALU, but you could in theory define the operations however you like.
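
    A minimal C sketch of the table-as-definition idea (names invented;
    here the table is filled to match a wrapping 8-bit ALU, but any
    behaviour could be written into it):

        #include <stdint.h>

        static uint8_t add_table[256][256];

        /* Fill the table once; each entry is the chosen result of A + B. */
        static void init_add_table(void) {
            for (int a = 0; a < 256; a++)
                for (int b = 0; b < 256; b++)
                    add_table[a][b] = (uint8_t)(a + b);  /* wraps mod 256 */
        }

        /* "A + B" is then, by definition, a lookup. */
        static uint8_t u8_add(uint8_t a, uint8_t b) {
            return add_table[a][b];
        }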

    See, you don't need maths, unless you're of those who says you need
    mathematics to show that 1+1 is 2.

    Since adding all possible A+B will give results of 0..510, many won't correspond to the same sum done using school arithmetic with
    unrestricted values.

    That's where the definition above comes in. As language designer, you
    call the shots. In practice, you go along with the behaviour of real CPUs.

    Or you can just play the C card and say it's UB in those cases.

    If your language is of a high enough level and/or QoI, then you can
    expend some effort in more idealised behaviour.

    Well, I would say that engineering includes being aware of and
    accommodating limits - including those limits where simple mathematics
    breaks down and no longer applies. YMMV.


    Mathematics doesn't break down. That's the point.

    No? How would you define mathematically this function:

    function f(n) =
        if n = 666 then
            return "ABC"
        else
            return n
        fi
    end

    If you don't like that string, change it to some arbitrary integer. Once
    you've done that, insert:

    system("format c:")

    before that first return. (I won't try this one in case it actually does
    it!)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Mon Oct 4 22:25:35 2021
    On 2021-10-04 21:55, Bart wrote:

    There are other ways of defining these operations.

    With a word size of 8 bits, and operands A, B and results are of an u8
    type, you can just enumerate all possible results of A + B. Maybe use a function F(A,B) or a table [0..255,0..255]u8 T.

    The results will generally correspond to doing that operation on a real
    ALU, but you could in theory define the operations however you like.

    See, you don't need maths, unless you're of those who says you need mathematics to show that 1+1 is 2.

    You must be trolling again, because nobody could be so ignorant.

    Never heard of:

    - multiplication table
    - function table
    - logarithmic tables
    - artillery tables
    - many thousands of books like M. Abramowitz and I. A. Stegun "Handbook
    of Mathematical Functions With Formulas, Graphs, and Mathematical Tables"

    For primary school pupils:

    https://www.theclassroom.com/calculate-final-grade-6372198.html

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Dmitry A. Kazakov on Mon Oct 4 22:38:05 2021
    On 04/10/2021 21:25, Dmitry A. Kazakov wrote:
    On 2021-10-04 21:55, Bart wrote:

    There are other ways of defining these operations.

    With a word size of 8 bits, and operands A, B and results are of an u8
    type, you can just enumerate all possible results of A + B. Maybe use
    a function F(A,B) or a table [0..255,0..255]u8 T.

    The results will generally correspond to doing that operation on a
    real ALU, but you could in theory define the operations however you like.

    See, you don't need maths, unless you're of those who says you need
    mathematics to show that 1+1 is 2.

    You must be trolling again, because nobody could be so ignorant.

    Never heard of:

    - multiplication table
    - function table
    - logarithmic tables
    - artillery tables

    That's basic arithmetic and basic maths that everyone uses (perhaps
    artillery tables not so much).

    I just don't see the need for stuff like this:

    1. There is the set N of natural numbers which is a subset of the set of integer numbers Z. There is nothing special about N that could
    distinguish it from any other subrange of Z, like 100..1000. Implement ranges and be done with that.

    High-falutin' concepts that are irrelevant for the stuff I do, and which
    I've done perfectly well without for the last 40 years.

    If /you/ want to apply them to your own language, that's fine; do what
    you want.

    But don't belittle anyone else's attempts who likes to use a different
    approach and prioritises different aspects.

    So, a u8 type can be used to represent numerical values from 0 to 255;
    what more can be said about it? This is pretty universal across dozens
    of languages.

    What can some fancy mathematical terms (which are mostly likely just
    showing off) bring to the table?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Tue Oct 5 02:05:58 2021
    On 04/10/2021 22:38, Bart wrote:

    That's ....

    <snip>

    Please ignore my post. It was in response to this:

    You must be trolling again, because nobody could be so ignorant.

    Such remarks really wind me up. I don't respond well to insults,
    especially in threads about language design, by people who have
    apparently never done it, and who mainly nitpick others' efforts.

    My newsreader will now ignore that poster.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Oct 5 16:13:29 2021
    On 04/10/2021 21:55, Bart wrote:
    On 04/10/2021 19:19, David Brown wrote:
    On 04/10/2021 10:39, James Harris wrote:
    On 03/10/2021 23:14, David Brown wrote:
    On 03/10/2021 22:20, James Harris wrote:

    ...

    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the
    non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed
    integers) or closed operations (for C-style unsigned integers), rather
    than infinite sets, but it is all mathematics.

    OK, then how would you define integer computing's

       A - B

    in terms of mathematics?


    Mathematics on standard integers doesn't define 1/0.  Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations.  It doesn't mean it is not mathematical.
    OK, then how would you define integer computing's

       A / B

    in terms of mathematics?

    No need to reply but I'd suggest to you that because of the limits of
    computer fixed representation both of those are much more complex than
    just 'mathematics'!


    I'd agree that they are somewhat complicated by the limited sizes of
    fixed-size types - but they are still just "mathematics".

    How about:

    1. For a given fixed size of computer integer, "A - B" is defined as the
    result of normal mathematical integer subtraction as long as that result
    fits within the type.

    (That's basically how C defines it, if you stick to "int" or ignore the
    promotion stuff.)

    or

    2. "A - B" is defined as the result of normal mathematical integer
    subtraction, reduced modulo 2^n as necessary to fit in the range of the
    type.

    (That's how "gcc -fwrapv" defines it.)

    or

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    or

    4. "A - B" is defined as either the result of normal integer subtraction
    if that fits within the range of the type, or an exception condition
    otherwise.

    These are all perfectly reasonable mathematical definitions for
    subtraction.  Note that partial functions, such as in definition 1, are
    quite standard in mathematics.

    There are other ways of defining these operations.


    Indeed there are. I wasn't trying to be complete.

    With a word size of 8 bits, and operands A, B and results are of an u8
    type, you can just enumerate all possible results of A + B. Maybe use a function F(A,B) or a table [0..255,0..255]u8 T.

    The results will generally correspond to doing that operation on a real
    ALU, but you could in theory define the operations however you like.

    You are mixing a definition of the operation (the specification, if you
    prefer) with implementations. The implementation is irrelevant to the definition of the function or operation.

    You could, I suppose, write out all the results of the addition in a
    huge table and call that your specification.


    See, you don't need maths, unless you're of those who says you need mathematics to show that 1+1 is 2.

    Are you trolling, or do you not understand what mathematics is? If you
    are giving precise definitions of the operations and functions
    (including partial functions - that is, leaving the results for some
    inputs as undefined) then it is /maths/. If you define the operation A
    + B on numbers from two sets, based on a complete or partial enumeration
    of all the results, then it is /maths/.

    If I say I want my language to have an operation ¤ that works on two-bit numbers according to the table:

       | 00 | 01 | 10 | 11
    ----------------------
    00 | 00 | 11 | xx | 11
    01 | 10 | 11 | xx | 11
    10 | 11 | xx | xx | 11
    11 | xx | 10 | 00 | 11

    where "xx" means "the function is not defined on these inputs", then
    that is /maths/. It is a mathematical definition. It tells you what combinations of inputs are valid, and how to calculate the results of
    any given valid set of inputs.
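
    That table translates directly into code; a minimal C sketch (UNDEF
    and op_table are invented names, with xx encoded as a sentinel):

        #include <stdint.h>

        #define UNDEF 0xFF   /* "not defined on these inputs" */

        static const uint8_t op_table[4][4] = {
            { 0x0,   0x3,   UNDEF, 0x3 },   /* row 00 */
            { 0x2,   0x3,   UNDEF, 0x3 },   /* row 01 */
            { 0x3,   UNDEF, UNDEF, 0x3 },   /* row 10 */
            { UNDEF, 0x2,   0x0,   0x3 },   /* row 11 */
        };

        /* Returns the result of the operation, or UNDEF for xx. */
        static uint8_t op_lookup(uint8_t a, uint8_t b) {
            return op_table[a & 3][b & 3];
        }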


    Well, I would say that engineering includes being aware of and
    accommodating limits - including those limits where simple mathematics
    breaks down and no longer applies. YMMV.


    Mathematics doesn't break down.  That's the point.

    No? How would you define mathematically this function:

        function f(n) =
            if n = 666 then
                return "ABC"
            else
                return n
            fi
        end


    That in itself is nearly a mathematical definition of the function. It
    is precise and unambiguous, merely missing information about the set of
    input values. (The set of output values is implied by the function
    definition, once the possible inputs are defined.)

    Why would you think it was /not/ mathematics? Do you think "if" is not
    allowed in mathematics? Do you think mathematics is restricted to plus,
    minus, times and divide?

    If you don't like that string, change it to some arbitrary integer. Once you've done that, insert:

       system("format c:")

    before that first return. (I won't try this one in case it actually does
    it!)


    When you are defining something, you don't get to bring in things from
    outside unless they are also defined. So if you are going to make a mathematical specification or definition for your new function, you're
    going to have to include (or refer to) a mathematical specification of
    the function "system", and everything that entails (which is going to be
    a /lot/).

    Could you make such a specification? Yes. Would you? Probably not.

    However, we are not talking about that kind of thing. We are talking
    about functions and operations that /can/ be defined mathematically in a reasonable amount of time and space.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Oct 5 16:47:47 2021
    On 05/10/2021 15:13, David Brown wrote:
    On 04/10/2021 21:55, Bart wrote:

    There are other ways of defining these operations.


    Indeed there are. I wasn't trying to be complete.

    I meant defining your categories.

    With a word size of 8 bits, and operands A, B and results are of an u8
    type, you can just enumerate all possible results of A + B. Maybe use a
    function F(A,B) or a table [0..255,0..255]u8 T.

    The results will generally correspond to doing that operation on a real
    ALU, but you could in theory define the operations however you like.

    You are mixing a definition of the operation (the specification, if you prefer) with implementations. The implementation is irrelevant to the definition of the function or operation.

    You could, I suppose, write out all the results of the addition in a
    huge table and call that your specification.


    See, you don't need maths, unless you're of those who says you need
    mathematics to show that 1+1 is 2.

    Are you trolling,

    Again with the trolling...

    or do you not understand what mathematics is?

    Maybe I don't. I stopped 'getting it' since I started being involved
    with computers.

    If you
    are giving precise definitions of the operations and functions
    (including partial functions - that is, leaving the results for some
    inputs as undefined) then it is /maths/. If you define the operation A
    + B on numbers from two sets, based on a complete or partial enumeration
    of all the results, then it is /maths/.

    If I say I want my language to have an operation ¤ that works on two-bit numbers according to the table:

    | 00 | 01 | 10 | 11
    ----------------------
    00 | 00 | 11 | xx | 11
    01 | 10 | 11 | xx | 11
    10 | 11 | xx | xx | 11
    11 | xx | 10 | 00 | 11

    (Is there supposed to be a pattern here?)

    where "xx" means "the function is not defined on these inputs", then
    that is /maths/. It is a mathematical definition. It tells you what combinations of inputs are valid, and how to calculate the results of
    any given valid set of inputs.

    If you want to call it maths, then fine. (Is there anything that isn't
    maths then?)

    But if there existed a huge table to define the possible values of i64 +
    i64, where overflow wraps as it does on x64 processors, or one where
    overflow is xx as it is in C, that still wouldn't satisfy DAK despite it
    being apparently valid mathematical behaviour.

    There is something else about it that he just doesn't like. But he
    brings his superior knowledge of maths to it in an effort to prove that
    this is wrong behaviour, and the right behaviour can only be what he
    says it is.


    No? How would you define mathematically this function:

        function f(n) =
            if n = 666 then
                return "ABC"
            else
                return n
            fi
        end


    That in itself is nearly a mathematical definition of the function. It
    is precise and unambiguous, merely missing information about the set of
    input values. (The set of output values is implied by the function definition, once the possible inputs are defined.)

    If I run this program, n can be any value that can be represented by the
    types of that dynamic language, which include numeric types, strings,
    lists, dicts etc. "=" is defined between any two types.

    It will only return "ABC" when n is:

    * Integer 666
    * Unsigned integer 666
    * Float 666.0
    * Decimal 666L
    * String "ABC"

    Otherwise it returns n. (There is no overload mechanism for "+", which
    /would/ make it poorly defined.)

    However, this is still too lax for certain people with an irrational
    dislike of dynamic typing, even when you show that these types+values
    can all be considered runtime data of the one variant type.

    Why would you think it was /not/ mathematics? Do you think "if" is not allowed in mathematics?

    Not in the maths I did.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Oct 5 19:35:51 2021
    On 05/10/2021 17:47, Bart wrote:
    On 05/10/2021 15:13, David Brown wrote:
    On 04/10/2021 21:55, Bart wrote:

    There are other ways of defining these operations.


    Indeed there are.  I wasn't trying to be complete.

    I meant defining your categories.

    With a word size of 8 bits, and operands A, B and results are of an u8
    type, you can just enumerate all possible results of A + B. Maybe use a
    function F(A,B) or a table [0..255,0..255]u8 T.

    The results will generally correspond to doing that operation on a real
    ALU, but you could in theory define the operations however you like.

    You are mixing a definition of the operation (the specification, if you
    prefer) with implementations.  The implementation is irrelevant to the
    definition of the function or operation.

    You could, I suppose, write out all the results of the addition in a
    huge table and call that your specification.


    See, you don't need maths, unless you're of those who says you need
    mathematics to show that 1+1 is 2.

    Are you trolling,

    Again with the trolling...

    or do you not understand what mathematics is?

    Maybe I don't. I stopped 'getting it' since I started being involved
    with computers.

    But you still feel qualified to argue that computing is not mathematics,
    or that computer operations are not mathematically defined?


     If you
    are giving precise definitions of the operations and functions
    (including partial functions - that is, leaving the results for some
    inputs as undefined) then it is /maths/.  If you define the operation A
    + B on numbers from two sets, based on a complete or partial enumeration
    of all the results, then it is /maths/.

    If I say I want my language to have an operation ¤ that works on two-bit
    numbers according to the table:

        | 00 | 01 | 10 | 11
    ----------------------
    00 | 00 | 11 | xx | 11
    01 | 10 | 11 | xx | 11
    10 | 11 | xx | xx | 11
    11 | xx | 10 | 00 | 11

    (Is there supposed to be a pattern here?)


    No, no pattern - it is mathematically defined by the table.

    where "xx" means "the function is not defined on these inputs", then
    that is /maths/.  It is a mathematical definition.  It tells you what
    combinations of inputs are valid, and how to calculate the results of
    any given valid set of inputs.

    If you want to call it maths, then fine. (Is there anything that isn't
    maths then?)

    Yes, there is lots that is not maths. For example, anything subjective
    is not maths - how user-friendly a programming language is, or how
    elegant, is not mathematical. But what the language /does/ and the
    semantics of the language should be mathematically definable.


    But if there existed a huge table to define the possible values of i64 +
    i64, where overflow wraps as it does on an x64 processor, or one where
    overflow is xx as it is in C, that still wouldn't satisfy DAK despite it
    being apparently valid mathematical behaviour.


    I will have to let Dmitry answer for himself.

    There is something else about it that he just doesn't like. But he
    brings his superior knowledge of maths to it in an effort to prove that
    this is wrong behaviour, and the right behaviour can only be what he
    says it is.


    Different behaviours of, for example, signed integer arithmetic can have perfectly good mathematical definitions. In that sense it is not
    possible to call wrapping semantics, or undefined overflow semantics,
    "right" or "wrong" - they are both valid, and you can give clear
    mathematical definitions in both cases. You can use these definitions
    to prove things about the arithmetic, such as commutative laws. Some
    things can be proved about one version of the definition but not others
    - with C-style signed arithmetic, you can prove that "x + 1 > x" is
    always true.
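
    Both semantics can be sketched in Python (whose own integers are
    unbounded), here for an 8-bit signed type; the function names are
    invented:

        LO, HI = -128, 127

        def add_wrap(a, b):
            # wrapping semantics: reduce modulo 2**8 back into [-128, 127]
            return (a + b - LO) % 256 + LO

        def add_undef(a, b):
            # C-style semantics: overflow is outside the definition;
            # modelled here as an error rather than silent wrapping
            s = a + b
            if not LO <= s <= HI:
                raise ArithmeticError("signed overflow: undefined")
            return s

        add_wrap(127, 1)   # -> -128, so "x + 1 > x" fails under wrapping
        add_undef(127, 1)  # raises: "x + 1 > x" holds wherever + is defined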


    No? How would you define mathematically this function:

         function f(n) =
             if n = 666 then
                 return "ABC"
             else
                 return n
             fi
         end


    That in itself is nearly a mathematical definition of the function.  It
    is precise and unambiguous, merely missing information about the set of
    input values.  (The set of output values is implied by the function
    definition, once the possible inputs are defined.)

    If I run this program, n can be any value that can be represented by the types of that dynamic language, which include numeric types, strings,
    lists, dicts etc. "=" is defined between any two types.

    It will only return "ABC" when n is:

    * Integer 666
    * Unsigned integer 666
    * Float 666.0
    * Decimal 666L
    * String "ABC"

    Otherwise it returns n. (There is no overload mechanism for "+", which
    /would/ make it poorly defined.)


    That's fine. To be a mathematical definition you need to define the set
    for the inputs, and since you are not using "=" to mean mathematical
    equality, you'd need to define that too. But it is certainly possible
    to do that.

    However, this is still too lax for certain people with an irrational
    dislike of dynamic typing, even when you show that these types+values
    can all be considered runtime data of the one variant type.


    A dislike of dynamic typing is not irrational - nor is a dislike of
    strong static typing. (And you really mean "weak" typing, rather than "dynamic" typing here.) The kind of typing you have in that language
    can be convenient for simple and quick-to-write scripts, but it makes it
    very easy to make mistakes that would be caught immediately by tools in
    a language with stronger typing. Different balances have their pros and
    cons, and their use-cases where they excel or fail.

    It's okay to prefer different kinds of language - either as a general preference, or for particular tasks. I personally prefer stronger
    static typing for embedded development. (I want stronger than C or C++
    has by the standards and default types, and use gcc's warnings to
    enforce stronger typing in some situations.) I am happy with dynamic
    typing for a lot of my PC programming, where it is often relatively
    small programs (thus I use Python, which has strong dynamic typing - I'm
    less keen on the weaker dynamic typing of, say, PHP).

    I don't think a dislike of any one system is necessarily irrational - if
    you can justify it, at least in terms of the coding you do, then the
    dislike is not irrational.

    Why would you think it was /not/ mathematics?  Do you think "if" is not
    allowed in mathematics?

    Not in the maths I did.


    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

    𝛿(x) = ⎧ +∞, if x = 0
           ⎨
           ⎩ 0, otherwise

    or

    abs(x) = ⎧ x, if x >= 0
             ⎨
             ⎩ -x, if x < 0

    ?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Tue Oct 5 20:13:45 2021
    On 05/10/2021 18:35, David Brown wrote:

    ...

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

    𝛿(x) = ⎧ +∞, if x = 0
           ⎨
           ⎩ 0, otherwise

    or

    abs(x) = ⎧ x, if x >= 0
             ⎨
             ⎩ -x, if x < 0

    ?

    I haven't read the messages which led to this one, yet, but this one
    caught my eye due to the graphics - the delta symbol and the clever
    extended bracing. Very impressive!

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.
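
    That failure is easy to reproduce by simulating n-bit two's complement
    negation in Python (a sketch; the helper name is made up):

        def neg_wrap(x, bits=8):
            # two's complement negation, wrapped into the n-bit range
            m = 1 << bits
            return ((-x + m // 2) % m) - m // 2

        neg_wrap(-128)  # -> -128: negating the most negative 8-bit value
                        # wraps, so an abs() built on it stays negative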


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Tue Oct 5 20:36:40 2021
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point
    model with exact addition and multiplication and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

       A + B + C

    Even in such a simple expression if overflow of intermediate results
    is to be detected then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point. Computers do
    not implement the normal mathematical model for integers, but a subset
    and a superset thereof, even for something as simple as addition of
    integers.
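
    A short Python sketch of the non-associativity, for an 8-bit signed
    range with overflow detected (the helper is hypothetical):

        LO, HI = -128, 127

        def checked_add(a, b):
            s = a + b
            if not LO <= s <= HI:
                raise OverflowError(f"{a} + {b} is out of range")
            return s

        checked_add(100, checked_add(100, -100))  # -> 100, no overflow
        checked_add(checked_add(100, 100), -100)  # raises: 100 + 100 = 200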


    [ Equivalent modification of computing algorithms to keep the model
    adequate is yet another issue, and also mathematics. ]



    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Tue Oct 5 20:45:14 2021
    On 05/10/2021 20:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point
    model with exact addition and multiplication and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

       A + B + C

    Even in such a simple expression if overflow of intermediate results
    is to be detected then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer
    arithmetic is associative.

    Then the model is inadequate - and that's partly my point. Computers do
    not implement the normal mathematical model for integers, but a subset
    and a superset thereof, even for something as simple as addition of
    integers.

    Well, mathematics cheats a little. Because A + B in maths is just:

    A + B

    It usually doesn't need to evaluate it, so overflow is irrelevant!

    But suppose, given some concrete values for A and B, it DID need to
    evaluate A + B into a concrete result.

    Who or what does that calculation: a human? a machine? Whatever actual
    physical thing is used may well have the same limitations as a
    programming language.

    To me that's the key difference between maths equations that just look
    pretty, and code that is actually executed. In the latter you have to
    make some compromises.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to James Harris on Tue Oct 5 20:30:09 2021
    On 05/10/2021 20:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    or

         abs(x) = ⎧ x, if x >= 0
                  ⎨
                  ⎩ -x, if x < 0

    ?

    I haven't read the messages which led to this one, yet, but this one
    caught my eye due to the graphics - the delta symbol and the clever
    extended bracing. Very impressive!

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.

    The -x operation would just be an overflow, and actually it would result
    in x unchanged.

    This can be documented behaviour, or it can restrict the allowable
    values of x.

    What else can it do? Runtime checking?

    On my machine, it means that:

    abs(-9223372036854775808) = -9223372036854775808

    It seems to me that not being able to represent:

    -9223372036854775809

    at all is a bigger deal.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Tue Oct 5 22:01:56 2021
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point
    model with exact addition and multiplication and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

       A + B + C

    Even in such a simple expression if overflow of intermediate results
    is to be detected then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer
    arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it falsely. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but work perfectly well for building bridges.

    Do not overflow your numbers, OK?

    Computers do
    not implement the normal mathematical model for integers,

    They always do, within well-defined constraints and tolerances.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Tue Oct 5 20:55:03 2021
    On 04/10/2021 19:19, David Brown wrote:
    On 04/10/2021 10:39, James Harris wrote:
    On 03/10/2021 23:14, David Brown wrote:
    On 03/10/2021 22:20, James Harris wrote:

    ...

    1. Integer arithmetic where all values - including intermediate results
    - remain in range for the data type. In this, the computer implements
    normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the
    non-compliant value. As you say, there are various options but they have
    to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed
    integers) or closed operations (for C-style unsigned integers), rather
    than infinite sets, but it is all mathematics.

    OK, then how would you define integer computing's

      A - B

    in terms of mathematics?


    Mathematics on standard integers doesn't define 1/0.  Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations.  It doesn't mean it is not mathematical.

    OK, then how would you define integer computing's

      A / B

    in terms of mathematics?

    No need to reply but I'd suggest to you that because of the limits of
    computer fixed representation both of those are much more complex than
    just 'mathematics'!


    I'd agree that they are somewhat complicated by the limited sizes of fixed-size types - but they are still just "mathematics".

    How about:


    These are good but I would dispute that they are mathematical! Comments
    below.


    1. For a given fixed size of computer integer, "A - B" is defined as the result of normal mathematical integer subtraction as long as that result
    fits within the type.

    (That's basically how C defines it, if you stick to "int" or ignore the promotion stuff.)

    As you say, that's only partial. Fine for a limited domain, though the
    domain would be hard to specify; and it's incomplete due to the limited
    domain.


    or

    2. "A - B" is defined as the result of normal mathematical integer subtraction, reduced modulo 2^n as necessary to fit in the range of the
    type.

    (That's how "gcc -fwrapv" defines it.)

    No mathematics that I am aware of has the concept of "to fit in the
    range of the type" but maybe you know different.



    or

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    or

    4. "A - B" is defined as either the result of normal integer subtraction
    if that fits within the range of the type, or an exception condition otherwise.

    Again, "an exception condition" is surely not mathematics.

    I put it to you that you are thinking like an engineer and producing definitions which are suitable for engineering. That's the right thing
    to do, IMO, but it should be called engineering not mathematics.
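
    For concreteness, all four definitions can be sketched in Python for
    an 8-bit type, reading mid(a, x, b) as, presumably, the middle of its
    three arguments, i.e. x clamped into [a, b] (the function names are
    invented):

        INT_MIN, INT_MAX = -128, 127

        def sub_partial(a, b):     # 1. defined only when the result fits
            d = a - b
            if not INT_MIN <= d <= INT_MAX:
                raise ValueError("undefined")
            return d

        def sub_wrap(a, b):        # 2. reduced modulo 2**8 (gcc -fwrapv)
            return (a - b - INT_MIN) % 256 + INT_MIN

        def sub_saturate(a, b):    # 3. mid(int_min, a - b, int_max)
            return max(INT_MIN, min(a - b, INT_MAX))

        def sub_checked(a, b):     # 4. exception condition on overflow
            d = a - b
            if not INT_MIN <= d <= INT_MAX:
                raise OverflowError
            return d

    Note that 1 and 4 collapse to the same code; the difference lies in
    what the definition promises (undefined versus a guaranteed exception).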

    ...


    Unfortunately, many programming tutorials encourage programmers to
    simply assume that any value is 'large enough' and so will behave
    according to the normal rules of mathematics. But as everyone here
    knows, that is not always the case, as was shown in the example I saw
    discussed recently of

       255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.


    Engineering is about applying the mathematical (and perhaps physical,
    chemical, etc.) laws to practical situations.  An engineer who does not
    understand that there is a mathematical basis for what they do is in the
    wrong profession.  (I certainly don't mean that they should understand
    the mathematics involved - but they should understand that there /is/
    mathematics involved, and that the mathematics is what justifies the
    rules and calculations they apply.)


    Well, I would say that engineering includes being aware of and
    accommodating limits - including those limits where simple mathematics
    breaks down and no longer applies. YMMV.


    Mathematics doesn't break down. That's the point.


    I said "simple mathematics" breaks down.

    In fact, it breaks down so much that even something as simple as plain subtraction is better described by an algorithm.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Tue Oct 5 22:02:14 2021
    On 2021-10-05 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    or

         abs(x) = ⎧ x, if x >= 0
                  ⎨
                  ⎩ -x, if x < 0

    ?

    [...]

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.

    You do not understand the difference between definition and
    implementation? Definitions never fail.

    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.

    Of course it will. You can easily verify it in any reasonable language.
    All of them follow the implementation principle: the result is either mathematically correct (within the model constraints and tolerance) or
    else some exceptional action happens, which could be an exception or
    some ideal non-numeric value like NaN.

    For purely educational purposes, I suggest reading the classic book

    "Computer Methods for Mathematical Computations"

    by George Elmer Forsythe, Michael A. Malcolm and Cleve B. Moler. I have
    no hope in Bart, but it could open your eyes.

    The book begins with an instructive example of finding roots of a
    quadratic equation using the school formula. It shows how awful such an implementation would be and how to fix that.
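
    For readers without the book to hand, the effect is easy to reproduce;
    a Python sketch, with values chosen so that b*b >> 4*a*c:

        import math

        a, b, c = 1.0, 1e8, 1.0      # true roots: ~ -1e8 and ~ -1e-8

        d = math.sqrt(b*b - 4*a*c)
        naive = (-b + d) / (2*a)     # school formula: -b + d cancels
                                     # catastrophically, giving ~ -7.45e-09

        x1 = (-b - d) / (2*a)        # the well-conditioned root, ~ -1e8
        fixed = c / (a * x1)         # other root via x1*x2 = c/a, ~ -1e-08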

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Tue Oct 5 21:49:30 2021
    On 05/10/2021 21:02, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    or

         abs(x) = ⎧ x, if x >= 0
                  ⎨
                  ⎩ -x, if x < 0

    ?

    [...]

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it
    were using 2's complement representation and x was the most negative
    number.

    You do not understand the difference between definition and implementation?

    I understand the difference very well. Better than you, it seems. ;-)


    Definitions never fail.

    How would /you/ mathematically define abs() on integers?



    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.

    Of course it will. You can easily verify it in any reasonable language.
    All of them follow the implementation principle: the result is either mathematically correct (within the model constraints and tolerance) or
    else some exceptional action happens, which could be an exception or
    some ideal non-numeric value like NaN.

    For purely educational purposes, I suggest reading the classic book

       "Computer Methods for Mathematical Computations"

    by George Elmer Forsythe, Michael A. Malcolm and Cleve B. Moler. I have
    no hope in Bart, but it could open your eyes.

    The book begins with an instructive example of finding roots of a
    quadratic equation using the school formula. It shows how awful such an implementation would be and how to fix that.


    Thanks for the recommendation. I have one 'borrowed' from archive.org
    for a limited time (one hour, I think) and am looking at it now. It
    appears to be the old-fashioned kind of book I like.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Oct 5 22:16:09 2021
    On 2021-10-05 21:45, Bart wrote:
    On 05/10/2021 20:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point
    model with exact addition and multiplication and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

       A + B + C

    Even in such a simple expression if overflow of intermediate results
    is to be detected then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer
    arithmetic is associative.

    Then the model is inadequate - and that's party my point. Computers do
    not implement the normal mathematical model for integers, but a subset
    and a superset thereof, even for something as simple as addition of
    integers.

    Well, mathematics cheats a little. Because A + B in maths is just:

       A + B

    It usually doesn't need to evaluate it, so overflow is irrelevant!

    Yet another gap in education. Begin by reading up on constructive
    mathematics (AKA constructivism).

    But suppose, given some concrete values for A and B, it DID need to
    evaluate A + B into a concrete result.

    Who or what does that calculation: a human? a machine? Whatever actual
    physical thing is used may well have the same limitations as a
    programming language.

    Ah, a fellow solipsist here! (:-))

    There is an old philosophic joke on the subject (there exist many
    variants of it).

    A curious mind - A tree in the forest, does it exist when nobody
    watches?

    God - I watch it!

    The number Pi exists even when incomputable. One can say that God
    computed Pi during the seven days of creation. Now read up on
    constructivism again.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Tue Oct 5 23:20:53 2021
    On 2021-10-05 22:49, James Harris wrote:
    On 05/10/2021 21:02, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    or

         abs(x) = ⎧ x, if x >= 0
                  ⎨
                  ⎩ -x, if x < 0

    ?

    [...]

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it
    were using 2's complement representation and x was the most negative
    number.

    You do not understand difference between definition and implementation?

    I understand the difference very well. Better than you, it seems. ;-)

    Definitions never fail.

    How would /you/ mathematically define abs() on integers?

    Exactly as David did. With quantifiers added:

    ∀x∊Z ∃|x|∊Z: (|x|=x ∧ x>=0) ∨ (|x|=-x ∧ x<0)

    Quantifiers are an important part, as they could be assumed wrongly,
    just as you did.

    Now, a *model* of Z as a closed range l,h∊Z

    [l,h] = {x | x∊Z ∧ x>=l ∧ x<=h}

    defines abs:[l,h]->{[l,h], Constraint_Error} as follows:

    ∀x∊[l,h]: (abs(x)=|x| ∧ |x|∊[l,h]) ∨ (abs(x)=Constraint_Error ∧ |x|∉[l,h])

    Note that, unlike |x|, the result of abs(x) is not always numeric.
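
    A sketch of that model in Python, with Constraint_Error rendered as an
    exception (the Ada-flavoured name is taken from the formula above):

        class ConstraintError(Exception):
            pass

        L, H = -128, 127             # the modelled range [l, h]

        def model_abs(x):
            r = abs(x)               # the ideal |x| on unbounded integers
            if not L <= r <= H:
                raise ConstraintError  # |x| falls outside the model
            return r

        model_abs(-127)  # -> 127
        model_abs(-128)  # raises ConstraintError: |-128| = 128 > h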

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Tue Oct 5 23:21:38 2021
    On 05/10/2021 20:45, Bart wrote:
    Well, mathematics cheats a little. Because A + B in maths is just:
       A + B
    It usually doesn't need to evaluate it, so overflow is irrelevant!
    But suppose, given some concrete values for A and B, it DID need to
    evaluate A + B into a concrete result.

    Others have partially addressed this point. But there are
    several other partial answers:

    (a) This [suitably generalised] is the entire purpose of the
    branch of mathematics called "numerical analysis"*. We
    needed to get concrete results long before we had computing
    machines. Dmitry alludes to the problem of solving
    quadratic equations, and the fact that the "exact" formula
    is typically a rotten way to evaluate the roots. When I
    used to teach this stuff, I gave dozens of similar examples.
    Be warned that old books on NA are very outdated; there is
    scarcely any algorithm in practical mathematics that has
    survived the computer revolution unscathed.

    (b) There are two new [FSVO!] branches of mathematics that
    address this area from a different perspective from NA;
    viz symbolic algebra* and the theory of computability. Both
    of these are a symbiosis of maths and CS; and you need to
    understand both subjects pretty well to make good use of
    either of these branches.

    (c) Although "traditional" maths is "static", there are other
    bits of maths that are more "dynamic". Again, Dmitry has
    already mentioned "constructivism". I would add the
    "surreal" numbers as an interesting example, esp given
    the connexion with [combinatorial] games.

    Who or what does that calculation: a human? a machine? Whatever
    actual physical thing is used may well have the same limitations as
    a programming language.
    To me that's the key difference between maths equations that just
    look pretty, and code that is actually executed. In the latter you
    have to make some compromises.

    Maths and code are different things. Maths that just looks
    pretty is not thereby good or useful maths [and the same applies to
    code, of course]. Incidentally, in applied maths you again commonly
    have to make compromises. In my astrophysics days, I used regularly
    to come across [single] equations that occupied more than a page-and-
    a-half. Before symbolic algebra packages, it was infeasible to deal
    directly with such equations, so the next step was always to say "OK,
    /this/ term is negligible compared with /that/ term, and if we assume rotational symmetry then ...", or similar.

    ____
    * Not directly relevant, but hereabouts I usually recommend "The
    SIAM 100-Digit Challenge", now getting somewhat dated but still
    extremely readable. This book describes ten problems, and the
    task of solving each to 10sf [thus 100 digits in total]. The
    book shows the incredible power of modern computers in solving
    real, practical problems accurately, esp as many competitors
    didn't stop at 10sf, but went on to solve these problems to
    thousands of sf with the aid of Maple or Matlab or other NA/
    symbolic package.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Vivaldi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Oct 6 00:04:18 2021
    On 05/10/2021 18:35, David Brown wrote:
    On 05/10/2021 17:47, Bart wrote:
    On 05/10/2021 15:13, David Brown wrote:

    or do you not understand what mathematics is?

    Maybe I don't. I stopped 'getting it' since I started being involved
    with computers.

    But you still feel qualified to argue that computing is not mathematics,
    or that computer operations are not mathematically defined?


    I don't know why everyone seems determined to question my credentials.

    I do my stuff without using maths, or at least without knowingly using
    it; I don't care so long as I get results.

    To me it just seems obvious and intuitive.

    But FWIW, I didn't study it beyond 'A' level pure maths at school
    (though getting a top grade in it), then I decided to do a CS degree
    rather than pursue mathematics further.

    I then forgot most of it, except for the subjects I needed, for which I
    had to go out and re-purchase the textbooks I'd thrown away, in order to
    get the necessary formulae.

    I did spend rather a lot of time in the early 80s programming 3D
    floating point graphics, on machines with no floating point, not even
    integer multiply and divide, for which I had to write emulation code
    (yeah, those Taylor series or whatever it was proved useful after all,
    but you also needed some ingenuity).

    I don't really care about anyone else's background; however, can I just
    ask: how many here having a go at me for my irreverent approach to
    mathematics have actually coded anything like arbitrary precision
    floating point, and have incorporated it into a language?


    Different behaviours of, for example, signed integer arithmetic can have perfectly good mathematical definitions. In that sense it is not
    possible to call wrapping semantics, or undefined overflow semantics,
    "right" or "wrong" - they are both valid, and you can give clear
    mathematical definitions in both cases. You can use these definitions
    to prove things about the arithmetic, such as commutative laws. Some
    things can be proved about one version of the definition but not others
    - with C-style signed arithmetic, you can prove that "x + 1 > x" is
    always true.

    The arbitrary precision library I mentioned above is limited only by
    memory space and runtime.

    Which actually makes it harder to define or predict behaviour since it
    depends on environmental factors - and the user's patience.

    But for the sorts of, by comparison, minuscule values represented by i64
    and u64, those would never be a problem.

    So, I provide a choice: use efficient types if you know they will not
    overflow, or use a big-number type.

    A dislike of dynamic typing is not irrational - nor is a dislike of
    strong static typing. (And you really mean "weak" typing, rather than "dynamic" typing here.)

    It's usually strong. It's weak for equal/not-equal, but Python has that
    same behaviour: you can compare any two objects; the result will be
    false if not compatible.

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

    𝛿(x) = ⎧ +∞, if x = 0
           ⎨
           ⎩ 0, otherwise

    That looks like a useful function!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Wed Oct 6 08:20:51 2021
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    or

         abs(x) = ⎧ x, if x >= 0
                  ⎨
                  ⎩ -x, if x < 0

    ?

    I haven't read the messages which led to this one, yet, but this one
    caught my eye due to the graphics - the delta symbol and the clever
    extended bracing. Very impressive!

    They are standard Unicode symbols. But they do let you make slightly
    nicer posts.


    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common mathematics using real numbers - they cannot be implemented directly in computer code. If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers). It is, however, still
    maths. Two possibilities for n-bit two's complement signed integers
    could be :

    abs(x) = ⎧ x, if x >= 0
             ⎨ -x, if x < 0 and x > int_min
             ⎩ int_min, if x = int_min

    or

    abs(x) = ⎧ x, if x >= 0
             ⎨ -x, if x < 0 and x > int_min
             ⎩ undefined, if x = int_min

    Both are good, solid mathematical definitions - and both can be
    implemented. They have slightly different characteristics, each with
    their pros and cons.
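
    The two definitions, sketched in Python for n = 8 (the names are
    invented):

        INT_MIN, INT_MAX = -128, 127

        def abs_pinned(x):
            # first definition: abs(int_min) is defined as int_min itself
            if x >= 0:
                return x
            return -x if x > INT_MIN else INT_MIN

        def abs_partial(x):
            # second definition: abs(int_min) is simply not defined,
            # modelled here as an error
            if x >= 0:
                return x
            if x > INT_MIN:
                return -x
            raise ValueError("abs(int_min) is undefined")

        abs_pinned(-128)   # -> -128, matching what wrapping hardware gives
        abs_partial(-128)  # raises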

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Wed Oct 6 09:14:53 2021
    On 2021-10-06 01:04, Bart wrote:

    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    That looks like a useful function!

    Yes, it is the first lesson in practically any engineering course:
    Laplace transforms, Z-transforms, etc.

    I still remember how it was taught in engineering while in a parallel
    course on mathematical analysis the lector explained that Dirac's
    function is bad mathematics (infinities etc) and how to do the same in a
    clean way.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Wed Oct 6 11:32:06 2021
    On 05/10/2021 23:21, Andy Walker wrote:
    On 05/10/2021 20:45, Bart wrote:
    Well, mathematics cheats a little. Because A + B in maths is just:
        A + B
    It usually doesn't need to evaluate it, so overflow is irrelevant!
    But suppose, given some concrete values for A and B, it DID need to
    evaluate A + B into a concrete result.

        Others have partially addressed this point.  But there are
    several other partial answers:

      (a) This [suitably generalised] is the entire purpose of the
          branch of mathematics called "numerical analysis"*.  We
          needed to get concrete results long before we had computing
          machines.

    My approach to solving problems would be computation or trial and error
    rather than doing things analytically, for which I just don't have the
    ability.

    I sometimes like to solve puzzles, but I'm not good enough to do it
    manually, and don't have the maths skills to do it that way.

    So I use brute force, with a computer program, if I think it is
    practical in a reasonable time.

    Here's an example of a puzzle where you have to fit different pieces
    into an outlined grid; this shows one solution:

    https://github.com/sal55/langs/blob/master/delta.png

    (I've made the pieces different colours.)

    Is what I did to solve this (aside from designing and implementing the
    language used) maths? Not in my view, as I was trying to avoid using it.
    But apparently it was.
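
    The brute force itself is routine backtracking. A generic Python
    sketch of the idea (not Bart's program; rotations and reflections of
    the pieces are ignored for brevity):

        def solve(free_cells, pieces, placed=()):
            # free_cells: set of (row, col) still empty in the outline;
            # pieces: tuple of shapes, each a frozenset of (dr, dc) offsets
            if not pieces:
                return placed                    # every piece fitted
            piece, rest = pieces[0], pieces[1:]
            for (r, c) in sorted(free_cells):    # try each anchor position
                cells = {(r + dr, c + dc) for (dr, dc) in piece}
                if cells <= free_cells:          # stays inside the outline
                    sol = solve(free_cells - cells, rest,
                                placed + (frozenset(cells),))
                    if sol is not None:
                        return sol
            return None                          # dead end: backtrack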

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Oct 6 12:08:52 2021
    On 06/10/2021 07:20, David Brown wrote:

    These were examples of function definitions with conditionals in common mathematics using real numbers - they cannot be implemented directly in computer code. If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers). It is, however, still
    maths. Two possibilities for n-bit two's complement signed integers
    could be :

    abs(x) = ⎧ x, if x >= 0
             ⎨ -x, if x < 0 and x > int_min
             ⎩ int_min, if x = int_min

    or

    abs(x) = ⎧ x, if x >= 0
             ⎨ -x, if x < 0 and x > int_min
             ⎩ undefined, if x = int_min

    Both are good, solid mathematical definitions - and both can be
    implemented. They have slightly different characteristics, each with
    their pros and cons.

    These are somewhat unsatisfactory. I guess you only have one actual
    definition of abs()?

    In practice, it would be different for each different type of x. For
    example, the representation might be twos complement, or it might be
    signed magnitude [as used in floats].

    Further, there might be different sizes of int, so different values of
    int_min.

    Also, you might need to consider putting the check for int_min first, or
    at least second, depending on whether problems are anticipated with
    doing 'x > int_min' when x is negative.

    There is also a question over exactly what 'undefined' means: would it
    require abs() to return a sum-type now rather than an int? If so, abs()
    might need such a type as input too: abs(abs(x)).

    So, the reality is a bit more involved. But it depends on what the
    purpose of your mathematical definitions is: is it just to 'look
    pretty'; or is it informal user documentation; or would it actually be
    input to some compiler generator?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Wed Oct 6 12:53:54 2021
    On 2021-10-06 12:32, Bart wrote:
    On 05/10/2021 23:21, Andy Walker wrote:
    On 05/10/2021 20:45, Bart wrote:
    Well, mathematics cheats a little. Because A + B in maths is just:
        A + B
    It usually doesn't need to evaluate it, so overflow is irrelevant!
    But suppose, given some concrete values for A and B, it DID need to
    evaluate A + B into a concrete result.

         Others have partially addressed this point.  But there are
    several other partial answers:

       (a) This [suitably generalised] is the entire purpose of the
           branch of mathematics called "numerical analysis"*.  We
           needed to get concrete results long before we had computing
           machines.

    My approach to solving problems would be computation or trial and error rather than doing things analytically,

    https://www.mathscareers.org.uk/the-rice-and-chessboard-legend/

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Wed Oct 6 14:46:36 2021
    On 06/10/2021 11:32, Bart wrote:
    I sometimes like to solve puzzles, but I'm not good enough to do it
    manually, and don't have the maths skills to do it that way.

    The maths skills required to solve puzzles are almost never
    very advanced. I find it hard to believe that anyone with A-level
    maths [final year of secondary education, for non-UK readers] would
    find the problems usually described as "puzzles" at all difficult
    in terms of the skills needed. I worked my way through most of the
    books by [eg] Dudeney, Loyd and Gardner long before A-level.

    So I use brute force, with a computer program, if I think it is
    practical in a reasonable time.

    Sure. So do we all [FSVO "we"]. I keep my hand in by
    tackling the "Project Euler" problems:

    https://projecteuler.net/archives

    The problems range from trivial to extremely difficult, and from
    really interesting to bafflingly boring, but there are over 700 to
    choose from. I set myself the extra task that the
    program had to compile and run to completion in less than a second;
    not always achieved, and perhaps not always achievable, but I'm
    often surprised by how much can be done in a second. Part of the
    attraction is the balance between brute force and analysis.

    Here's an example of a puzzle [...].
    Is what I did to solve this (aside from designing and implementing
    the language used) maths? Not in my view, as I was trying to avoid
    using it. But apparently it was.

    You seem to think that there is some rigid dividing line
    between "maths" and "not-maths". Not so. It's recursive; so
    mathematics is what mathematicians do, and mathematicians are
    people who do mathematics. It unwinds in different ways depending
    on who [or what] you take as axioms; mathematicians don't agree
    on where the boundaries are; and there is no reason at all why
    some particular topic can't be maths /and/ CS /and/ engineering
    /and/ physics [for example]. As someone who has switched several
    times in my career [maths -> astronomy -> physics -> CS -> maths,
    in rather broad terms], I'm acutely aware that most of the things
    that interest me are not so easily classified.

    This was brought home to me in one of the first examiners'
    meetings I attended. A student doing "information theory" did a
    project on "Norman Church Architecture". By common consent, it
    was a very good piece of work, fully worth a very high mark as
    a project. But was it maths? She had classified windows and
    other architectural features and used that to classify churches
    by period, extracting information from the classification. The
    department split pretty-much down the middle as to whether this
    could count towards a maths degree. The external examiners were
    split, the professors were split, more junior lecturers were
    split. We couldn't decide, and almost came to blows. In the
    end, we took the view that it would be unfair to penalise the
    student; if what she had done was unacceptable, then her
    supervisor [who happened to be head of department] should have
    told her rather than letting her continue to submission. So
    "Norman Church Architecture" was deemed "officially" part of
    mathematics. For decades afterwards, this case was brought up
    at meetings where people were deciding what modules should be
    offered to our students, long after that HoD had retired.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Dussek

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Wed Oct 6 15:51:32 2021
    On 04/10/2021 10:50, David Brown wrote:
    On 04/10/2021 01:58, Bart wrote:
    On 03/10/2021 23:05, David Brown wrote:
    On 03/10/2021 20:27, Bart wrote:

    Processors are designed to do many things.  Exactly duplicating standard
    mathematical integers is not one of those things.  Being usable to model
    a limited version of those integers - following somewhat different
    mathematical rules and definitions - /is/ one of those things.

    No, it's just arithmetic with a limited number of digits. And usually in
    binary.

    It's engineering, not maths. But of course you can apply maths to anything.

    Engineering /is/ applied maths!

    /includes/

    !

    Engineering is a lot of things: science, materials, chemistry,
    mathematics, biology, etc.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Wed Oct 6 15:56:54 2021
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does not
    generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more showing. There is the fixed-point
    model with exact addition and multiplication and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

       A + B + C

    Even in such a simple expression if overflow of intermediate results
    is to be detected then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer
    arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it falsely. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but work perfectly well for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the
    potential thereof).


    Computers do not implement the normal mathematical model for integers,

    They always do, within well-defined constraints and tolerances.


    Your having to qualify that illustrates what I have been saying.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Wed Oct 6 17:10:19 2021
    On 06/10/2021 01:04, Bart wrote:
    On 05/10/2021 18:35, David Brown wrote:
    On 05/10/2021 17:47, Bart wrote:
    On 05/10/2021 15:13, David Brown wrote:

    or do you not understand what mathematics is?

    Maybe I don't. I stopped 'getting it' since I started being involved
    with computers.

    But you still feel qualified to argue that computing is not mathematics,
    or that computer operations are not mathematically defined?


    I don't know why everyone seems determined to question my credentials.

    I do my stuff without using maths, or at least without knowingly using
    it; I don't care so long as I get results.


    That's okay for some things. It is not acceptable for others. Code
    does not become correct by passing some tests - but it might be good
    enough in some circumstances. (At the other end of the scale, you have development processes where you mathematically and logically prove the correctness of your code. Usually that is too much - few programming
    projects have the time or money budgets for that level.)

    I am not questioning your credentials in general - I know you have long experience and success in your programming. I am questioning your
    credentials in this one particular point. You have been offering your
    opinion on mathematics and how it relates to computer programming - I am
    trying to understand if that is an /informed/ opinion, based on
    knowledge, research, understanding and experience, or merely "gut
    feeling" about something you know little about. It seems to be the
    latter here, which is a bit disappointing.

    People usually go through most of their lives not caring about things
    they don't understand and which don't seem to affect them. That's fine,
    of course - you'd never get anywhere if you tried to understand
    /everything/. You don't need to know the lifecycle of coffee beans,
    from growth, harvesting, transport, distribution, sales, etc., to enjoy
    a cup of coffee. But you /do/ need to understand some of it if you want
    anyone to take seriously your opinion that coffee costs ten times more
    than it should.


    To me it just seems obvious and intuitive.

    But FWIW, I didn't study it beyond 'A' level pure maths at school
    (though getting a top grade in it), then I decided to do a CS degree
    rather than pursue mathematics further.


    Fair enough.

    I then forgot most of it, except for the subjects I needed, for which I
    had to go out and re-purchase the textbooks I'd thrown away, in order to
    get the necessary formulae.

    I did spend rather a lot of time in the early 80s programming 3D
    floating point graphics, on machines with no floating point, not even
    integer multiply and divide, for which I had to write emulation code
    (yeah, those Taylor series or whatever it was proved useful after all,
    but you also needed some ingenuity).


    These kinds of numerical methods are useful - indeed, they are some of
    the most useful kinds of mathematics in a lot of programming. But that
    is all applied mathematics - it is using mathematical results in
    everyday life (well, everyday life for a small number of people!).

    When you use a Taylor series (yes, I know this is not the only way, or
    the best way, but that's irrelevant to the discussion) to calculate
    "sin(x)", how do you know the results are correct? My guess is that you checked a few values compared to what you got on your calculator or
    another program. Often that's absolutely fine. But if you want to be
    /sure/ that your results will be correct, or to understand the
    inaccuracies over different ranges, then you need to understand more
    about the underlying mathematics. You need to know how Taylor series
    work. That builds on calculus. How do you know calculus works? That
    builds on real or complex analysis. How do you know that works? It
    builds on set theory, algebra, and so on. There's a chain all the way
    down - you choose how far you want to go, and when you want to accept
    that others have done work that you can copy, or that testing is good
    enough. (You never try to go /all/ the way down - that way madness lies.)

    If you just take your Taylor series and use it, then often that is good
    enough. You can say "it works well enough for my needs". But you
    /can't/ say "this is as accurate is you can get", or "this is an
    efficient implementation", or "you can rely on this code regardless of
    the input value". And the bits you /can/ say, are reasonable because it
    all builds on top of provable mathematics.
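
    As a concrete instance, a Taylor-series sine is only a few lines of
    Python (a sketch: each term is derived from the previous one, and
    without range reduction the accuracy degrades quickly for large |x|):

        def taylor_sin(x, terms=20):
            # sin(x) = x - x**3/3! + x**5/5! - ...
            term, total = x, x
            for k in range(1, terms):
                term *= -x * x / ((2*k) * (2*k + 1))
                total += term
            return total

        taylor_sin(1.0)   # -> 0.8414709848078965, matching math.sin(1.0)
        taylor_sin(40.0)  # far off unless 40.0 is first reduced mod 2*pi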

    I don't really care about anyone else's background, however, can I just
    ask: how many here having a go at me for my irreverent approach to
    mathematics have actually coded anything like arbitrary precision
    floating point, and have incorporated it into a language?


    It's not something I have ever had cause to do - and while it could be interesting, there are too many other interesting things in life. One
    thing that is not interesting, however, is a pissing competition about
    who has done what.

    The mathematics of making a straightforward arbitrary precision library,
    floating point or integer, is not particularly advanced - it's just
    simple arithmetic. Making a nice interface is hard. Making a system
    where it is easy to get right, hard to get wrong, and you don't end up
    with memory leaks is hard. Making efficient implementations is hard.
    Making algorithms that scale well is hard. Doing long multiplication
    and multi-digit addition is primary school arithmetic - you just have to realise that, and not be scared by the big numbers.

    So an arbitrary precision floating point library is a significant
    achievement - but /not/ because of the mathematics.

    If you have included error analysis and correctness proofs, then the
    maths gets hard. If you have included FFT multiplication algorithms,
    the maths gets hard. If you have included partitioning to support
    parallel implementations - with correctness proofs, of course - the
    maths gets hard.


    None of that is particularly related to the kind of mathematics that was
    under discussion here.


    Different behaviours of, for example, signed integer arithmetic can have
    perfectly good mathematical definitions.  In that sense it is not
    possible to call wrapping semantics, or undefined overflow semantics,
    "right" or "wrong" - they are both valid, and you can give clear
    mathematical definitions in both cases.  You can use these definitions
    to prove things about the arithmetic, such as commutative laws.  Some
    things can be proved about one version of the definition but not others
    - with C-style signed arithmetic, you can prove that "x + 1 > x" is
    always true.

    The arbitrary precision library I mentioned above is limited only by
    memory space and runtime.

    Which actually makes it harder to define or predict behaviour since it depends on environmental factors - and the user's patience.


    A system that is put together by "intuition" and tested to see that it
    appears to work, can be practically useful for many purposes. To be
    something that people can rely on for serious work in important code, independently from the library implementer, you need to /know/ that
    things are correct. That means someone /does/ have to make the proper mathematical definitions, and do the mathematical analysis. Someone has
    to specify the operations, then check that the implementation matches
    these. That also applies to the language itself (or at least the
    relevant parts of it).

    Yes, that's all hard stuff. And yes, you can make a "cheap and
    cheerful" system without bothering about being sure it is correct -
    especially when there is no dividing line between the library/language
    user and the author, and you can change the toolchain to fix any issues
    as you go along. And yes, such "cheap and cheerful" solutions do have real-world practical uses.

    But for the sorts of, by comparison, minuscule values represented by i64
    and u64, those would never be a problem.

    So, I provide a choice: use efficient types if you know they will not overflow, or use a big-number type.

    A dislike of dynamic typing is not irrational - nor is a dislike of
    strong static typing.  (And you really mean "weak" typing, rather than
    "dynamic" typing here.)

    It's usually strong. It's weak for equal/not-equal, but Python has that
    same behaviour: you can compare any two objects; the result will be
    false if not compatible.

    Fair enough.


    You never came across functions like Dirac's delta function (very
    popular in engineering circles) :

         𝛿(x) = ⎧ +∞, if x = 0
                ⎨
                ⎩ 0, otherwise

    That looks like a useful function!


    Yes, it's a great one - it's useful in many cases. It is the derivative
    of the step function, which is another useful function defined using conditionals. It is useful for modelling impulses, and in Fourier
    transforms.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to David Brown on Wed Oct 6 16:23:44 2021
    On 06/10/2021 16:10, David Brown wrote:
    You never came across functions like Dirac's delta function [...]
    [Bart:]
    That looks like a useful function!
    Yes, it's a great one - it's useful in many cases. It is the derivative
    of the step function, which is another useful function defined using conditionals. [...]

    I wonder whether this is the right place to point out to our
    readers that step functions are uncomputable? Of course, that could
    be the opportunity for some to decide that "uncomputable" is a daft
    concept; OTOH, no-one seems to have proposed anything better.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Dussek

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Wed Oct 6 17:38:07 2021
    On 2021-10-06 16:51, James Harris wrote:
    On 04/10/2021 10:50, David Brown wrote:

    Engineering /is/ applied maths!

    /includes/

    !

    Engineering is a lot of things: science, materials, chemistry,
    mathematics, biology, etc.

    Engineering is the application of science to solving practical problems.

    Mathematics is a science and is a basis of all other sciences (as well
    as many pseudo-sciences).

    Ergo, the statement stands.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Wed Oct 6 16:37:09 2021
    On 06/10/2021 00:04, Bart wrote:
    On 05/10/2021 18:35, David Brown wrote:
    On 05/10/2021 17:47, Bart wrote:
    On 05/10/2021 15:13, David Brown wrote:

    or do you not understand what mathematics is?

    Maybe I don't. I stopped 'getting it' when I started being involved
    with computers.

    But you still feel qualified to argue that computing is not mathematics,
    or that computer operations are not mathematically defined?


    I don't know why everyone seems determined to question my credentials.

    Not everyone! IMO you are correct and they are wrong.

    I don't necessarily disagree with David. My point is that (largely
    because of fixed-size integers and representation issues) computers do
    not implement the simple mathematics that Dmitry seemed to suggest in
    his earlier post when he spoke about using natural numbers (which is, I
    think, where this subthread originated). David's point is that computer arithmetic can be mathematically defined. Those two statements are not
    actually in conflict.

    What I do think is wrong is other people insisting on their definitions
    when they just have a different viewpoint.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Wed Oct 6 16:26:25 2021
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

         abs(x) = ⎧ x, if x >= 0
                  ⎨
                  ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common mathematics using real numbers - they cannot be implemented directly in computer code. If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers). It is, however, still
    maths. Two possibilities for n-bit two's complement signed integers
    could be :

    abs(x) = ⎧ x, if x >= 0
             ⎨ -x, if x < 0 and x > int_min
             ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    I would add, however, that it describes something which is not the
    mathematical |x| or 'absolute value'. Instead, it /uses/ mathematics to describe what happens in different scenarios. But it does not implement
    a mathematical abs operation, because a computer does not implement one.
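
    (For illustration, that clamping definition rendered as a minimal C++
    sketch - the function name is invented:)

        #include <climits>

        // Total on int: every input has an output, at the cost of
        // abs_clamp(INT_MIN) == INT_MIN, the one value where the result
        // is still negative.
        int abs_clamp(int x) {
            if (x >= 0)      return x;
            if (x > INT_MIN) return -x;
            return INT_MIN;  // the single "step-out" value
        }

    (Java's Math.abs behaves like this; C's abs leaves INT_MIN undefined
    instead.)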

    Again, I don't think we disagree on the substance, only on the
    nomenclature, so I don't see a need to pursue this further, but I will
    append an anecdote.

    I remember a documentary about Charles Babbage in which he showed some
    dinner guests an early version of one of his machines. In the
    documentary Babbage had the machine generate a series of numbers but one
    of the numbers did not fit the mathematical series: for the presumed computation that value was mathematically incorrect. Babbage explained
    that that was the point: with his machine such step-outs could be
    configured by him as the machine's controller.

    Now, one could think of him as making an excuse for an incorrect
    computation, but presuming he really did mean that to happen, I see it as
    similar to the abs(int_min) case: a step-out from mathematics put in by
    an engineer.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Wed Oct 6 17:39:47 2021
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
    On 2021-10-04 11:38, James Harris wrote:
    On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
    On 2021-10-04 10:29, James Harris wrote:

    ...

    I'm not sure I agree with any of that! Integer arithmetic does
    not generate approximations.

    Approximation means that the model is inexact, integers are
    incomputable, so you must give up something, typically the range.

    The case of real numbers is more telling. There is the fixed-point
    model with exact addition and multiplication, and there is the
    floating-point model with all operations inexact and even
    non-associative.

    Non-associativity can also apply to integers. Consider

       A + B + C

    Even in such a simple expression, if overflow of intermediate
    results is to be detected, then (A + B) + C is not A + (B + C).

    No, overflow is outside the model. Inside the model computer integer
    arithmetic is associative.
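
    (A minimal C++ sketch of the distinction both sides are drawing:
    wrap-around arithmetic is associative, but arithmetic that detects
    overflow of intermediates is not. The builtins are the gcc/clang
    checked-arithmetic ones; the values are chosen so that only one
    grouping overflows:)

        #include <climits>

        // Computes (a + b) + c, reporting failure if any step overflows.
        bool sum3_checked(int a, int b, int c, int* out) {
            int t;
            return !__builtin_add_overflow(a, b, &t)
                && !__builtin_add_overflow(t, c, out);
        }
        // With a = INT_MAX, b = 1, c = -1:
        //   (a + b) + c : the first step overflows, so this reports failure;
        //   a + (b + c) : b + c == 0, then a + 0 == INT_MAX - no overflow.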

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the potential thereof).

    No, engineering is how to *avoid* overflows.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Wed Oct 6 17:11:45 2021
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons between
    different types.  (I don't know how it handles literals - that's beyond
    my rather limited knowledge of the language.)

    When operations can be overloaded in the result type, that simplifies a
    lot. Literals are semantically overloaded parameterless functions: the
    literal 1 has an Integer overload, an Unsigned_16 one, a Long_Integer
    one, a My_Custom_Integer one, etc.


    I'm not very keen on overloading in the result type - it feels to me
    that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

       declare
          X : T;
          Y : S;
       begin
          Foo (X);
          Foo (Y);

    is OK, but

       declare
          X : T := Create;
          Y : S := Create;
       begin

    is not?

    Assuming that Create is the same as Create(), contrast

    T := Create;

    with

    T := Create + 0;

    Why should the former work and the latter, presumably, fail?


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Wed Oct 6 18:43:30 2021
    On 2021-10-06 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:


    No, overflow is outside the model. Inside the model computer
    integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the
    potential thereof).

    No, engineering is how to *avoid* overflows.


    Fine, Dmitry. You try to write code to avoid

      A * B

    overflowing before you execute the multiply.

    Here is how it is done:

    1. The problem domain. The numeric type is selected from there. E.g.
    typically A is a measurement of something you know the range of.

    2. The algorithm. The formula A*B is a part of some larger algorithm.
    E.g. some iterative approximation etc. Here comes the mathematics. Most
    good algorithms allow estimation of the upper and lower bounds. That
    is not that difficult; the really difficult part is rounding-error
    analysis. You may have no overflows, but the result is garbage.

    So from #1 and #2 you know the maximum range of the intermediates and
    declare the corresponding type. In Ada one would take that type for A
    and B but also constrain them to the domain's range to prevent wrong inputs.

    That is the method most people would use.

    A more advanced but also more difficult approach is static analysis. You
    could prove that no overflow happens. Usually it requires
    transformation of the algorithm, because automatic provers have serious
    limitations. For Ada there is such a framework, SPARK Ada.
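
    (Steps 1 and 2 in a minimal C++ sketch - the domain ranges are invented
    for illustration:)

        #include <cstdint>

        // Domain knowledge (hypothetical): a_mm is a measured length in
        // 0..50000, b_count is a repetition count in 0..10000. The
        // worst-case product is 5*10^8, so an int64_t intermediate can
        // never overflow - no runtime check is needed.
        int64_t total_mm(int32_t a_mm, int32_t b_count) {
            return static_cast<int64_t>(a_mm) * b_count;
        }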

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Wed Oct 6 17:17:31 2021
    On 06/10/2021 16:38, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:51, James Harris wrote:
    On 04/10/2021 10:50, David Brown wrote:

    Engineering /is/ applied maths!

    /includes/

    !

    Engineering is a lot of things: science, materials, chemistry,
    mathematics, biology, etc.

    Engineering is the application of science to solving practical problems.

    Mathematics is a science and is a basis of all other sciences (as well
    as many pseudo-sciences).

    Ergo, the statement stands.


    I think we'll just have to disagree on this. Fortunately, nomenclature
    is just nomenclature.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Wed Oct 6 17:15:59 2021
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:


    No, overflow is outside the model. Inside the model computer
    integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the
    potential thereof).

    No, engineering is how to *avoid* overflows.


    Fine, Dmitry. You try to write code to avoid

    A * B

    overflowing before you execute the multiply.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Wed Oct 6 18:49:17 2021
    On 2021-10-06 18:11, James Harris wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
    On 2021-09-05 10:54, David Brown wrote:
    On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
    On 2021-09-04 15:39, David Brown wrote:

    Ada also solves this kind of problem by not allowing comparisons between
    different types.  (I don't know how it handles literals - that's beyond
    my rather limited knowledge of the language.)

    When operations can be overloaded in the result type, that simplifies a
    lot. Literals are semantically overloaded parameterless functions: the
    literal 1 has an Integer overload, an Unsigned_16 one, a Long_Integer
    one, a My_Custom_Integer one, etc.


    I'm not very keen on overloading in the result type - it feels to me
    that it would be too easy to lose track of what is going on, and too
    easy to have code that appears identical (same expression, same
    variables, same types, etc.) but completely different effects.

    Why:

        declare
           X : T;
           Y : S;
        begin
           Foo (X);
           Foo (Y);

    is OK, but

        declare
           X : T := Create;
           Y : S := Create;
        begin

    is not?

    Assuming that Create is the same as Create(), contrast

      T := Create;

    with

      T := Create + 0;

    Why should the former work and the latter, presumably, fail?

    It would not. I assume that T is an integer or modular type. So, if

    T := T + 0;

    does not fail

    T := Create + 0;

    would not fail either. Here is a complete example:

       type T is range 0..100;
       function Create return T is
       begin
          return 0;
       end Create;

       type S is range -100..200;
       function Create return S is
       begin
          return 0;
       end Create;

       X : T := Create + 0;
       Y : S := Create + 0;
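
    (Languages without result-type overloading can only approximate this.
    A C++ sketch of the nearest common idiom - a Create() returning a proxy
    whose templated conversion operator lets the *target* of the
    initialisation pick the result; all names invented:)

        #include <cstdint>

        struct CreateProxy {
            template <typename T>
            operator T() const { return T{0}; }  // "return 0" for any T
        };

        CreateProxy Create() { return {}; }

        int32_t x = Create();   // the target type selects the conversion
        double  y = Create();

    (The trick collapses exactly where Ada's full resolution is needed:
    Create() + 0 is ambiguous here, since the proxy could convert to any
    arithmetic type.)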

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Wed Oct 6 18:13:48 2021
    On 06/10/2021 17:43, Dmitry A. Kazakov wrote:
    On 2021-10-06 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:


    No, overflow is outside the model. Inside the model computer
    integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or
    the potential thereof).

    No, engineering is how to *avoid* overflows.


    Fine, Dmitry. You try to write code to avoid

       A * B

    overflowing before you execute the multiply.

    Here is how it is done:

    1. The problem domain. The numeric type is selected from there. E.g. typically A is a measurement of something you know the range of.

    2. The algorithm. The formula A*B is a part of some larger algorithm.
    E.g. some iterative approximation etc. Here comes the mathematics. Most
    good algorithms allow estimation of the upper and lower bounds. That
    is not that difficult; the really difficult part is rounding-error
    analysis. You may have no overflows, but the result is garbage.

    So from #1 and #2 you know the maximum range of the intermediates and
    declare the corresponding type. In Ada one would take that type for A
    and B but also constrain them to the domain's range to prevent wrong
    inputs.

    That is the method most people would use.

    A more advanced but also more difficult approach is static analysis. You
    could prove that no overflow happens. Usually it requires
    transformation of the algorithm, because automatic provers have serious
    limitations. For Ada there is such a framework, SPARK Ada.

    OK, I had in mind a different problem but I see what you are thinking of
    and agree with it.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Oct 6 18:28:48 2021
    On 06/10/2021 16:10, David Brown wrote:
    On 06/10/2021 01:04, Bart wrote:

    It's not something I have ever had cause to do - and while it could be interesting, there are too many other interesting things in life. One
    thing that is not interesting, however, is a pissing competition about
    who has done what.

    But a pissing contest about who knows more maths is OK....

    Then using that smug position of superiority to belittle anyone else's
    opinions about language design and cast aspersions on the validity and
    quality of what they might have achieved.


    The mathematics of making a straightforward arbitrary precision library, floating point or integer, is not particularly advanced - it's just
    simple arithmetic. Making a nice interface is hard. Making a system
    where it is easy to get right, hard to get wrong, and you don't end up
    with memory leaks is hard. Making efficient implementations is hard.
    Making algorithms that scale well is hard. Doing long multiplication
    and multi-digit addition is primary school arithmetic - you just have to realise that, and not be scared by the big numbers.

    So an arbitrary precision floating point library is a significant
    achievement - but /not/ because of the mathematics.

    Exactly, there are lots of aspects involved, and that is why I do it.

    If you have included error analysis and correctness proofs, then the
    maths gets hard. If you have included FFT multiplication algorithms,
    the maths gets hard. If you have included partitioning to support
    parallel implementations - with correctness proofs, of course - the
    maths gets hard.

    My job is primarily to provide the basic ops + - * / % rem neg abs
    = <> < <= >= >, in a simple-to-use manner inside a language:

    C:\qx>qq -p:"print abs(-infinity)"
    infinity

    (Here, a well-behaved abs function that works on the largest (smallest?) negative number!)

    Most other things can be built on top. Things like trig functions aren't
    so easy because Taylor series don't converge quickly enough for extreme precision.


    None of that is particularly related to the kind of mathematics that was under discussion here.

    Oh. I was called out for choosing what topics I would classify as 'doing
    maths' or not, but apparently /you/ can do that.


    Yes, that's all hard stuff. And yes, you can make a "cheap and
    cheerful" system without bothering about being sure it is correct - especially when there is no dividing line between the library/language
    user and the author, and you can change the toolchain to fix any issues
    as you go along. And yes, such "cheap and cheerful" solutions do have real-world practical uses.

    Yes, they can be proof-of-concept for new ideas.

    The last project I completed was a protest against the enormity of LLVM,
    which is usually a 1600MB installation (which I think needs another big download to use properly, full of C and C++ headers or something).

    It's a single executable, smaller than 0.25MB, that does, as far as I
    am concerned, the same job, except:

    * It's 1/6000th the size
    * It builds from source 100,000 times faster (0.1 seconds versus
    estimates of hours on my machine, if I even knew how)
    * It might generate code that runs in typically 50-100% more time.

    It was created primarily for my own curiosity and satisfaction. But it
    can show anyone else what can be done at a small scale and by one
    person. (There are a few other products like it, but they are C- and Linux-centric.)

    (https://github.com/sal55/langs/tree/master/pcl)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Wed Oct 6 18:26:08 2021
    On 06/10/2021 17:49, Dmitry A. Kazakov wrote:
    On 2021-10-06 18:11, James Harris wrote:
    On 05/09/2021 10:39, Dmitry A. Kazakov wrote:

    ...

    Why:

        declare
           X : T;
           Y : S;
        begin
           Foo (X);
           Foo (Y);

    is OK, but

        declare
           X : T := Create;
           Y : S := Create;
        begin

    is not?

    Assuming that Create is the same as Create() contrast

       T := Create;

    with

       T := Create + 0;

    Why should the former work and the latter, presumably, fail?

    It would not. I assume that T is an integer or modular type. So, if

    Oops, I misread

    X : T := Create;

    as making T the object. I should have referred to X rather than T.


      T := T + 0;

    does not fail

      T := Create + 0;

    would not fail either. Here is a complete example:

       type T is range 0..100;
       function Create return T is
       begin
          return 0;
       end Create;

       type S is range -100..200;
       function Create return S is
       begin
          return 0;
       end Create;

       X : T := Create + 0;
       Y : S := Create + 0;

    Trying to understand that I gather that

    Create + 0

    would be seen to result in an object of type T because that's what it
    would be assigned to. What about

    X := 0 + 1 * Create * 2 + 3;

    Does the type of X effectively propagate through the constants so that
    Ada knows which version of Create to call?

    FWIW, I would have to write your original code as

    X = T.Create()

    where T provides the type wherein the correct Create function is found,
    as I am not (yet) planning to use the output parameter's type in
    function selection. That seems simple enough.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Wed Oct 6 20:16:25 2021
    On 2021-10-06 19:26, James Harris wrote:

    What about

      X := 0 + 1 * Create * 2 + 3;

    Does the type of X effectively propagate through the constants so that
    Ada knows which version of Create to call?

    Yes, you have a list of possible interpretations of all overloaded terms
    and all overloaded operations. In a language without overloading on the
    result type, only + and * can be overloaded. In Ada, 0, 1, 2, 3 and
    Create are overloaded too, which is why you do not need suffixes on
    numeric literals as in C.

    So you would have the AST

    ":=":T:=T|S:=S|Integer:=Integer|...
    / \
    X:T "+" :T+T->T|S+S->S|Integer+Integer->Integer|...
    / \
    "0":T|S|Integer|...

    You walk the tree and filter out the lists until you end up with none,
    a single, or multiple interpretations.
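
    (A toy C++ sketch of that filtering, with candidate types as plain
    strings - just the set-intersection idea, nothing Ada-specific:)

        #include <iostream>
        #include <set>
        #include <string>

        using TypeSet = std::set<std::string>;

        // For an Ada-like "+" (T + T -> T for each type T), the result's
        // interpretations are the intersection of the operand sets.
        TypeSet intersect(const TypeSet& a, const TypeSet& b) {
            TypeSet out;
            for (const auto& t : a)
                if (b.count(t)) out.insert(t);
            return out;
        }

        int main() {
            TypeSet create = {"T", "S"};             // two visible Creates
            TypeSet zero   = {"T", "S", "Integer"};  // literal 0, overloaded
            TypeSet sum    = intersect(create, zero);  // {T, S}: ambiguous
            TypeSet result = intersect(sum, {"T"});    // context X : T := ...
            std::cout << "interpretations left: " << result.size() << '\n';
        }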

    FWIW, I would have to write your original code as

      X = T.Create()

    Yes, you can specify the expected type of an expression in Ada:

    T'(Create)

    That is a way to resolve expressions which would be ambiguous otherwise.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Wed Oct 6 19:32:00 2021
    On 06/10/2021 14:46, Andy Walker wrote:
    On 06/10/2021 11:32, Bart wrote:
    I sometimes like to solve puzzles, but I'm not good enough to do it
    manually, and don't have the maths skills to do it that way.

        The maths skills required to solve puzzles are almost never
    very advanced.  I find it hard to believe that anyone with A-level
    maths [final year of secondary education, for non-UK readers] would
    find the problems usually described as "puzzles" at all difficult
    in terms of the skills needed.

    Maybe you've forgotten the boring topics that constituted A-level maths
    in the 70s. Little would have been any help at all with the more
    advanced puzzles.

      I worked my way through most of the
    books by [eg] Dudeney, Loyd and Gardner long before A-level.

    I had some of those (don't know Loyd though), I can't remember how hard
    they were, except that Dudeney especially seemed to be for fun.


    So I use brute force, with a computer program, if I think it is
    practical in a reasonable time.

        Sure.  So do we all [FSVO "we"].

    I don't think that's the usual reason for setting the puzzle; you're
    supposed to solve it by being clever, not cheating! Or at least, not dumbly
    trying every possible combination until you hit the right answer; that's
    not playing the game.

    I do it because I'm hopeless at puzzles anyway. And I found it more
    interesting to devise a solver.

    I keep my hand in by
    tackling the "Project Euler" problems:

      https://projecteuler.net/archives

    The problems range from trivial to extremely difficult, and from
    really interesting to bafflingly boring, but there are over 700 to
    choose from.

    Problem #413 was discussed on comp.lang.c recently. (See 'Losing my
    mind' thread from around 2nd July.)

    I have to say that finding a way to solve it via a computer within the one-minute limit was far beyond my abilities.

        You seem to think that there is some rigid dividing line
    between "maths" and "not-maths".  Not so.

    What I have in mind is the complicated stuff, the jargon, the proofs,
    basically everything that's beyond me.

    Then I don't really want to be reminded of my shortcomings.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Thu Oct 7 00:22:49 2021
    On 06/10/2021 19:32, Bart wrote:
    I sometimes like to solve puzzles, but I'm not good enough to do it
    manually, and don't have the maths skills to do it that way.
         The maths skills required to solve puzzles are almost never
    very advanced.  I find it hard to believe that anyone with A-level
    maths [final year of secondary education, for non-UK readers] would
    find the problems usually described as "puzzles" at all difficult
    in terms of the skills needed.
    Maybe you've forgotten the boring topics that constituted A-level
    maths in the 70s. Little would have been any help at all with the
    more advanced puzzles.

    That was more-or-less my point. A-level maths is not
    designed for mathematicians, but for practical use by engineers,
    physicists, chemists, economists, CS people, .... Interesting
    maths starts about six months into a university maths course.
    Sorry about that, but it's driven by the needs of all these
    other people and "quarts into pint pots" and all that jazz. In
    particular, A-level maths is not maths for puzzles, so puzzles
    are usually much more elementary than A-level in terms of the
    maths required.

    I worked my way through most of the books by [eg] Dudeney, Loyd and
    Gardner long before A-level.
    I had some of those (don't know Loyd though), I can't remember how
    hard they were, except that Dudeney especially seemed to be for fun.

    Loyd is Dudeney for the American audience. They were
    deadly rivals, always accusing each other of filching ideas.

    So I use brute force, with a computer program, if I think it is
    practical in a reasonable time.
         Sure.  So do we all [FSVO "we"].
    I don't think that's the usual reason for setting the puzzle; you're
    supposed to solve it by being clever, not cheating! Or at least, not
    dumbly trying every possible combination until you hit the right
    answer; that's not playing the game.

    It's also not fun or interesting. If you can spot a neat
    trick, /that/ makes a puzzle interesting. If not, then, as you
    have been claiming, writing a program to do the hard work is also
    fun and interesting, at least to some of us.

    [...]
       https://projecteuler.net/archives
    The problems range from trivial to extremely difficult, and from
    really interesting to bafflingly boring, but there are over 700 to
    choose from.
    Problem #413 was discussed on comp.lang.c recently. (See 'Losing my
    mind' thread from around 2nd July.)

    I gave up on "c.l.c" long ago; also hadn't [yet] looked at
    problem 413.

    I have to say that finding a way to solve it via a computer within
    the one-minute limit was far beyond my abilities.

    My [self-imposed] limit, if that's what you are referring
    to, is one /second/. On a quick glance, I don't know whether it's
    possible to do it that efficiently on my home PC, but for sure you
    would need to do better than [schematically]

       count := 0
       for i to 10^19 do ( onechild(i) | count +:= 1 ) od
       print count

    You would need ways of eliminating great swathes of numbers very
    quickly [eg any number with two zeros can be discounted as there
    are two "children" right there]. I may give it some thought, but
    it's not a problem that has caught my interest immediately.

         You seem to think that there is some rigid dividing line
    between "maths" and "not-maths".  Not so.
    What I have in mind is the complicated stuff, the jargon, the proofs, basically everything that's beyond me.

    Every discipline, without exception, has its own jargon
    and its own complexities, which you can't expect to understand
    without working at it [which is obviously not worthwhile for
    things you don't particularly care about]. As for "proofs",
    that's something that bothers actual mathematicians almost not
    at all. People get hung up on the word. But in any case,
    that's not what you were previously complaining about, which
    was being told that some things were maths that hadn't struck
    you that way previously.

    Then I don't really want to be reminded of my shortcomings.

    You've been letting Dmitry get to you!

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Dussek

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Thu Oct 7 08:32:08 2021
    On 06/10/2021 16:51, James Harris wrote:
    On 04/10/2021 10:50, David Brown wrote:
    On 04/10/2021 01:58, Bart wrote:
    On 03/10/2021 23:05, David Brown wrote:
    On 03/10/2021 20:27, Bart wrote:

    Processors are designed to do many things.  Exactly duplicating standard
    mathematical integers is not one of those things.  Being usable to model
    a limited version of those integers - following somewhat different
    mathematical rules and definitions - /is/ one of those things.

    No, it's just arithmetic with a limited number of digits. And usually in binary.

    It's engineering, not maths. But of course you can apply maths to
    anything.

    Engineering /is/ applied maths!

    /includes/

    !

    Engineering is a lot of things: science, materials, chemistry,
    mathematics, biology, etc.


    Fair point!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Thu Oct 7 12:52:36 2021
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:


    No, overflow is outside the model. Inside the model computer
    integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the
    potential thereof).

    No, engineering is how to *avoid* overflows.


    Agreed.


    Fine, Dmitry. You try to write code to avoid

      A * B

    overflowing before you execute the multiply.


    Would it make sense to ask a baker to "mix two ingredients"? The baker
    would want to know what they are, what the quantities are, perhaps how
    well they need to be mixed. The same applies in programming. It makes
    no sense to say "take two things and multiply them, avoiding overflow"
    with no concept of what the things are, what types, what ranges within
    those types, what outputs you want, what language, and so on.

    The answer could be "there's no way it could ever overflow", or "use __builtin_mul_overflow", or "use a bigger type", or many other
    possibilities.

    In the real world, in real programming tasks, you usually have a lot
    more information than just "a number". Maybe "A" is the number of kids
    in a school class and "B" is the number of bikes they own - you can
    avoid overflow by using 32-bit integers and multiplying because you
    already know there are not two billion bikes in the class.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Oct 7 13:11:55 2021
    On 06/10/2021 13:08, Bart wrote:
    On 06/10/2021 07:20, David Brown wrote:

    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code.  If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers).  It is, however, still
    maths.  Two possibilities for n-bit two's complement signed integers
    could be :

           abs(x) = ⎧ x, if x >= 0
                    ⎨ -x, if x < 0 and x > int_min
                    ⎩ int_min, if x = int_min

    or

           abs(x) = ⎧ x, if x >= 0
                    ⎨ -x, if x < 0 and x > int_min
                    ⎩ undefined, if x = int_min

    Both are good, solid mathematical definitions - and both can be
    implemented.  They have slightly different characteristics, each with
    their pros and cons.

    These are somewhat unsatisfactory.

    In what way? The first matches some implementations of abs() for
    fixed-size signed integers, the second matches others (such as that of C).

    I guess you only have one actual
    definition of abs()?


    Why would you guess that? Different but equivalent definitions are
    common practice in mathematics. (For the standard mathematical version
    of "abs" over real numbers, you could use "abs(x) = √x²", or many other equivalent definitions.) Properly different definitions are also fine -
    there is seldom a monopoly on exactly what such functions should do,
    especially when viewed in different domains.

    Usually within one context (such as one language, or one program) you
    will want a single definition - at least for any given type.

    In practice, it would be different for each different type of x. For
    example, the representation might be two's complement, or it might be
    signed magnitude [as used in floats].


    That is a matter of representations and implementation. Here the
    definitions are for the values - the semantics of the operations. (Of
    course you can also make definitions that rely on representations too.)

    Further, there might be different sizes of int, so different values of int_min.

    Also, you might need to consider putting the check for int_min first, or
    at least second, depending on whether problems are anticipated with
    doing 'x > int_min' when x is negative.

    That's all implementation - not semantic definition.


    There is also a question over exactly what 'undefined' means: would it require abs() to return a sum-type now rather than an int? If so, abs()
    might need such a type as input too: abs(abs(x)).


    There are basically two ways to consider "undefined" in such
    definitions. One is to say that you are dealing with partial functions
    - you accept that there is not an output for each input. Functions
    don't have to be closed or complete, either in mathematics or computing.
    Alternatively, you can consider your input and output sets to be
    augmented with an "undefined" value in addition to the normal values.
    Both options work, and they have their advantages and disadvantages
    making them suitable in different circumstances.
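
    (Bart's sum-type reading of "undefined", sketched with std::optional;
    note that abs_partial(abs_partial(x)) then no longer typechecks, which
    is exactly the composition question he raises:)

        #include <climits>
        #include <optional>

        // Partial-function reading: INT_MIN simply has no output.
        std::optional<int> abs_partial(int x) {
            if (x == INT_MIN) return std::nullopt;  // outside the domain
            return x < 0 ? -x : x;
        }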

    So, the reality is a bit more involved. But it depends on what the
    purpose of your mathematical definitions is: is it just to 'look
    pretty'; or is it informal user documentation; or would it actually be
    input to some compiler generator?


    They can certainly be used in code generation, such as for code
    optimisation and simplification. They can be used in proving code
    correctness (which can sometimes be done automatically). They can be
    used to reason about code and algorithms, or document them, or specify them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Thu Oct 7 12:14:21 2021
    On 07/10/2021 11:52, David Brown wrote:
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:


    No, overflow is outside the model. Inside the model computer
    integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the
    potential thereof).

    No, engineering is how to *avoid* overflows.


    Agreed.


    Fine, Dmitry. You try to write code to avoid

      A * B

    overflowing before you execute the multiply.


    Would it make sense to ask a baker to "mix two ingredients"? The baker would want to know what they are, what the quantities are, perhaps how
    well they need to be mixed. The same applies in programming. It makes
    no sense to say "take two things and multiply them, avoiding overflow"
    with no concept of what the things are, what types, what ranges within
    those types, what outputs you want, what language, and so on.

    The answer could be "there's no way it could ever overflow", or "use __builtin_mul_overflow", or "use a bigger type", or many other
    possibilities.

    In the real world, in real programming tasks, you usually have a lot
    more information than just "a number". Maybe "A" is the number of kids
    in a school class and "B" is the number of bikes they own - you can
    avoid overflow by using 32-bit integers and multiplying because you
    already know there are not two billion bikes in the class.


    Real programs may also have to end up multiplying two values that are
    runtime inputs. They may not have any constraints either.

    Example: a calculator program. Or a compiler that reduces constant
    expressions.

    Then it is up to you what degree of quality of implementation you want
    to apply.

    To do something about it, you might need to look at what features the
    language provides. If none, then it is up to your application code.

    If you are creating the language to write the application (even the one
    used to write that compiler), then you have to decide how much more
    complicated and difficult you want both language and implementation, to
    solve what is in reality a minor problem.

    You can go to a LOT of trouble so that if someone types 2**3**4**5, it
    will do something sensible. That can mean diverting attention from much
    more productive matters.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Oct 7 12:36:51 2021
    On 06/10/2021 12:32, Bart wrote:
    On 05/10/2021 23:21, Andy Walker wrote:
    On 05/10/2021 20:45, Bart wrote:
    Well, mathematics cheats a little. Because A + B in maths is just:
        A + B
    It usually doesn't need to evaluate it, so overflow is irrelevant!
    But suppose, given some concrete values for A and B, it DID need to
    evaluate A + B into a concrete result.

         Others have partially addressed this point.  But there are
    several other partial answers:

       (a) This [suitably generalised] is the entire purpose of the
           branch of mathematics called "numerical analysis"*.  We
           needed to get concrete results long before we had computing
           machines.

    My approach to solving problems would be computation or trial and error rather than doing things analytically, for which I just don't have the ability.

    I sometimes like to solve puzzles, but I'm not good enough to do it
    manually, and don't have the maths skills to do it that way.

    So I use brute force, with a computer program, if I think it is
    practical in a reasonable time.

    Sometimes it is nice to change the domain of the problem - you are
    turning it from one kind of puzzle into a different one (a coding
    challenge).


    Here's an example of a puzzle where you have to fit different pieces
    into an outlined grid; this shows one solution:

       https://github.com/sal55/langs/blob/master/delta.png

    (I've made the pieces different colours.)

    Is what I did to solve this (aside from designing and implementing the language used) maths? Not in my view, as I was trying to avoid using it.
    But apparently it was.

    The tools and techniques you used to solve the problem are based on
    maths - even if you don't know it. They can be defined by maths, they
    rely on mathematical rules. It doesn't mean you actively use advanced
    maths when writing the code - just that the fact that you can write the
    code at all, and that it works as you expect, builds on maths and can be defined mathematically. Often you don't bother doing the work to define
    things mathematically, or prove them mathematically, but it's important
    that one /could/ do it. The lower down the chain, and the more people
    that use and rely on particular code or techniques, the more important
    it is to actually do that mathematics work. Writing a program to solve
    a puzzle is at the other end of that scale, and it's unlikely anyone
    would bother writing out the mathematics.

    As an example, your program might have called "qsort". Your code would therefore rely on that function sorting the data correctly. How can you
    be sure that the quicksort algorithm works correctly, in every case? It
    was proved mathematically.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Andy Walker on Thu Oct 7 13:35:32 2021
    On 06/10/2021 17:23, Andy Walker wrote:
    On 06/10/2021 16:10, David Brown wrote:
    You never came across functions like Dirac's delta function [...]
    [Bart:]
    That looks like a useful function!
    Yes, it's a great one - it's useful in many cases.  It is the derivative
    of the step function, which is another useful function defined using
    conditionals. [...]

        I wonder whether this is the right place to point out to our
    readers that step functions are uncomputable?  Of course, that could
    be the opportunity for some to decide that "uncomputable" is  a daft concept;  OTOH, no-one seems to have proposed anything better.


    Perhaps I should have used Kronecker's delta instead... :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Thu Oct 7 13:32:48 2021
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

          abs(x) = ⎧ x, if x >= 0
                   ⎨
                   ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code.  If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers).  It is, however, still
    maths.  Two possibilities for n-bit two's complement signed integers
    could be :

           abs(x) = ⎧ x, if x >= 0
                    ⎨ -x, if x < 0 and x > int_min
                    ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria? It describes the behaviour I would expect from a Java
    "abs" function, but not the behaviour I would expect from a C "abs".


    I would add, however, that it describes something which is not the mathematical |x| or 'absolute value'. Instead, it /uses/ mathematics to describe what happens in different scenarios. But it does not implement
    a mathematical abs operation because a computer does not.

    It is /a/ mathematical definition of /an/ abs function, defined on a
    specific subset of standard integers, designed to match the standard mathematical abs function for as many values as possible. It is not
    unique in that way - the other definition I gave with an undefined value
    also qualifies.


    Again, I don't think we disagree on the substance, only on the
    nomenclature, so I don't see a need to pursue this further, but I will append an anecdote.

    I remember a documentary about Charles Babbage in which he showed some
    dinner guests an early version of one of his machines. In the
    documentary Babbage had the machine generate a series of numbers but one
    of the numbers did not fit the mathematical series: for the presumed computation that value was mathematically incorrect. Babbage explained
    that that was the point: with his machine such step-outs could be
    configured by him as the machine's controller.

    Now, one could think of him as making an excuse for an incorrect
    computation, but presuming he really did mean that to happen, I see it as similar to the abs(int_min) case: a step-out from mathematics put in by
    an engineer.


    That is not a "step out" from /mathematics/ - it is a step out from the
    pattern and perhaps from what the user expects. In the definition of
    "abs" above, the case for "int_min" is clearly defined and specified in
    the mathematics - there is no "step out" there. But it is a glitch in
    the pattern, and stands out as far as the user is concerned. This is inevitable in a lot of computing - you can't use the common standard mathematical technique of using infinite sets to avoid endpoints or
    limits. When creating an "abs" function for a language, you have to
    think about the mathematical models you are using for the numbers in
    your language and system, and how you want to treat these awkward points
    - what mathematical definitions you will use for them. Do you want to
    say that "abs(x)" has a well-defined result for any x? Or do you want
    to say that "abs(x) >= 0" for all valid x? You don't get both. But
    whichever you choose, you can make a mathematical definition of the
    function.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Oct 7 14:47:24 2021
    On 07/10/2021 13:14, Bart wrote:
    On 07/10/2021 11:52, David Brown wrote:
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:
    On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
    On 2021-10-05 21:36, James Harris wrote:
    On 04/10/2021 11:26, Dmitry A. Kazakov wrote:


    No, overflow is outside the model. Inside the model computer
    integer arithmetic is associative.

    Then the model is inadequate - and that's partly my point.

    Only if you deploy it wrongly. This is the core of engineering. Solid
    mechanics and strength of materials are inadequate in the general case,
    but perfectly adequate for building bridges.

    Do not overflow your numbers, OK?

    That's maths. Engineering includes how to respond to overflow (or the
    potential thereof).

    No, engineering is how to *avoid* overflows.


    Agreed.


    Fine, Dmitry. You try to write code to avoid

       A * B

    overflowing before you execute the multiply.


    Would it make sense to ask a baker to "mix two ingredients"?  The baker
    would want to know what they are, what the quantities are, perhaps how
    well they need to be mixed.  The same applies in programming.  It makes
    no sense to say "take two things and multiply them, avoiding overflow"
    with no concept of what the things are, what types, what ranges within
    those types, what outputs you want, what language, and so on.

    The answer could be "there's no way it could ever overflow", or "use
    __builtin_mul_overflow", or "use a bigger type", or many other
    possibilities.

    In the real world, in real programming tasks, you usually have a lot
    more information than just "a number".  Maybe "A" is the number of kids
    in a school class and "B" is the number of bikes they own - you can
    avoid overflow by using 32-bit integers and multiplying because you
    already know there are not two billion bikes in the class.


    Real programs may also have to end up multiplying two values that are
    runtime inputs. They may not have any constraints either.


    Such programs are simple, limited scripts or tools where the user and
    the developer are typically the same person, and the user can be relied
    on to give appropriate input each time. Or the program is broken by design.

    If you have a program that takes an input from an unknown source, and
    uses it without control, the program is broken. Rule one of software is
    to sanitise your inputs.

    Example: a calculator program. Or a compiler that reduces constant expressions.

    Then it is up to you what degree of quality of implementation you want
    to apply.

    You can check the inputs from outside and make sure they are
    appropriate, and handle them appropriately. Or you can write
    low-quality broken code. So yes, it is up to you what quality of coding
    you do.


    To do something about it, you might need to look at what features the language provides. If none, then it is up to your application code.


    Yes - if the language semantics don't cover all your needs in one
    operation, you need to write more code. The one thing you /don't/ do is pretend the language has different semantics and hope for the best.

    If you are creating the language to write the application (even the one
    used to write that compiler), then you have to decide how much more complicated and difficult you want both language and implementation, to
    solve what is in reality a minor problem.

    You can go to a LOT of trouble so that if someone types 2**3**4**5, it
    will do something sensible. That can mean diverting attention from much
    more productive matters.


    Sometimes "garbage in, garbage out" is perfectly reasonable. Sometimes "garbage in, user-friendly error message out" is more appropriate.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Thu Oct 7 14:38:23 2021
    On 06/10/2021 17:37, James Harris wrote:
    On 06/10/2021 00:04, Bart wrote:
    On 05/10/2021 18:35, David Brown wrote:
    On 05/10/2021 17:47, Bart wrote:
    On 05/10/2021 15:13, David Brown wrote:

    or do you not understand what mathematics is?

    Maybe I don't. I stopped 'getting it' when I started being involved
    with computers.

    But you still feel qualified to argue that computing is not mathematics, or that computer operations are not mathematically defined?


    I don't know why everyone seems determined to question my credentials.

    Not everyone! IMO you are correct and they are wrong.

    I for one am only questioning /some/ of Bart's credentials in relation
    to /some/ of the things he has said, not as a general point. For many
    things in this group, his credentials include experience and
    achievements well beyond anything I have done in practice. But we all
    have occasions when we mistake our own subjective opinions for objective
    facts, or when we have strong opinions that are not based on knowledge
    or experience.


    I don't necessarily disagree with David. My point is that (largely
    because of fixed-size integers and representation issues) computers do
    not implement the simple mathematics that Dmitry seemed to suggest in
    his earlier post when he spoke about using natural numbers (which is, I think, where this subthread originated). David's point is that computer arithmetic can be mathematically defined. Those two statements are not actually in conflict.


    I think you put that rather well.

    What I do think is wrong is other people insisting on their definitions
    when they just have a different viewpoint.


    Yes, that can be a problem. (And I have been guilty of that more times
    than I'd like to admit.) But it can also be a problem (and I am not
    thinking of specific cases or specific people here) when people insist
    that their definitions or ideas are as valid as anyone else's even when
    they don't really know the subject or when they are outside the
    mainstream consensus.

    (Of course, sometimes the mainstream consensus is wrong. I found
    01.01.2000 a very difficult time - the mainstream opinion was that it
    was the start of the new millennium, whilst in reality it did not start
    until 01.01.2001. Things like that annoy me.)


    My favourite philosopher, Dara O'Briain, explains a bit about his
    opinion of the idea that everyone's opinion is equally valid:

    <https://www.youtube.com/watch?v=YMvMb90hem8>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to David Brown on Thu Oct 7 13:27:43 2021
    On 07/10/2021 11:52, David Brown wrote:
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    [... overflow ...]

    In the early days of computing, array-bound checks and
    overflow checks were automatic unless you actively turned them
    off. These days, you ought to be able to add [eg] checks for use of
    uninitialised variables, use of storage after "free", following
    a null pointer, and probably others. It used to be important
    to be able to turn them off, as they dragged in extra code and
    extra time into limited storage and run-time. These days, such
    factors really, really don't impinge on almost all normal work.
    For all but a tiny proportion of work, time is dominated by
    disc/network transfers or waiting for the user to type/click,
    and space by large data structures rather than a bit of extra
    code -- esp when errors are detected by hardware interrupt
    rather than by user checks.
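
    (Modern toolchains can put those checks back for you - e.g. the
    gcc/clang sanitisers. A small, deliberately broken C++ demo; the
    flags are real, the program is contrived:)

        // Build: g++ -g -fsanitize=address,undefined demo.cpp
        #include <climits>

        int main(int argc, char**) {
            int a[3] = {0, 1, 2};
            int big = INT_MAX;
            int y = big + argc;      // signed overflow: UBSan reports it
            return a[argc + 2] + y;  // index 3 on a normal run: ASan aborts
        }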

    Perhaps we should get back closer to the old days?
    If your program has a bug, would you rather that the program
    stops when the bug is first manifest, or that it continues
    until something more catastrophic happens? Or, even worse,
    that it continues but gives wrong results with no indication?
    How much developer/user time is wasted dealing with malware
    that couldn't exist in a safer environment?

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Joplin

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Thu Oct 7 15:02:36 2021
    On 05/10/2021 21:55, James Harris wrote:
    On 04/10/2021 19:19, David Brown wrote:
    On 04/10/2021 10:39, James Harris wrote:
    On 03/10/2021 23:14, David Brown wrote:
    On 03/10/2021 22:20, James Harris wrote:

    ...

    1. Integer arithmetic where all values - including intermediate
    results - remain in range for the data type. In this, the computer
    implements normal mathematics.

    2. Integer arithmetic where either a result or an intermediate value
    does not fit in the range assigned. For these a decision has to be made
    (by hardware, by language or by compiler) as to what to do with the
    non-compliant value. As you say, there are various options but they
    have to be cast semantically in terms of "if this happens then do that"
    rather than following the normal rules of mathematics. Worse, exactly
    where the limits apply can even depend on implementation.


    And it is all defined mathematically.

    We are talking finite sets with partial operations (for C-style signed >>>> integers) or closed operations (for C-style unsigned integers), rather >>>> than infinite sets, but it is all mathematics.

    OK, then how would you define integer computing's

       A - B

    in terms of mathematics?


    Mathematics on standard integers doesn't define 1/0. Mathematics on a
    finite set for C-style signed integers leaves a lot more values
    undefined on more operations. It doesn't mean it is not mathematical.
    OK, then how would you define integer computing's

       A / B

    in terms of mathematics?

    No need to reply but I'd suggest to you that because of the limits of
    computer fixed representation both of those are much more complex than
    just 'mathematics'!


    I'd agree that they are somewhat complicated by the limited sizes of
    fixed-size types - but they are still just "mathematics".

    How about:


    These are good but I would dispute that they are mathematical! Comments below.


    1. For a given fixed size of computer integer, "A - B" is defined as the
    result of normal mathematical integer subtraction as long as that result
    fits within the type.

    (That's basically how C defines it, if you stick to "int" or ignore the
    promotion stuff.)

    As you say, that's only partial. Fine for a limited domain, though the
    domain would be hard to specify; and it's incomplete due to the limited domain.


    A mathematical function is a mapping from one set to another set. A
    partial function does not map each input value to an output value. It
    can also be viewed as a single set of input-output pairs. That may be a convenient model when the set of valid inputs is complicated, or if you
    have multi-valued outputs.
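
    (In symbols - my notation, not from the post - definition 1 above
    treats subtraction as a partial function: with T the set of values of
    the fixed-size type,

        sub : D -> T,  D = { (a,b) in T x T : int_min <= a - b <= int_max },
        sub(a,b) = a - b

    so the set of input-output pairs simply omits the overflowing inputs.)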


    or

    2. "A - B" is defined as the result of normal mathematical integer
    subtraction, reduced modulo 2^n as necessary to fit in the range of the
    type.

    (That's how "gcc -fwrapv" defines it.)

    No mathematics that I am aware of has the concept of "to fit in the
    range of the type" but maybe you know different.


    It's just modulo arithmetic. Two's complement wrapping signed
    arithmetic on n-bit integers (a set of 2 ^ n values) will have the
    mathematical definition for subtraction as:

    "a - b" is defined to be the unique integer in the range -2 ^ (n-1) to
    2 ^ (n-1) - 1 that is in the same equivalence class modulo 2 ^ n as the
    normal integer value of (a - b).


    I'm sure you are familiar with modulo arithmetic from school maths, but
    it might have been a long time ago, and might not have been handled very formally.
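
    (As an illustration - mine, not from the post - definition 2 is
    commonly realised in C for 32-bit values by doing the subtraction in
    unsigned arithmetic, where reduction modulo 2^32 is guaranteed, and
    converting back:)

        #include <stdint.h>

        int32_t sub_wrap(int32_t a, int32_t b)
        {
            /* uint32_t subtraction wraps modulo 2^32 by definition;
               converting back to int32_t picks the representative in
               [-2^31, 2^31 - 1] (implementation-defined in standard C,
               but two's complement wrapping on mainstream compilers) */
            return (int32_t)((uint32_t)a - (uint32_t)b);
        }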



    or

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    It is saturation - if the result of "A - B" with normal (infinite)
    integer arithmetic is outside the range of the type, it gets saturated
    to the limit of the type.
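
    (A minimal C sketch of this saturating subtraction - again mine, not
    from the post. The intermediate result is computed in a wider type, so
    it cannot itself overflow:)

        #include <stdint.h>

        int32_t sub_sat(int32_t a, int32_t b)
        {
            int64_t r = (int64_t)a - (int64_t)b;  /* exact in 64 bits */
            if (r > INT32_MAX) return INT32_MAX;  /* saturate upwards */
            if (r < INT32_MIN) return INT32_MIN;  /* saturate downwards */
            return (int32_t)r;
        }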


    or

    4. "A - B" is defined as either the result of normal integer subtraction
    if that fits within the range of the type, or an exception condition
    otherwise.

    Again, "an exception condition" is surely not mathematics.


    Surely it is.

    You would have the exception as part of the set of outputs for the function.
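
    (A sketch of definition 4 in C - mine, and it leans on the GCC/Clang
    extension __builtin_sub_overflow, since standard C has no spelling for
    this; abort() stands in for whatever the "exception" output is:)

        #include <stdint.h>
        #include <stdlib.h>

        int32_t sub_checked(int32_t a, int32_t b)
        {
            int32_t r;
            if (__builtin_sub_overflow(a, b, &r)) /* did it fit the type? */
                abort();                          /* the exception output */
            return r;
        }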

    I put it to you that you are thinking like an engineer and producing definitions which are suitable for engineering. That's the right thing
    to do, IMO, but it should be called engineering not mathematics.


    Nope, it is mathematics. (But the choice of the mathematical
    definitions is heavily influenced by useful engineering practices -
    there is no limit to definitions that are still solid mathematical
    definitions but are utterly useless in practice!)

    ...


    Unfortunately, many programming tutorials encourage programmers to
    simply assume that any value is 'large enough' and so will behave
    according to the normal rules of mathematics. But as everyone here
    knows, that is not always the case, as was shown in the example I saw
    discussed recently of

        255 + 1

    what that results in is decided by what I would call 'engineering', and
    not by the normal rules of mathematics.


    Engineering is about applying the mathematical (and perhaps physical,
    chemical, etc.) laws to practical situations. An engineer who does not
    understand that there is a mathematical basis for what they do is in the
    wrong profession. (I certainly don't mean that they should understand
    the mathematics involved - but they should understand that there /is/
    mathematics involved, and that the mathematics is what justifies the
    rules and calculations they apply.)


    Well, I would say that engineering includes being aware of and
    accommodating limits - including those limits where simple mathematics
    breaks down and no longer applies. YMMV.


    Mathematics doesn't break down.  That's the point.


    I said "simple mathematics" breaks down.


    "Simple" is in the eye of the beholder. But certainly you need to be
    happy with something more than standard infinite integer arithmetic.

    In fact, it breaks down so much that even something as simple as plain subtraction is better described by an algorithm.


    Algorithms are mathematical recipes. You can't escape :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Thu Oct 7 14:59:52 2021
    On 07/10/2021 14:02, David Brown wrote:
    On 05/10/2021 21:55, James Harris wrote:

    No mathematics that I am aware of has the concept of "to fit in the
    range of the type" but maybe you know different.


    It's just modulo arithmetic.

    Modulo arithmetic seems to be mainly defined over a range that starts
    from zero.

    To have a range between any two limits, you need to start applying offsets.

    The sort of modulo arithmetic used to give a free overflow pass to C's
    unsigned operations is much more rigid: the (inclusive) range is a power-of-two, and starts from 0.

    Which I always thought was too specific; if someone wants modulo
    behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
    not have language support.
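
    (A hypothetical helper - not a feature of any language under
    discussion - showing what such support would have to compute for an
    arbitrary inclusive range lo..hi:)

        /* Wrap x into the inclusive range lo..hi. */
        int wrap_range(int x, int lo, int hi)
        {
            int span = hi - lo + 1;
            int r = (x - lo) % span;  /* C's % can be negative here */
            if (r < 0) r += span;
            return lo + r;
        }

        /* e.g. wrap_range(100 + 1, 1, 100) == 1 */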

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    It is saturation - if the result of "A - B" with normal (infinite)
    integer arithmetic is outside the range of the type, it gets saturated
    to the limit of the type.

    That's a confusing way of expressing it. Presumably 'mid' refers to the
    middle argument, but it won't be the middle numerically if outside the
    range.

    Also, if actually implementing code that does this, checking whether it
    is in-range might be tricky if done /after/ you've evaluated A-B, if
    using int-sized arithmetic.

    I use an actual operator called clamp, where your example becomes:

    clamp(A-B, int.min, int.max)

    But for those limits, the calculation must be done with a type of wider
    range than int. (Clamp is defined on top of min/max ops.)

    Again, "an exception condition" is surely not mathematics.


    Surely it is.

    You would have the exception as part of the set of outputs for the function.

    So, how do you get from deep inside one formula to one elsewhere, or can mathematics also define 'goto'?

    The fact is that mathematics is not a programming language, otherwise we
    would all be coding in it. Real code has a dynamic element that is
    missing from maths.

    And if source code was mathematics, then you wouldn't need to run it
    (saving the bother of writing compilers and implementations, or even
    needing to buy a computer); you'd just look at it!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Thu Oct 7 17:09:49 2021
    On 07/10/2021 14:59, Bart wrote:
    The fact is that mathematics is not a programming language, otherwise
    we would all be coding in it. Real code has a dynamic element that is
    missing from maths.

    Have you not been reading my articles? "Dynamic elements"
    may be missing from the maths that you know, but it is present in
    lots of other maths, not least [and not only] ...

    And if source code was mathematics, then you wouldn't need to run it
    (saving the bother of writing compilers and implementations, or even
    needing to buy a computer); you'd just look at it!

    ... symbolic algebra packages and logic languages. Also
    worth reminding you that some of the earliest languages [FORmula
    TRANslatio, ALGOrithmic Language, Flowmatic, various Autocodes
    and much else] were intended to translate maths into computer
    terms. The "source code" was as close to maths as they were
    able to get given the computing resources of the period.

    You don't just "look at" mathematics, you use it to find
    [eg] the solution to differential equations, and these days that
    is commonly done from the DE itself, not from C code to implement
    some solution technique. [Of course, there is likely to be C
    code and machine code "under the hood", but what the user writes
    is mathematics.]

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Joplin

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Thu Oct 7 18:20:49 2021
    On 07/10/2021 12:11, David Brown wrote:
    On 06/10/2021 13:08, Bart wrote:

    These are somewhat unsatisfactory.

    So, the reality is a bit more involved. But it depends on what the
    purpose of your mathematical definitions are: is it just to 'look
    pretty'; or is it informal user documentation; or would it actually be
    input to some compiler generator?


    They can certainly be used in code generation, such as for code
    optimisation and simplification. They can be used in proving code correctness (which can sometimes be done automatically). They can be
    used to reason about code and algorithms, or document them, or specify them.

    Here's my specification for abs() as used in my dynamic language.

    To avoid having to invent some syntax, I've chosen to use the language's
    own syntax; actually this program will run as it is (which is why I had
    to use 'myabs' instead of 'abs'):

    function myabs(x)=
        case x.type
        when int then
            case
            when x = x.min then x
            when x >= 0 then x
            when x < 0 then -x
            esac

        when word then
            word(abs(int(x)))

        when real then
            case
            when x >= 0 then x
            else -x                 # includes -0
            esac

        when decimal then
            case
            when x = infinity then x
            when x = nan then x
            when x = -infinity then infinity
            when x >= 0 then x
            else -x
            esac

        else
            abort("type error")
            0
        esac
    end

    Some notes:

    * I chose to use 'x.min' instead of 'int.min', as that would make that
    bit of code more reusable

    * The spec for decimal is a lot more complicated than its
    implementation, which just sets a 'neg' flag to zero (after making a
    mutable copy). Otherwise some compares won't work with infinity etc.

    * In a much older version, "abs" and "magnitude" were used
    interchangeably, and were defined for (x,y,z) vectors too, returning
    their length

    * My static language also has to deal with narrower int types, and with
    in-place abs:= operations. But I couldn't express such a function in
    that language; x must be a specific type.

    I could probably still use this syntax to define the spec, but I
    couldn't test it by running it here.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Andy Walker on Thu Oct 7 20:57:57 2021
    On 07/10/2021 14:27, Andy Walker wrote:
    On 07/10/2021 11:52, David Brown wrote:
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    [... overflow ...]

        In the early days of computing, array-bound checks and
    overflow checks were automatic unless you actively turned them
    off.  These days, you ought to be able to add [eg] use of
    uninitialised variables, use of storage after "free", following
    a null pointer, and probably others.  It used to be important
    to be able to turn them off, as they dragged in extra code and
    extra time into limited storage and run-time.  These days, such
    factors really, really don't impinge on almost all normal work.
    For all but a tiny proportion of work, time is dominated by
    disc/network transfers or waiting for the user to type/click,
    and space by large data structures rather than a bit of extra
    code -- esp when errors are detected by hardware interrupt
    rather than by user checks.
        
        Perhaps we should get back closer to the old days?
    If your program has a bug, would you rather that the program
    stops when the bug is first manifest, or that it continues
    until something more catastrophic happens?  Or, even worse,
    that it continues but gives wrong results with no indication?
    How much developer/user time is wasted dealing with malware
    that couldn't exist in a safer environment?


    I appreciate your point, but disagree somewhat.

    First off, I agree that for a lot of code, run-time speed of the code
    itself is (or should be) a minor concern. But the answer is not to have run-time checking in a language like C - the answer is not to use a
    language like C (or Ada, or Java, or C++) for such tasks. Rather,
    languages like Python or other higher level languages should be used -
    with the language choice depending on the type of task. Then there
    simply isn't a question of overflows of integers or buffers, and array
    bound errors are caught at run-time (though most opportunities for such
    errors are avoided by having proper strings, high-level structures like
    hashmaps and queues, etc.).

    Secondly, I /don't/ want the bug to be found as soon as it manifests
    itself at run-time. I want it to be found /before/ run-time. And one
    of the things that lets C and C++ tools mark something like integer
    overflow as an error (if it can be seen at compile-time) is precisely
    the fact that it is undefined behaviour. If there is a defined
    behaviour - including throwing a run-time exception of some sort - the
    the compiler or more advanced static analysis can't stop you and say
    it's a mistake.

    Thirdly, modern C and C++ (and Ada and other language) tools /do/
    support checking at run-time. People just have to choose to use them.
    (And again, these rely on undefined behaviour.)

    I do agree that hiding the mistake and carrying on is often the worst
    option. You get that when you try to define the behaviour of everything
    in a language.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Thu Oct 7 22:10:28 2021
    On 07/10/2021 15:59, Bart wrote:
    On 07/10/2021 14:02, David Brown wrote:
    On 05/10/2021 21:55, James Harris wrote:

    No mathematics that I am aware of has the concept of "to fit in the
    range of the type" but maybe you know different.


    It's just modulo arithmetic.

    Modulo arithmetic seems to be mainly defined over a range that starts
    from zero.

    To have a range between any two limits, you need to start applying offsets.


    Modulo arithmetic is normally defined in mathematics using equivalence
    classes. If you are working with integers modulo 4 (to take a nice
    small example), ℤ/4ℤ is a set with four members that I'll call { u0, u1, u2, u3 }. The set u0 is { ..., -8, -4, 0, 4, 8, ... }, i.e., the set of
    all integers of the form 4x + 0. Similarly, u2 is all integers of the
    form 4x + 2.

    With that definition, you don't need any offsets - "u-2" is equal to
    "u2", and the set { u-2, u-1, u0, u1 } is exactly the same as { u0, u1,
    u2, u3 }. (You can also use { u8, u-7, u22, u3 } if you like.)

    It is most common, of course, and simplest to think about, if you use
    the unsigned values as identifiers rather than signed values.

    You can also imagine that any number has a "+4x" attached where "x" is
    an arbitrary integer that you can modify at will for your convenience.
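
    (A related practical wrinkle, with a small sketch of mine: C's %
    operator does not pick the 0..3 representative for negative operands,
    so computing the canonical member of a class takes an extra step:)

        /* Representative in 0..3 of the equivalence class of x modulo 4. */
        int rep_mod4(int x)
        {
            int r = x % 4;               /* may be negative when x < 0 */
            return (r < 0) ? r + 4 : r;
        }

        /* rep_mod4(-2) == 2, i.e. "u-2" is the same class as "u2" */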

    The sort of modulo arithmetic used to give a free overflow pass to C's unsigned operations is much more rigid: the (inclusive) range is a power-of-two, and starts from 0.

    Which I always thought was too specific; if someone wants modulo
    behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
    not have language support.


    I believe Ada has this kind of language support (I'm sure Dmitry will
    let us know). I don't know how useful it would be in practice - I
    suspect it's the kind of thing you get for free when you provide
    generally useful type characteristics. If you say a type can have a
    range from a
    low value to a high value, and you say you can choose overflow behaviour
    to be wrapping, undefined, saturating or run-time error, then along with clearly useful types like C's unsigned and signed integers, and types
    like "0 to 99", you also get support for modulo types from 1 to 100.

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    It is saturation - if the result of "A - B" with normal (infinite)
    integer arithmetic is outside the range of the type, it gets saturated
    to the limit of the type.

    That's a confusing way of expressing it. Presumably 'mid' refers to the middle argument, but it won't be the middle numerically if outside the
    range.

    I meant "mid" as being the middle value when ordered numerically - the
    median, if you prefer. (There would be no point in saying the middle
    argument - that would simply be "A - B".) I thought my terminology here
    would be obvious, but when two people have found it confusing or
    ambiguous, clearly that was my mistake. But hopefully you know what I
    mean now.


    Also, if actually implementing code that does this, checking whether it
    is in-range might be tricky if done /after/ you've evaluated A-B, if
    using int-sized arithmetic.


    Implementation is a different matter, as I have already said - these are specifications, not implementations. A specification can happily
    include expressions that are impossible to implement directly. I could
    specify that the constant "x" be defined as :

    x = Σ (i = 0 to ∞) 2^(-i)

    (I hope my unicode / ASCII art looks okay!)

    An implementation involving calculating an infinite sum is not going to
    work. An implementation setting "x" to 2 would be fine.

    How you implement a saturated subtraction will depend on the target
    processor, the language, the tools, the preferences of the programmer.
    The point of the specification is that it tells all users of subtraction
    what it does, and it tells the implementer of subtraction what it should
    do. People using the subtraction should ignore all details of how it is implemented.

    I use an actual operator called clamp, where your example becomes:

        clamp(A-B, int.min, int.max)

    But for those limits, the calculation must be done with a type of wider
    range than int. (Clamp is defined on top of min/max ops.)

    Again, "an exception condition" is surely not mathematics.


    Surely it is.

    You would have the exception as part of the set of outputs for the
    function.

    So, how do you get from deep inside one formula to one elsewhere, or can mathematics also define 'goto'?

    The fact is that mathematics is not a programming language, otherwise we would all be coding in it. Real code has a dynamic element that is
    missing from maths.

    And if source code was mathematics, then you wouldn't need to run it
    (saving the bother of writing compilers and implementations, or even
    needing to buy a computer); you'd just look at it!


    Haven't you read any Knuth? "Beware of the following code. I have not
    tested it, merely proven it to be correct." :-)

    When I was at university, the great majority of our code was written by
    hand on paper or the tutor's blackboard. We proved it worked -
    typically by deriving it logically and mathematically from initial specification through to executable source code. Most of the time I
    only ever used a computer to type up essays.

    I agree that for a lot of real-world code, such processes are seriously impractical. And the kind of programming language you use makes a big difference to how easy it is to reason mathematically about the code.
    But it is always /possible/.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Fri Oct 8 08:49:15 2021
    On 2021-10-07 22:10, David Brown wrote:
    On 07/10/2021 15:59, Bart wrote:

    Which I always thought was too specific; if someone wants modulo
    behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
    not have language support.

    I believe Ada has this kind of language support

    No, Ada has only normal modular numbers. Though it would be easy to
    implement a user-defined numeric type with these properties as you can
    override +,-,*,/ etc.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Fri Oct 8 13:53:34 2021
    On 08/10/2021 08:49, Dmitry A. Kazakov wrote:
    On 2021-10-07 22:10, David Brown wrote:
    On 07/10/2021 15:59, Bart wrote:

    Which I always thought was too specific; if someone wants modulo
    behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
    not have language support.

    I believe Ada has this kind of language support

    No, Ada has only normal modular numbers. Though it would be easy to
    implement a user-defined numeric type with these properties as you can override +,-,*,/ etc.


    OK, so while you have built-in support for, say, a 0 to 99 modular type,
    you'd have to make your own class for a 1 to 100 modular type? Fair enough.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Fri Oct 8 13:35:06 2021
    On 07/10/2021 13:38, David Brown wrote:
    On 06/10/2021 17:37, James Harris wrote:
    On 06/10/2021 00:04, Bart wrote:

    ...

    I don't know why everyone seems determined to question my credentials.

    Not everyone! IMO you are correct and they are wrong.

    I for one am only questioning /some/ of Bart's credentials in relation
    to /some/ of the things he has said, not as a general point. For many
    things in this group, his credentials include experience and
    achievements well beyond anything I have done in practice. But we all
    have occasions when we mistake our own subjective opinions for objective facts, or when we have strong opinions that are not based on knowledge
    or experience.

    OT, but I never understand why, when discussing ideas, people care about
    credentials. Ideas should be judged on their merits, not on the
    qualifications of the person putting them forward.

    In fact, if one wants to break new ground one really needs to invite
    ideas which have not come from established schools of thought.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Fri Oct 8 16:09:32 2021
    On 08/10/2021 14:35, James Harris wrote:
    On 07/10/2021 13:38, David Brown wrote:
    On 06/10/2021 17:37, James Harris wrote:
    On 06/10/2021 00:04, Bart wrote:

    ...

    I don't know why everyone seems determined to question my credentials.

    Not everyone! IMO you are correct and they are wrong.

    I for one am only questioning /some/ of Bart's credentials in relation
    to /some/ of the things he has said, not as a general point.  For many
    things in this group, his credentials include experience and
    achievements well beyond anything I have done in practice.  But we all
    have occasions when we mistake our own subjective opinions for objective
    facts, or when we have strong opinions that are not based on knowledge
    or experience.

    OT, but I never understand why, when discussing ideas, people care about
    credentials. Ideas should be judged on their merits, not on the
    qualifications of the person putting them forward.

    Ideas can stand on their own merits. The relevance of discussing them,
    the effort you put into the discussion, and the weight or consideration
    you give to ideas can depend on qualifications or credentials (and I
    mean these terms in a general sense - not formal academic qualifications).

    If I am discussing a topic which I know about, and the other person
    knows little and knows that they know little, then I should be patient
    and try to give good explanations. If the other person knows a lot,
    then there is no point in my wasting my time - I can go straight to the
    point. If the other person knows only a little, but /thinks/ that they
    know a lot, then it's unlikely there will be much fruitful discussion on
    the topic.

    If I am listening in to a discussion on a topic that I know little about
    (I can't think of many examples - football, perhaps :-) ) and am not in
    a position to judge the merit of two competing ideas, then the
    qualifications of the people putting them forward help you judge.

    So if you have a sore tooth, and know nothing about dentistry, you might solicit ideas and opinions from several people. A professional dentist
    might suggest you need a filling. A toothologist might suggest tying a
    live frog to your jaw for a couple of days. You don't know which is the
    best idea, so you look at the qualifications. The dentist has a
    professional qualification - clearly he is just after your money. The toothologist is quoting from Pliny the Elder - a naturalist and author
    who is still well known after two thousand years. Obviously you judge
    the toothologist's suggestion as the better-qualified idea.


    In fact, if one wants to break new ground one really needs to invite
    ideas which have not come from established schools of thought.


    That's fine - as long as you understand that the /vast/ majority of such
    ideas will be wrong. Being educated and/or experienced in a field is no guarantee that you will be right or have the best ideas, but it is a
    pretty good guide in practice.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Andy Walker on Fri Oct 8 16:50:21 2021
    On 07/10/2021 13:27, Andy Walker wrote:
    On 07/10/2021 11:52, David Brown wrote:
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    [... overflow ...]

        In the early days of computing, array-bound checks and
    overflow checks were automatic unless you actively turned them
    off.  These days, you ought to be able to add [eg] use of
    uninitialised variables, use of storage after "free", following
    a null pointer, and probably others.  It used to be important
    to be able to turn them off, as they dragged in extra code and
    extra time into limited storage and run-time.  These days, such
    factors really, really don't impinge on almost all normal work.
    For all but a tiny proportion of work, time is dominated by
    disc/network transfers or waiting for the user to type/click,
    and space by large data structures rather than a bit of extra
    code -- esp when errors are detected by hardware interrupt
    rather than by user checks.

        Perhaps we should get back closer to the old days?
    If your program has a bug, would you rather that the program
    stops when the bug is first manifest, or that it continues
    until something more catastrophic happens?  Or, even worse,
    that it continues but gives wrong results with no indication?
    How much developer/user time is wasted dealing with malware
    that couldn't exist in a safer environment?

    Absolutely! It's best to detect a bug when it first arises rather than
    when it has caused secondary errors. One of the worst cases is a bad
    pointer, P, which for an operation like C's

    *P = x

    can cause damage to some innocent bystander which has nothing whatsoever
    to do with the cause of the problem. Such a bug is evil personified! As
    well as being fiendish to track down it can cause a correct program (the innocent bystander) to generate incorrect output which is never even
    recognised as being incorrect. In computing terms it's hard to think of anything worse!

    All-in-all, the cost of detecting run-time errors is much lower than the
    cost of dealing with the consequences of not detecting them.

    Better still is to detect errors at compile time, of course. In fact, in
    terms of a product's lifecycle the earlier a problem is detected the
    better.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Fri Oct 8 16:40:12 2021
    On 07/10/2021 12:14, Bart wrote:
    On 07/10/2021 11:52, David Brown wrote:
    On 06/10/2021 18:15, James Harris wrote:
    On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
    On 2021-10-06 16:56, James Harris wrote:

    ...

    That's maths. Engineering includes how to respond to overflow (or the >>>>> potential thereof).

    No, engineering is how to *avoid* overflows.

    ...

    Fine, Dmitry. You try to write code to avoid

       A * B

    overflowing before you execute the multiply.

    ...

    Real programs may also have to end up multiplying two values that are
    runtime inputs. They may not have any constraints either.

    That was the kind of scenario I had in mind. (Dmitry's point was
    different.) It's likely infeasible given A and B (two objects of the
    same integer type) to determine whether they will overflow their type
    when multiplied. It's better simply to multiply them and detect whether overflow arose therefrom.
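
    (A sketch of "multiply then detect" for 32-bit integers in C - my
    example, assuming a wider type is available to hold the exact product:)

        #include <stdint.h>
        #include <stdbool.h>

        bool mul_overflowed(int32_t a, int32_t b, int32_t *r)
        {
            int64_t wide = (int64_t)a * (int64_t)b; /* exact for 32-bit inputs */
            *r = (int32_t)wide;     /* wrapped result (implementation-defined
                                       conversion when out of range) */
            return wide > INT32_MAX || wide < INT32_MIN;
        }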


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Fri Oct 8 16:52:06 2021
    On 07/10/2021 12:32, David Brown wrote:
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

          abs(x) = ⎧ x, if x >= 0
                   ⎨
                   ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code. If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers). It is, however, still
    maths. Two possibilities for n-bit two's complement signed integers
    could be :

           abs(x) = ⎧ x, if x >= 0
                    ⎨ -x, if x < 0 and x > int_min
                    ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria?

    Those above: 2's complement, etc.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Fri Oct 8 16:54:33 2021
    On 08/10/2021 15:09, David Brown wrote:
    On 08/10/2021 14:35, James Harris wrote:
    On 07/10/2021 13:38, David Brown wrote:
    On 06/10/2021 17:37, James Harris wrote:
    On 06/10/2021 00:04, Bart wrote:

    ...

    I don't know why everyone seems determined to question my credentials.
    Not everyone! IMO you are correct and they are wrong.

    I for one am only questioning /some/ of Bart's credentials in relation
    to /some/ of the things he has said, not as a general point.  For many
    things in this group, his credentials include experience and
    achievements well beyond anything I have done in practice.  But we all
    have occasions when we mistake our own subjective opinions for objective
    facts, or when we have strong opinions that are not based on knowledge
    or experience.

    OT, but I never understand why, when discussing ideas, people care about
    credentials. Ideas should be judged on their merits, not on the
    qualifications of the person putting them forward.

    Ideas can stand on their own merits. The relevance of discussing them,
    the effort you put into the discussion, and the weight or consideration
    you give to ideas can depend on qualifications or credentials (and I
    mean these terms in a general sense - not formal academic qualifications).

    The thing about computing and languages is that anyone can have a go;
    you don't need to be an academic or a professional.

    That's been the case for decades actually, but with a higher bar in the
    past (eg. no freely downloadable software 40 years ago).

    So, when you /successfully/ do your own thing for years, and in
    isolation (I've worked from home since '85), you develop an irreverent
    attitude to formal methods, and less tolerance or patience for many
    things too, such as cumbersome-to-use tools.

    Now, when people like you and DAK come along, your attitude is like that
    of someone involved in industrial-scale food production questioning the
    methods of someone cooking meals in their own kitchen.

    They may not have the relevant degrees or training or professional
    experience, and the methods would not suit mass-production, but they can
    still produce delicious food that serves the same end purpose.

    It may not be a coincidence that the small company I initially worked
    for, also designed and manufactured its own microcomputers, including
    making its own PCBs, /and/ developed its own OS for them.

    So if you have a sore tooth, and know nothing about dentistry, you might solicit ideas and opinions from several people. A professional dentist
    might suggest you need a filling.

    (Actually, I did do my own filling last year; a temporary one, but it
    lasted four months.)

    In fact, if one wants to break new ground one really needs to invite
    ideas which have not come from established schools of thought.


    That's fine - as long as you understand that the /vast/ majority of such ideas will be wrong.

    It's not wrong when they work. It /is/ annoying when systems like Unix
    and its C language have had so many bad influences, whose customs are
    now so widespread that people come to view those choices as the only
    sensible ones, and anything else as wrong. Examples:


                          Unix/C                 Mine**

    File systems          Case sensitive         Case insensitive
    Shell commands        Case sensitive         Case insensitive
    Source code           Case sensitive         Case insensitive

    Text file I/O         Char-based             Line-based

    Array indexing        0-based only           1-based/N-based

    Read/Print            User-functions         Statements

    Block delimiting      {...} braces           if-then-else-end etc

    For-loops             for(...;...;...)       for i in a..b

    Build an app          gcc a.c b.c c.c..      mm a
                          -o:a.exe
                          make
                          Cmake etc

    (** That means the OSes I've used, various DEC ones, MS etc. And my own languages which themselves were influenced by the languages I'd used on
    those.)


    Being educated and/or experienced in a field is no
    guarantee that you will be right or have the best ideas, but it is a
    pretty good guide in practice.

    I'd imagine there are a few multi-millionaire software developers who
    haven't had the right education either. (But they're more adept than me
    with other people's software.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Fri Oct 8 17:00:52 2021
    On 08/10/2021 15:09, David Brown wrote:
    On 08/10/2021 14:35, James Harris wrote:

    ...

    If I am listening in to a discussion on a topic that I know little about
    (I can't think of many examples - football, perhaps :-) ) and am not in
    a position to judge the merit of two competing ideas, then the
    qualifications of the people putting them forward help you judge.

    I disagree big time. I can see where you are coming from but IMO such an
    approach is horribly limiting. I'd suggest instead looking for logic in
    each argument, and developing a feel for lines of enquiry that might
    lead somewhere useful. The best answer may be neither of those
    presented, but something in one argument or the other may trigger a new
    way of thinking which results in a useful direction of travel.


    So if you have a sore tooth, and know nothing about dentistry, you might solicit ideas and opinions from several people. A professional dentist
    might suggest you need a filling. A toothologist might suggest tying a
    live frog to your jaw for a couple of days. You don't know which is the
    best idea, so you look at the qualifications. The dentist has a
    professional qualification - clearly he is just after your money. The toothologist is quoting from Pliny the Elder - a naturalist and author
    who is still well known after two thousand years. Obviously you judge
    the toothologist's suggestion as the better-qualified idea.

    If you want to know what's best ask a toothsayer. :-)



    In fact, if one wants to break new ground one really needs to invite
    ideas which have not come from established schools of thought.


    That's fine - as long as you understand that the /vast/ majority of such ideas will be wrong. Being educated and/or experienced in a field is no guarantee that you will be right or have the best ideas, but it is a
    pretty good guide in practice.

    The number of wrong ideas doesn't matter much. It's fairly easy to
    filter them out.

    It's *far* better to encourage free thinking (and to filter what comes
    back) than to discourage free thinking by lauding tradition.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Fri Oct 8 17:33:17 2021
    On 07/10/2021 19:57, David Brown wrote:
    [I wrote:]
        In the early days of computing, array-bound checks and
    overflow checks were automatic [...].
        Perhaps we should get back closer to the old days?
    If your program has a bug, would you rather that the program
    stops when the bug is first manifest, [...]?
    I appreciate your point, but disagree somewhat.
    First off, I agree that for a lot of code, run-time speed of the code
    itself is (or should be) a minor concern. But the answer is not to have run-time checking in a language like C - the answer is not to use a
    language like C (or Ada, or Java, or C++) for such tasks. Rather,
    languages like Python or other higher level languages should be used -
    with the language choice depending on the type of task.

    Yes, but that is putting the responsibility in the wrong
    place. Programmers know C [or think they do]; no point telling
    them to use Python [which by hypothesis they don't know] instead.
    Further, almost all of the software I use was written by someone
    else; I have no control over how it was written. The best I can
    hope for, [somewhat] realistically, is that my computer will tell
    me when that software does something buggy. For example, if my
    computer has a hardware interrupt when an overflow occurs, that
    overflow cannot sneak through undetected. Computers of the '60s
    [speaking generalities] had that; in the '70s, it was replaced
    by setting overflow flags, which had to be tested for after each
    arithmetic operation. When space and time were tight, testing
    went by the board; in my experiments on the ICL 1906A, adding
    overflow checks to a program slowed it down by 32% and cost a
    significant amount of [tight] storage. By contrast, on Atlas,
    the hardware interrupt was essentially a free service to all
    programs. Guess what happened on the '6A, and even more so on
    the PDP-11 when Unix came along.

    Given suitable hardware, most of the more egregious
    errors can be caught "free"; it just requires a co-operating
    computer that controls and checks array accesses, storage
    management, pointers out of range, reading uninitialised
    storage and so on. Then it becomes more expensive to by-pass
    the checks than just to use them [and eliminate most of the
    malware that relies on exploiting their lack].

    Then there
    simply isn't a question of overflows of integers or buffers, and array
    bound errors are caught at run-time (though most opportunities for such
    errors are avoided by having proper strings, high-level structures like
    hashmaps and queues, etc.).

    Again, all good advice, but counsel of perfection; most
    malware exploits people who don't have these things, but could be
    "helped" by hardware that steered them in the right direction.

    Secondly, I /don't/ want the bug to be found as soon as it manifests
    itself at run-time. I want it to be found /before/ run-time.

    Well, yes. QoI issue, for the most part.

    And one
    of the things that lets C and C++ tools mark something like integer
    overflow as an error (if it can be seen at compile-time) is precisely
    the fact that it is undefined behaviour. If there is a defined
    behaviour - including throwing a run-time exception of some sort - then
    the compiler or more advanced static analysis can't stop you and say
    it's a mistake.

    Yes, but there is a difference between software that may or
    may not warn you [another can of worms there!] about potential or
    actual bugs and hardware that /will/ cause your program to crash
    and burn if bad things happen. [What to do about nuclear power
    stations, autonomous vehicles, fighter jets and so on is then
    another matter ....]

    Thirdly, modern C and C++ (and Ada and other language) tools /do/
    support checking at run-time. People just have to choose to use them.
    (And again, these rely on undefined behaviour.)

    You've subtly left it ambiguous whether it's "people" who
    have undefined behaviour or their programs! But yes; and, again,
    you have less to worry about if the hardware insists on the checks.

    I do agree that hiding the mistake and carrying on is often the worst
    option. You get that when you try to define the behaviour of everything
    in a language.

    Indeed. But you also get it when the tools you [rightly]
    describe aren't used; IOW when slapdash programmers are let loose
    on important projects.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Chopin

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Fri Oct 8 18:17:43 2021
    On 2021-10-08 18:00, James Harris wrote:

    The number of wrong ideas doesn't matter much. It's fairly easy to
    filter them out.

    Oh, yes, see how easy it is to filter out inhuman ideas to change the
    society by force, take away our freedoms. That worked nicely in recent
    times...

    It's *far* better to encourage free thinking (and to filter what comes
    back) than to discourage free thinking by lauding tradition.

    Yes, I agree, anybody must have "a day in court," yet some level of
    mutual understanding is required. The problem is with *expressing* ideas
    in a form understandable for many. This is where qualification comes into
    play, not with the ideas. We all know "experts" unable to think out of
    the box and "amateurs" overturning the dogma.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Fri Oct 8 18:27:53 2021
    On 08/10/2021 17:33, Andy Walker wrote:
    On 07/10/2021 19:57, David Brown wrote:
    [I wrote:]
         In the early days of computing, array-bound checks and
    overflow checks were automatic [...].
         Perhaps we should get back closer to the old days?
    If your program has a bug, would you rather that the program
    stops when the bug is first manifest, [...]?
    I appreciate your point, but disagree somewhat.
    First off, I agree that for a lot of code, run-time speed of the code
    itself is (or should be) a minor concern.  But the answer is not to have
    run-time checking in a language like C - the answer is not to use a
    language like C (or Ada, or Java, or C++) for such tasks.  Rather,
    languages like Python or other higher level languages should be used -
    with the language choice depending on the type of task.

        Yes, but that is putting the responsibility in the wrong
    place.  Programmers know C [or think they do];  no point telling
    them to use Python [which by hypothesis they don't know] instead.
    Further, almost all of the software I use was written by someone
    else;  I have no control over how it was written.  The best I can
    hope for, [somewhat] realistically, is that my computer will tell
    me when that software does something buggy.  For example, if my
    computer has a hardware interrupt when an overflow occurs, that
    overflow cannot sneak through undetected.  Computers of the '60s
    [speaking generalities] had that;  in the '70s, it was replaced
    by setting overflow flags, which had to be tested for after each
    arithmetic operation.  When space and time were tight, testing
    went by the board;  in my experiments on the ICL 1906A, adding
    overflow checks to a program slowed it down by 32% and cost a
    significant amount of [tight] storage.  By contrast, on Atlas,
    the hardware interrupt was essentially a free service to all
    programs.  Guess what happened on the '6A, and even more so on
    the PDP-11 when Unix came along.

        Given suitable hardware, most of the more egregious
    errors can be caught "free";  it just requires a co-operating
    computer that controls and checks array accesses, storage
    management, pointers out of range, reading uninitialised
    storage and so on.  Then it becomes more expensive to by-pass
    the checks than just to use them [and eliminate most of the
    malware that relies on exploiting their lack].

    I have an interpreter that does all sorts of runtime checks, mainly for combinations of types that are not supported, as well as array bounds
    checking, but not for overflows. (There are some 200 different checks.)

    Outside of development, it is very rare for a production program to
    trigger an error. (And I've had past versions working hours each day at
    1000 customer sites.)

    So, all those checks are really largely a waste of time.

    Perhaps this is the kind of reasoning that led to the switch from
    interrupt handling to setting flags; I don't know what rationale applied
    there. (Flags can often be /usefully/ triggered, and might have a lot
    fewer overheads than servicing an interrupt.)

    And also, why some language implementations offer runtime checking for
    debug versions, that can be disabled for production versions.

    In my interpreter the overheads are not significant, so they can be kept.
    (Bounds checks are needed anyway to be able to grow arrays.)

    Regarding overflows, I temporarily fixed it up to report overflows on
    /some/ add/sub/mul i64 ops. Most programs I tried worked without
    reporting anything, until one program where overflow was intentional
    (only the bottom 64 bits were of interest).

    Checking overflow is useful in certain situations where the numbers are
    runtime data like my calculator and compiler examples, but there I think
    that check belongs in user-code, in the application, and not be built-in
    to all the language's arithmetic ops, especially without proper means of dealing with the overflow when it happens.

    Interrupts and signals are heavy ways of doing that!

    The language can help by providing some explicit way of doing it, for
    example:

    (c, overflow) := checkedadd(a, b)

    But it should not interfere with:

    c := a + b
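
    (A rough C analogue of checkedadd - my sketch, using a portable
    pre-check rather than a compiler builtin:)

        #include <limits.h>
        #include <stdbool.h>

        bool checkedadd(int *c, int a, int b)
        {
            if ((b > 0 && a > INT_MAX - b) ||
                (b < 0 && a < INT_MIN - b))
                return false;   /* would overflow; *c is left untouched */
            *c = a + b;         /* cannot overflow now */
            return true;
        }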

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Fri Oct 8 20:40:59 2021
    On 07/10/2021 14:02, David Brown wrote:
    On 05/10/2021 21:55, James Harris wrote:
    On 04/10/2021 19:19, David Brown wrote:


    ...

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    It is saturation - if the result of "A - B" with normal (infinite)
    integer arithmetic is outside the range of the type, it gets saturated
    to the limit of the type.

    OK, though if it's saturation why is it called mid?


    ...


    Algorithms are mathematical recipes. You can't escape :-)

    Algorithms are more general than that. You could have an algorithm which included no mathematics at all but it would still be an algorithm.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Bart on Fri Oct 8 22:18:26 2021
    On 2021-10-08 19:27, Bart wrote:

    So, all those checks are really largely a waste of time.

    Wrong, unless you can prove otherwise.

    The language can help by providing some explicit way of doing it, for example:

       (c, overflow) := checkedadd(a, b)

    But it should not interfere with:

       c := a + b

    The modern method of eliminating checks is proving that they do not
    fail, which is pretty easy if you have the type system supporting
    constraints and/or pre-/postconditions. E.g. with the latter one simply
    writes

    c := a + b
    ensure: c = a + b

    and let the compiler hit you with an error that this cannot be proven. Then
    you expand that with a stronger precondition, like

    require: a in 0..max_int / 2
    c := a + b
    ensure: c = a + b

    Now the compiler need not insert any checks.
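
    (A crude runtime rendering of the require/ensure idea in C - my
    sketch; a prover would discharge the assertions at compile time and
    then delete them. Note the precondition has to constrain both
    operands, not just a:)

        #include <assert.h>
        #include <limits.h>

        int add_proved(int a, int b)
        {
            assert(a >= 0 && a <= INT_MAX / 2);  /* require */
            assert(b >= 0 && b <= INT_MAX / 2);  /* require */
            int c = a + b;   /* a + b <= INT_MAX - 1: cannot overflow */
            return c;        /* ensure: c = a + b exactly */
        }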

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Fri Oct 8 22:26:17 2021
    On 2021-10-08 21:40, James Harris wrote:
    On 07/10/2021 14:02, David Brown wrote:
    On 05/10/2021 21:55, James Harris wrote:
    On 04/10/2021 19:19, David Brown wrote:


    ...

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    It is saturation - if the result of "A - B" with normal (infinite)
    integer arithmetic is outside the range of the type, it gets saturated
    to the limit of the type.

    OK, though if it's saturation why is it called mid?

    Because int is in [int_min, int_max]. You could rewrite that using max
    and min

    max (int_min, min (int_max, A - B))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Fri Oct 8 23:55:20 2021
    On 08/10/2021 18:27, Bart wrote:
    I have an interpreter that does all sorts of runtime checks, [...].
    Outside of development, it is very rare for a production program to
    trigger an error. (And I've had past versions working hours each day
    at 1000 customer sites.)

    Key phrase: "outside of development". In that case, there
    ideally ought to be /no/ errors triggered. But, of course, that
    depends on having perfect development. You may be a near-perfect
    developer; the evidence from the real world is that many are not,
    esp given the pressures to get software out of the door ASAP, and
    hang the final testing when deadlines loom.

    So, all those checks are really largely a waste of time.

    Yes. In the same way that having seat belts in my car is
    "largely" a waste of time. In fact, so far it has been a complete
    and utter waste of time. Likewise, insuring my house; and many
    other precautionary activities. In fact, no-one has ever broken
    into my computer or stolen my credit-card details [AFAIK!], so it
    is a waste of time having passwords and PINs. Or perhaps not.

    Much malware derives from exploiting things that ought to
    be unexploitable. In the early days of Unix, in those heady days
    before spam existed, before worms, before "black hats" generally,
    a lot of assumptions were made /and then not tested/, because we
    programmed around what was reasonable rather than possible. For
    example, buffers were often [eg] 512 bytes, because "no-one would
    ever want a filename longer than that". Until someone found that
    if you supplied such a filename, you could overwrite the next
    area of storage, which happened to contain [eg] some passwords.
    /You/ may be confident that all such possibilities are tested for
    in /your/ programs, or that no-one in your 1000 installations has
    the will/ability to exploit the exceptions, but do you have the
    same confidence in whoever supplied your browser, or your mail
    agent, or your banking software? I don't; and my confidence is
    not enhanced by the frequent updates to my system that arrive with
    bug fixes in precisely the sorts of area that should have been
    found [but clearly weren't] by testing or at least by run-time
    checks -- index out of bounds, use after "free", uninitialised
    variables, null-pointer dereference, and so on.

    Perhaps this is the kind of reasoning that led to the switch from
    interrupt handling to setting flags; I don't know what rationale
    applied there.

    I suspect it was simply the cost of the hardware. To
    the manufacturer, software checks are free, and you can do well
    in benchmarks by omitting them.

    (Flags can often be /usefully/ triggered, and might
    have a lot fewer overheads than servicing an interrupt.)

    Then you need mechanisms in the machine code for that.
    But you should have to decide that explicitly, not simply by not
    bothering to check.

    And also, why some language implementations offer runtime checking
    for debug versions, that can be disabled for production versions.

    Dijkstra: "It's like using water-wings while swimming in
    the paddling pool, but discarding them when you swim out to sea"
    [or words to that effect]. Or, I suppose and more realistically,
    like wearing seatbelts while getting your car out of the drive,
    but discarding them when on the motorway.

    [...]
    Checking overflow is useful in certain situations where the numbers
    are runtime data like my calculator and compiler examples, but there
    I think that check belongs in user-code, in the application, and not
    be built-in to all the language's arithmetic ops, especially without
    proper means of dealing with the overflow when it happens.

    But it shouldn't happen! If there is a genuine need for the
    sort of thing you mentioned [getting the bottom 64 bits of a 64-bit
    by 64-bit multiply], then it should be met by suitable instructions
    at the machine-code level, not by switching off checks. [Eg, IIRC,
    Atlas had a double-length accumulator; normally a f-p multiply
    returned the top half after normalisation, but you could ask for
    an un-normalised multiply and then read out the bottom half.]

    Interrupts and signals are heavy ways of doing that!

    An interrupt is, in normal use, free in terms of the code
    and of the time taken, /unless/ it is triggered, which /normally/
    means that your program has just done something rather bad.

    The language can help by providing some explicit way of doing it, for example:
    (c, overflow) := checkedadd(a, b)

    What normal person is going to write that? After all,
    /my/ code is bug-free, so that's a waste of time and space as
    well as unnecessary typing. [Ha!]

    But it should not interfere with:
    c := a + b

    Atlas wouldn't have interfered with that, unless "c"
    differed from "a+b" after it. In that case, do you not want
    to know?

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Chopin

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Sat Oct 9 00:52:26 2021
    On 08/10/2021 23:55, Andy Walker wrote:
    On 08/10/2021 18:27, Bart wrote:
    I have an interpreter that does all sorts of runtime checks, [...].
    Outside of development, it is very rare for a production program to
    trigger an error. (And I've had past versions working hours each day
    at 1000 customer sites.)

        Key phrase:  "outside of development".  In that case, there ideally ought to be /no/ errors triggered.  But, of course, that
    depends on having perfect development.  You may be a near-perfect developer;  the evidence from the real world is that many are not,
    esp given the pressures to get software out of the door ASAP, and
    hang the final testing when deadlines loom.

    So, all those checks are really largely a waste of time.

        Yes.  In the same way that having seat belts in my car is
    "largely" a waste of time.  In fact, so far it has been a complete
    and utter waste of time.  Likewise, insuring my house;  and many
    other precautionary activities.  In fact, no-one has ever broken
    into my computer or stolen my credit-card details [AFAIK!], so it
    is a waste of time having passwords and PINs.  Or perhaps not.

    Not a good analogy. The probabilities are very different, with a huge
    number of environmental factors that differ on each journey, or on
    each day your house/cards are at risk.

    Running the same code each time should give the expected results
    provided consideration has been given to the kinds of inputs expected. If
    the inputs are likely to be wild, then use the explicit checking I show
    below.


        Much malware derives from exploiting things that ought to
    be unexploitable.

    If the idea is to protect against malicious attacks, then detecting that
    a number is bigger than approx 9000000000000000000 is not going to help
    much, if the number is expected to be no bigger than 90!

    So, then what?


    Checking overflow is useful in certain situations where the numbers
    are runtime data like my calculator and compiler examples, but there
    I think that check belongs in user-code, in the application, and not
    be built-in to all the language's arithmetic ops, especially without
    proper means of dealing with the overflow when it happens.

        But it shouldn't happen!  If there is a genuine need for the
    sort of thing you mentioned [getting the bottom 64 bits of a 64-bit
    by 64-bit multiply], then it should be met by suitable instructions
    at the machine-code level, not by switching off checks.  [Eg, IIRC,
    Atlas had a double-length accumulator;  normally a f-p multiply
    returned the top half after normalisation, but you could ask for
    an un-normalised multiply and then read out the bottom half.]

    Interrupts and signals are heavy ways of doing that!

        An interrupt is, in normal use, free in terms of the code
    and of the time taken, /unless/ it is triggered, which /normally/
    means that your program has just done something rather bad.

    The language can help by providing some explicit way of doing it, for
    example:
      (c, overflow) := checkedadd(a, b)

        What normal person is going to write that?  After all,
    /my/ code is bug-free, so that's a waste of time and space as
    well as unnecessary typing.  [Ha!]

    You /don't/ write that in normal code. Only when inputs are unknown.

    I can write such a function now with user code:

    proc start=
        int x, y, z, overflow

        x := int.max
        y := 1

        (z, overflow) := checkedadd(x, y)

        println =z
        println =overflow
    end

    function checkedadd(int a, b)int, int =
        assem
            mov D0, [a]
            mov D1, 0
            mov D2, 1
            add D0, [b]
            cmovo D1, D2
        end
    end

    Output is:

    Z= -9223372036854775808
    OVERFLOW= 1

    But this uses inline ASM, which I prefer not to use. I could try using
    i128 arithmetic, but that is unwieldy. I'd prefer language support: not
    applied to regular arithmetic, but for special checked arithmetic like
    this.

    (If I apply that to the bit in my compiler that reduces constant
    expressions, then for int.max+1, I now get this message:

    "Type Error: Overflow in constant reduction"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Sat Oct 9 12:43:24 2021
    On 08/10/2021 18:00, James Harris wrote:
    On 08/10/2021 15:09, David Brown wrote:
    On 08/10/2021 14:35, James Harris wrote:

    ...

    If I am listening in to a discussion on a topic that I know little about
    (I can't think of many examples - football, perhaps :-) ) and am not in
    a position to judge the merit of two competing ideas, then the
    qualifications of the people putting them forward help you judge.

    I disagree big time. I can see where you are coming from but IMO such an approach is horribly limiting. I'd suggest instead to look for logic in
    each argument and to develop a feel for lines of enquiry that might lead somewhere useful. The best answer may be neither of those presented but something in one argument or the other may trigger a new way of thinking which results in a useful direction of travel.


    Taking /inspiration/ from different sources is fine. Judging ideas
    without bias from their source is fine - /if/ you are in a position to
    judge the ideas properly. If not, then the source of the idea is an aid
    to figuring out what might be the better bet.


    So if you have a sore tooth, and know nothing about dentistry, you might
    solicit ideas and opinions from several people.  A professional dentist
    might suggest you need a filling.  A toothologist might suggest tying a
    live frog to your jaw for a couple of days.  You don't know which is the
    best idea, so you look at the qualifications.  The dentist has a
    professional qualification - clearly he is just after your money.  The
    toothologist is quoting from Pliny the Elder - a naturalist and author
    who is still well known after two thousand years.  Obviously you judge
    the toothologist as the better qualified idea.

    If you want to know what's best ask a toothsayer. :-)



    In fact, if one wants to break new ground one really needs to invite
    ideas which have not come from established schools of thought.


    That's fine - as long as you understand that the /vast/ majority of such
    ideas will be wrong.  Being educated and/or experienced in a field is no
    guarantee that you will be right or have the best ideas, but it is a
    pretty good guide in practice.

    The number of wrong ideas doesn't matter much. It's fairly easy to
    filter them out.


    You might think that. Have a look at the world around you, and see the
    number of people who think magnetic bracelets improve their blood
    circulation, or organic food is healthier, or that their favourite state
    leader or candidate really cares about them, or that you can't get
    pointer or memory errors if you switch to Rust. Humans are extremely
    poor at filtering out bad ideas, and sometimes seem to be most
    enthusiastic about the daftest ideas.

    It's *far* better to encourage free thinking (and to filter what comes
    back) than to discourage free thinking by lauding tradition.


    Encouraging free thinking is great - no one is suggesting otherwise. It
    is what you do /after/ the free thinking, and what you do with the
    ideas, that matters.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Sat Oct 9 12:30:25 2021
    On 08/10/2021 17:52, James Harris wrote:
    On 07/10/2021 12:32, David Brown wrote:
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

           abs(x) = ⎧ x, if x >= 0
                    ⎨
                    ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it
    were
    using 2's complement representation and x was the most negative
    number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code.  If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers).  It is, however, still
    maths.  Two possibilities for n-bit two's complement signed integers
    could be :

            abs(x) = ⎧ x, if x >= 0
                     ⎨ -x, if x < 0 and x > int_min
                     ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria?

    Those above: 2's complement, etc.


    Again - you are mixing implementation and specification. The definition
    is one possible valid choice - as is the definition that left
    "abs(int_min)" undefined. Both are entirely appropriate choices for a
    two's complement fixed-size integer abs function. If you want to add
    "must be a complete function, not a partial one" as a criteria, you have
    to add it.
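
    A sketch of the complete (total) variant in C - the name abs_wrap is
    invented, and the third branch is the two's complement wrap-around
    choice:

      #include <limits.h>

      /* abs_wrap(x): x if x >= 0; -x if INT_MIN < x < 0;
         INT_MIN if x == INT_MIN - defined for every int value. */
      static int abs_wrap(int x)
      {
          if (x >= 0) return x;
          if (x > INT_MIN) return -x;
          return INT_MIN;    /* the one value whose negation overflows */
      }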

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Sat Oct 9 12:46:20 2021
    On 08/10/2021 21:40, James Harris wrote:
    On 07/10/2021 14:02, David Brown wrote:
    On 05/10/2021 21:55, James Harris wrote:
    On 04/10/2021 19:19, David Brown wrote:


    ...

    3. "A - B" is defined as the result of mid(int_min, A - B, int_max).

    That's an interesting one! I'm not sure what it means but it's
    definitely interesting. ;-)


    It is saturation - if the result of "A - B" with normal (infinite)
    integer arithmetic is outside the range of the type, it gets saturated
    to the limit of the type.

    OK, though if it's saturation why is it called mid?


    Because it is the mid-point value of these three points.

    There are many other names that could be used here - saturation,
    clamping, limiting, truncating, median.
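
    As a sketch in C (names invented; assuming lo <= hi), the
    median-of-three reading and the clamping reading coincide:

      /* mid(lo, x, hi): the middle of the three values - when lo <= hi
         this is exactly "clamp x into [lo, hi]". */
      static long long mid(long long lo, long long x, long long hi)
      {
          if (x < lo) x = lo;    /* clamp from below */
          if (x > hi) x = hi;    /* clamp from above */
          return x;
      }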

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to All on Sat Oct 9 12:25:23 2021
    On 08/10/2021 18:33, Andy Walker wrote:

    (You've made some good and interesting points - I'm snipping just to
    reduce the post size.)


        Indeed.  But you also get it when the tools you [rightly]
    describe aren't used;  IOW when slapdash programmers are let loose
    on important projects.


    This is the key point - program quality and bugs are a people problem,
    more than a language or implementation problem. If you automate
    run-time checks in C, poor programmers will get lazier about checking themselves and find new ways to make bad code. If they move to
    languages that are higher level and avoid the risks of, say, buffer
    overflows, they will be more productive and generate more bad code.

    I don't know the answer to this one. But I think "add automatic safety
    checks to the language" is not it - that kind of thing reduces the
    efficiency of the code and makes life harder for the good programmers,
    but will not really make a significant difference for the poor
    programmers.

    It's fine to improve methods (whether in the language definition, the
    compiler, related tools, development strategies, etc.) aimed at reducing
    the risk of accidental mistakes or helping spot them faster - no
    programmer is infallible. But the ignorant, incompetent, lazy or
    wilfully stubborn programmers won't use these anyway.

    What we need is a linter for the programmers, not the programs. Then it
    is possible to deal with the real problem through courses, supervising,
    closer cooperation, changes to development strategies (like code
    reviews), finding new tasks or new languages that suit them better, etc.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sat Oct 9 13:18:40 2021
    On 2021-10-09 12:43, David Brown wrote:

    You might think that. Have a look at the world around you, and see the number of people who think magnetic bracelets improve their blood circulation, or organic food is healthier, or that their favourite state leader or candidate really cares about them, or that you can't get
    pointer or memory errors if you switch to Rust.

    OT

    Central and East European medicine, beginning in the '30s, invented a huge
    number of physiotherapeutic medical procedures like electrophoresis,
    exposing blood to UV light, ultrasound warming, etc. Magnets are an
    accessible and inexpensive remnant of that glorious epoch! (:-))

    BTW, German health insurances still pay for some of these, and some do
    for homeopathy as well.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Sat Oct 9 15:21:04 2021
    On 09/10/2021 13:18, Dmitry A. Kazakov wrote:
    On 2021-10-09 12:43, David Brown wrote:

    You might think that.  Have a look at the world around you, and see the
    number of people who think magnetic bracelets improve their blood
    circulation, or organic food is healthier, or that their favourite state
    leader or candidate really cares about them, or that you can't get
    pointer or memory errors if you switch to Rust.

    OT

    Central and East European medicine, beginning in the '30s, invented a huge
    number of physiotherapeutic medical procedures like electrophoresis,
    exposing blood to UV light, ultrasound warming, etc. Magnets are an
    accessible and inexpensive remnant of that glorious epoch! (:-))


    Nah - modern magnetic bracelets are based on the logic that your blood
    contains iron, and iron is magnetic, so magnetic bracelets are good for
    your blood flow. It's that simple - and that stupid.

    BTW, German health insurances still pay for some of these, and some do
    for homeopathy as well.


    Yes, many health services provide support for some pretty questionable "alternative medicines". Most that are supported are harmless, and the
    placebo effect and simply the feeling that someone is listening to you
    and helping you can definitely be beneficial, and if it can relax you
    and reduce stress, that's good. They don't support the crazy dangerous
    stuff, like so-called "Miracle Mineral Supplement" (a.k.a. bleach).

    Homeopathy has its roots in Germany, I believe, and it used to be a very successful treatment. But that's because the homeopathic hospitals were
    kept clean, you were given good food, rest, baths, and generally looked
    after until you got better (or died anyway). A small glass of water or
    a sugar pill makes not the slightest difference either way. However,
    the "conventional medicine" of the time was dirty, crowded hospitals
    full of rampant infection and unwashed patients and doctors, where the
    main treatments involved blood-letting and medicine made from mercury
    and other poisons.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sat Oct 9 19:08:47 2021
    On 2021-10-09 15:21, David Brown wrote:
    On 09/10/2021 13:18, Dmitry A. Kazakov wrote:
    On 2021-10-09 12:43, David Brown wrote:

    You might think that.  Have a look at the world around you, and see the
    number of people who think magnetic bracelets improve their blood
    circulation, or organic food is healthier, or that their favourite state
    leader or candidate really cares about them, or that you can't get
    pointer or memory errors if you switch to Rust.

    OT

    Central and East European medicine, beginning in the '30s, invented a huge
    number of physiotherapeutic medical procedures like electrophoresis,
    exposing blood to UV light, ultrasound warming, etc. Magnets are an
    accessible and inexpensive remnant of that glorious epoch! (:-))


    Nah - modern magnetic bracelets are based on the logic that your blood contains iron, and iron is magnetic, so magnetic bracelets are good for
    your blood flow.

    You think that other physiotherapeutic methods are better justified? 80%
    of medicine is trial-and-error stuff without knowing anything about the
    underlying processes. In many cases it is not possible to do
    statistically meaningful experiments, e.g. in the case of magnets. You
    simply do not know what to look for, or what the test group should be,
    if you wanted to check for any effects.

    Yes, an educated guess would be "no effect", but this is only a guess.
    So do not judge people too harshly.

    It's that simple - and that stupid.

    Exposing blood to UV light is supposed to boost the immune system; how
    is that any more clever/stupid than exposing blood to a magnetic field?

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to David Brown on Sun Oct 10 00:03:54 2021
    On 09/10/2021 11:25, David Brown wrote:
    On 08/10/2021 18:33, Andy Walker wrote:
    [...]
        Indeed.  But you also get it when the tools you [rightly]
    describe aren't used;  IOW when slapdash programmers are let loose
    on important projects.
    This is the key point - program quality and bugs are a people problem,
    more than a language or implementation problem. [...]

    True. But ...

    I don't know the answer to this one. But I think "add automatic safety checks to the language" is not it - that kind of thing reduces the
    efficiency of the code and makes life harder for the good programmers,
    but will not really make a significant difference for the poor
    programmers.

    ... I was actually advocating /hardware/ checks. In this
    case, there is no code needed or added to carry out the checks, so
    it doesn't reduce efficiency. Instead, you would have to add code
    to /evade/ the checks!

    [...]
    What we need is a linter for the programmers, not the programs.

    Now there is another good idea!

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Peerson

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Sat Oct 9 23:44:25 2021
    On 09/10/2021 00:52, Bart wrote:
    So, all those checks are really largely a waste of time.
         Yes.  In the same way that having seat belts in my car is
    "largely" a waste of time.  In fact, so far it has been a complete
    and utter waste of time.  Likewise, insuring my house;  and many
    other precautionary activities.  In fact, no-one has ever broken
    into my computer or stolen my credit-card details [AFAIK!], so it
    is a waste of time having passwords and PINs.  Or perhaps not.
    Not a good analogy. The probabilities are very different, with a huge
    number of environmental factors that differ on each journey,
    or on each day your house/cards are at risk.
    Running the same code each time should give the expected results
    provided consideration has been given to kinds of inputs expected. If
    the inputs are likely to be wild, then use the explicit checking I
    show below.

    As so often, you are telling us what /you/ could, might or
    do do in your own code. That tells us nothing about what checking
    is done by [eg] Firefox or Gcc or ...; and life is too short for
    each of us to go scrabbling to find and examine the source code for
    all the software on our computers.

    I've forgotten the details, but ISTR an experiment where
    "sliced bread" was fed to a wide variety of standard software, and
    most [70%?] fell over. That's not Good.

         Much malware derives from exploiting things that ought to
    be unexploitable.
    If the idea is to protect against malicious attacks, then detecting
    that a number is bigger than approx 9000000000000000000 is not going
    to help much, if the number is expected to be no bigger than 90!
    So, then what?

    Then at least the huge number is not going to cause any
    damage. But integer overflow is not the primary vector of malware;
    more usual is a buffer overflow, or a use after free, or exploiting
    a race condition. Those, in particular, /could/ be substantially
    mitigated by suitable hardware support, even if programmers forgot
    to check, and even if compilers didn't bother to add checks.

    The language can help by providing some explicit way of doing it, for
    example:
      (c, overflow) := checkedadd(a, b)
         What normal person is going to write that?  After all,
    /my/ code is bug-free, so that's a waste of time and space as
    well as unnecessary typing.  [Ha!]
    You /don't/ write that in normal code. Only when inputs are unknown.

    But it is usual for inputs to be unknown. That's why they
    are inputs! Again, /you/ may write careful checks when "inputs
    are unknown", but you can't make the authors of mail agents, word
    processors, search engines, ... do the same. Hardware support
    /could/, in many cases, enforce, or at least encourage, them, and
    would be /much/ cheaper [in terms of efficiency and reliability]
    than language support.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Peerson

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Sun Oct 10 13:02:50 2021
    On 09/10/2021 19:08, Dmitry A. Kazakov wrote:
    On 2021-10-09 15:21, David Brown wrote:
    On 09/10/2021 13:18, Dmitry A. Kazakov wrote:
    On 2021-10-09 12:43, David Brown wrote:

    You might think that.  Have a look at the world around you, and see the
    number of people who think magnetic bracelets improve their blood
    circulation, or organic food is healthier, or that their favourite state
    leader or candidate really cares about them, or that you can't get
    pointer or memory errors if you switch to Rust.

    OT

    Central and East European medicine, beginning in the '30s, invented a huge
    number of physiotherapeutic medical procedures like electrophoresis,
    exposing blood to UV light, ultrasound warming, etc. Magnets are an
    accessible and inexpensive remnant of that glorious epoch! (:-))


    Nah - modern magnetic bracelets are based on the logic that your blood
    contains iron, and iron is magnetic, so magnetic bracelets are good for
    your blood flow.

    You think that other physiotherapeutic methods are better justified?

    Usually not, no.

    80%
    of medicine is trial-and-error stuff without knowing anything about the
    underlying processes. In many cases it is not possible to do
    statistically meaningful experiments, e.g. in the case of magnets. You
    simply do not know what to look for, or what the test group should be,
    if you wanted to check for any effects.

    Yes, an educated guess would be "no effect", but this is only a guess.
    So do not judge people too harshly.


    You don't need to guess. You can measure, you can do research, you can
    do statistics, you can do experiments. Sometimes stuff is hard - there
    are /many/ interactions at play in the health of a person. And while double-blind tests can be done in some cases, other times they are
    impossible, impractical or unethical. Very rarely, however, is anything
    a complete guess once researchers have looked into it (though some
    treatments might start out as guesses or trial-and-error).

    We know magnetic bracelets do nothing, beyond the psychological effects
    (which /are/ real). The same applies to homeopathy, crystal healing,
    virtually all Chinese "medicine" (the bits that work are known as /real/ medicine), and so on.

    (We didn't always know this - science has come a long way both in terms
    of current accumulated knowledge, and in the development of scientific
    methods. And of course it is not complete in any sense - there's lots
    still to discover.)

    It's that simple - and that stupid.

    Exposing blood to UV light is supposed to boost the immune system; how
    is that any more clever/stupid than exposing blood to a magnetic field?


    It isn't. It is equally useless.

    Here's a clue - anything that claims to "boost your immune system",
    doesn't work. /Nothing/ will boost it if it is working properly, though
    there are plenty of ways to /weaken/ your immune system (nutritional deficiencies, stress, cold, infections, etc.). And it's a good thing
    that no "immune system booster" works, because you don't want it boosted
    - that would be a guarantee of anaphylactic shock, autoimmune diseases, allergies, and many other unpleasantnesses.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Sun Oct 10 11:36:24 2021
    On 09/10/2021 23:44, Andy Walker wrote:
    On 09/10/2021 00:52, Bart wrote:
    So, all those checks are really largely a waste of time.
         Yes.  In the same way that having seat belts in my car is
    "largely" a waste of time.  In fact, so far it has been a complete
    and utter waste of time.  Likewise, insuring my house;  and many
    other precautionary activities.  In fact, no-one has ever broken
    into my computer or stolen my credit-card details [AFAIK!], so it
    is a waste of time having passwords and PINs.  Or perhaps not.
    Not a good analogy. The probabilities are very different, with a huge
    number of environmental factors that differ on each journey,
    or on each day your house/cards are at risk.
    Running the same code each time should give the expected results
    provided consideration has been given to kinds of inputs expected. If
    the inputs are likely to be wild, then use the explicit checking I
    show below.

        As so often, you are telling us what /you/ could, might or
    do do in your own code.  That tells us nothing about what checking
    is done by [eg] Firefox or Gcc or ...;  and life is too short for
    each of us to go scrabbling to find and examine the source code for
    all the software on our computers.

    The newsgroup is about languages (and language design I guess). Who
    knows what languages are used in those big applications.

    Those big apps have endless problems which IMV are caused by being so
    large, complex and multi-layered.

    (Eg. both Firefox and Opera get so clogged up after a while that I have
    to shut them down - which can take a while too - and restart them.
    Firefox had a wonderful habit where if it encountered a site that caused
    it to go wrong or to hang or take 100% cpu, requiring a restart, then it
    would reload those exact same web pages!

    Thunderbird that I use is another example. It's behaving itself today,
    but sometimes there is several seconds' lag between typing anything, and
    it appearing on the screen. That's on top of some weird and
    unpredictable editing bugs.)

    Writing simple software or keeping it small enough to keep on top of is
    another subject.

    You /don't/ write that in normal code. Only when inputs are unknown.

        But it is usual for inputs to be unknown.  That's why they
    are inputs!  Again, /you/ may write careful checks when "inputs
    are unknown", but you can't make the authors of mail agents, word
    processors, search engines, ... do the same.  Hardware support
    /could/, in many cases, enforce, or at least encourage, them, and
    would be /much/ cheaper [in terms of efficiency and reliability]
    than language support.

    You can get unknown numeric values from free-format text input. But they
    can also come from fixed-width text formats, or from binary formats,
    which cannot be outside the expected range.

    The first problem with unknown values is to see if they are out of range
    of the type that will represent them, which is not too hard, but it is something that needs to be done in software.

    You really don't want the loop that turns "12345" into 12345 to cause a hardware interrupt or to require the language to create some exception
    when doing a*10+c. (Would you need language assistance also when
    encountering "123?5"? This is just lexing.)

    It's once those values have been loaded that the problems start. They
    might be within range of the type, but might not be valid for the application (eg. they might need to be even numbers). And then there are
    the calculations that might be done.

    The results might be well within range of the type again, but wrong for
    this application.

    So you see, dealing with values and results out of range of the type, especially using 64-bits, is just the tiny tip of the iceberg.

    If you do the rest of the job properly, then checking those machine
    limits becomes less important.

    Or to put it another way, I don't want to invest too much effort and
    sacrifice too much efficiency for that tiny benefit.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Sun Oct 10 14:18:13 2021
    On 2021-10-10 13:02, David Brown wrote:
    On 09/10/2021 19:08, Dmitry A. Kazakov wrote:

    Yes, an educated guess would be, no effect, but this is only a guess.
    So, do not judge people too harsh.

    You don't need to guess. You can measure, you can do research, you can
    do statistics, you can do experiments.

    Well, the point is that you cannot do any of these in a scientifically meaningful way for such a broad category of effects and such diverse samples.

    Here's a clue - anything that claims to "boost your immune system",
    doesn't work. /Nothing/ will boost it if it is working properly, though there are plenty of ways to /weaken/ your immune system (nutritional deficiencies, stress, cold, infections, etc.).

    One certainly can influence the immune system, e.g. by using immunosuppressants and vaccinations.

    And it's a good thing
    that no "immune system booster" works, because you don't want it boosted
    - that would be a guarantee of anaphylactic shock, autoimmune diseases, allergies, and many other unpleasantnesses.

    No, these are malfunctions of the immune system. "Boosting" means faster
    and stronger immune response without false positives. Wear larger,
    green, CO2 neutral magnets! (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Sun Oct 10 15:10:40 2021
    On 09/10/2021 11:30, David Brown wrote:
    On 08/10/2021 17:52, James Harris wrote:
    On 07/10/2021 12:32, David Brown wrote:
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

           abs(x) = ⎧ x, if x >= 0
                    ⎨
                    ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative
    number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code.  If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers).  It is, however, still
    maths.  Two possibilities for n-bit two's complement signed integers
    could be :

            abs(x) = ⎧ x, if x >= 0
                     ⎨ -x, if x < 0 and x > int_min
                     ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria?

    Those above: 2's complement, etc.


    Again - you are mixing implementation and specification.

    On the contrary, part of the specification of the problem, at least as I
    had it in mind, was to define an abs which a computer might implement. Therefore the representation was an essential part - even if only as a
    model. Someone could pick a representation and define how abs would work
    on that representation - which is what you seemed to have done, above.

    The point being that the /mathematical/ |x| is not what most computers implement. Because most computers use 2's complement representation they
    cannot do so.

    But we've discussed this somewhat to death. If you still believe
    computers implement mathematics consider defining

    sin(x)

    as a computer might implement it. As you know, the result of that
    function is necessarily not the sine of x but an approximation thereof.
    And, yes, you could create a mathematical specification for what a
    computer would do in all kinds of situation but it would still not be
    the mathematical sine operation.

    As I've said more than once, I don't think you and I agree on the
    essentials. This is only about what one defined as 'mathematics' as
    opposed, I would argue, to 'engineering'. As with the integer cases I
    would call accounting for the sin(x) result's accuracy /engineering/
    just as I would a loss of info at the top end; they are both due to
    limitations imposed by the representation.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Sun Oct 10 17:01:23 2021
    On 10/10/2021 16:10, James Harris wrote:
    On 09/10/2021 11:30, David Brown wrote:
    On 08/10/2021 17:52, James Harris wrote:
    On 07/10/2021 12:32, David Brown wrote:
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

            abs(x) = ⎧ x, if x >= 0
                     ⎨
                     ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative
    number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code.  If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers).  It is, however, still
    maths.  Two possibilities for n-bit two's complement signed integers
    could be :

             abs(x) = ⎧ x, if x >= 0
                      ⎨ -x, if x < 0 and x > int_min
                      ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria?

    Those above: 2's complement, etc.


    Again - you are mixing implementation and specification.

    On the contrary, part of the specification of the problem, at least as I
    had it in mind, was to define an abs which a computer might implement.

    The point of having a mathematical specification is that there is no
    such thing as what you had "in mind" - it is /exact/. You don't get to
    come back and say it wasn't what you were thinking about, or that you
    assumed something was obvious. So if your idea of "abs" includes
    "abs(int_min) = int_min", that has to be in the definition.

    And note that computers /do/ implement abs functions where
    "abs(int_min)" is not defined. That is the standard definition and implementation for C. The representation does not come into it.

    Therefore the representation was an essential part - even if only as a
    model. Someone could pick a representation and define how abs would work
    on that representation - which is what you seemed to have done, above.

    No.

    It can make sense to pick a definition that works well with the
    implementations you have in mind. But it doesn't have to be that way.
    All definitions of "abs" I gave will work for any representation -
    whether it be two's complement, signed magnitude, BCD, or anything else.
    The two definitions I gave for fixed-size integers are geared towards
    two's complement - they'd be inefficient or unexpected with other representations - but they are not dependent on the representation.


    The point being that the /mathematical/ |x| is not what most computers implement. Because most computers use 2's complement representation they cannot do so.

    Two's complement has nothing to do with it.  The finite size of
    efficient computer integers means that the mathematics of computer
    integers differs from that of the standard mathematical set of
    integers.  It does not mean the definitions are not mathematical, and it
    does not depend on the representation.  (Though again, it makes sense to
    pick models and mathematical definitions that will work well in practice.)


    But we've discussed this somewhat to death. If you still believe
    computers implement mathematics consider defining

      sin(x)

    as a computer might implement it. As you know, the result of that
    function is necessarily not the sine of x but an approximation thereof.
    And, yes, you could create a mathematical specification for what a
    computer would do in all kinds of situation but it would still not be
    the mathematical sine operation.


    It would still be a mathematical definition. I can't see what is
    difficult to understand about this.

    As I've said more than once, I don't think you and I agree on the
    essentials. This is only about what one defined as 'mathematics' as
    opposed, I would argue, to 'engineering'. As with the integer cases I
    would call accounting for the sin(x) result's accuracy /engineering/
    just as I would a loss of info at the top end; they are both due to limitations imposed by the representation.


    Engineering, you could say, involves picking appropriate mathematical
    models and definitions that you can use in practice on real computers.
    It doesn't mean they are not mathematical - it means you are being
    practical about the mathematics you use.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Dmitry A. Kazakov on Sun Oct 10 17:13:47 2021
    On 10/10/2021 14:18, Dmitry A. Kazakov wrote:
    On 2021-10-10 13:02, David Brown wrote:
    On 09/10/2021 19:08, Dmitry A. Kazakov wrote:

    Yes, an educated guess would be "no effect", but this is only a guess.
    So do not judge people too harshly.

    You don't need to guess.  You can measure, you can do research, you can
    do statistics, you can do experiments.

    Well, the point is that you cannot do any of these in a scientifically meaningful way for such a broad category of effects and such diverse samples.

    Here's a clue - anything that claims to "boost your immune system",
    doesn't work.  /Nothing/ will boost it if it is working properly, though
    there are plenty of ways to /weaken/ your immune system (nutritional
    deficiencies, stress, cold, infections, etc.).

    One certainly can influence the immune system, e.g. by using immunosuppressants and vaccinations.

    Immunosuppressants weaken the immune system - they are just another
    method that could have gone in my list.

    Vaccinations do not boost your immune system. They /stimulate/ it - a
    totally different thing. The aim is to "teach" your adaptive immune
    response about a pathogen so that it can react faster - not stronger -
    to the specific target.


    And it's a good thing
    that no "immune system booster" works, because you don't want it boosted
    - that would be a guarantee of anaphylactic shock, autoimmune diseases,
    allergies, and many other unpleasantnesses.

    No, these are malfunctions of the immune system. "Boosting" means faster
    and stronger immune response without false positives. Wear larger,
    green, CO2 neutral magnets! (:-))


    A "stronger immune response" does not mean a stronger immune system. It
    means your /normal/ strength immune is responding in a targeted manner
    that is successful against the pathogen in question.

    If your immune system were to be "boosted" so that it gave a more
    aggressive response, it would always be a bad thing (compared to a
    normal, regulated and working system).

    Your innate immune system works by chemical warfare on invading
    pathogens (or the "enemy within", cancer). If it is "boosted" beyond
    the normal reaction, then the side-effects and fallout are unavoidably a problem. When the various white blood cells release their chemical
    weapons (such as oxidants), if they fire off more than necessary then
    they will not do more damage to the pathogens - they will damage
    surrounding tissue.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Sun Oct 10 17:28:10 2021
    On 2021-10-10 16:10, James Harris wrote:
    On 09/10/2021 11:30, David Brown wrote:

    But we've discussed this somewhat to death. If you still believe
    computers implement mathematics consider defining

      sin(x)

    as a computer might implement it. As you know, the result of that
    function is necessarily not the sine of x but an approximation thereof.
    And, yes, you could create a mathematical specification for what a
    computer would do in all kinds of situation but it would still not be
    the mathematical sine operation.

    Nobody does or needs that. The requirement on a machine implementation
    of sine simply states that the result machine value y is such that at
    least one of the two adjacent intervals [y'Pred, y] and [y, y'Succ]
    contains sin(x). Here y'Pred denotes the previous machine number and
    y'Succ the next one.

    The algorithms used to implement such requirements are 100% mathematics.
    Just search the Internet; you will find hundreds of papers on the topic.
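
    A sketch in C of testing that requirement, stepping to the adjacent
    machine numbers with nextafter() (long double sinl stands in here,
    as an assumption, for a higher-precision reference):

      #include <math.h>
      #include <stdio.h>

      int main(void)
      {
          double x  = 1.0;
          double y  = sin(x);
          double lo = nextafter(y, -INFINITY);   /* y'Pred */
          double hi = nextafter(y, +INFINITY);   /* y'Succ */
          long double ref = sinl((long double)x);
          /* y passes if sin(x) lies in [y'Pred, y] or [y, y'Succ] */
          printf("within one step: %d\n", ref >= lo && ref <= hi);
          return 0;
      }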

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Sun Oct 10 17:18:11 2021
    On 10/10/2021 16:01, David Brown wrote:
    On 10/10/2021 16:10, James Harris wrote:
    On 09/10/2021 11:30, David Brown wrote:
    On 08/10/2021 17:52, James Harris wrote:
    On 07/10/2021 12:32, David Brown wrote:
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

            abs(x) = ⎧ x, if x >= 0
                     ⎨
                     ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in common
    mathematics using real numbers - they cannot be implemented directly in
    computer code.  If you want a mathematical definition of "abs" for fixed
    size integer types in a programming language, you must adapt it to a
    different mathematical definition that is suitable for the domain you
    are using (i.e., the input and output sets are the range of your
    computer type, rather than the real numbers).  It is, however, still
    maths.  Two possibilities for n-bit two's complement signed integers
    could be :

             abs(x) = ⎧ x, if x >= 0
                      ⎨ -x, if x < 0 and x > int_min
                      ⎩ int_min, if x = int_min

    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria?

    Those above: 2's complement, etc.


    Again - you are mixing implementation and specification.

    On the contrary, part of the specification of the problem, at least as I
    had it in mind, was to define an abs which a computer might implement.

    The point of having a mathematical specification is that there is no
    such thing as what you had "in mind" - it is /exact/.

    Well, at the start of this subthread I said that your "definition of
    abs(x) would fail on a computer if it were using 2's complement
    representation and x was the most negative number." It's in the text,
    above. I would have thought that was clear enough that I was talking
    about 2's complement, at least.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to James Harris on Sun Oct 10 18:19:08 2021
    On 10/10/2021 15:10, James Harris wrote [to DB]:
    But we've discussed this somewhat to death. If you still believe
    computers implement mathematics consider defining
    sin(x)
    as a computer might implement it. As you know, the result of that
    function is necessarily not the sine of x but an approximation
    thereof. And, yes, you could create a mathematical specification for
    what a computer would do in all kinds of situation but it would still
    not be the mathematical sine operation.

    Further to Dmitry's reply -- As I have pointed out more than
    once [recently in this thread], this whole topic is something that
    every mathematician and engineer ought to understand, viz [part of]
    numerical analysis. The ancient Greeks knew that; Newton was a
    major contributor, and so were most of the other best mathematicians
    of the 17th-19thC. Sadly, NA is, IME, rarely taught to programmers
    these days, as it is thought to be too mathematical. But that's the
    point -- it /is/ mathematics. Bart may not have studied it, you may
    not, but shedloads of people did, esp in the early days of computing;
    eg for one of the major early applications of computing, namely the
    preparation of [error free!] mathematical tables.

    Further, whereas you might be slightly lucky if

    real pi = 4 * arctan (1);
    print (sin(pi/3)^2)

    printed an exact representation of 3/4 rather than [say] 0.7499...9,
    you would be entitled to your money back if a symbolic algebra
    package gave anything other than 3/4 for the same expression.
    Modern symbolic packages know an amazing amount of maths, more
    than all but a handful of mathematicians in terms of being able
    to do calculus, algebra and numerical work generally.

    As I've said more than once, I don't think you and I agree on the
    essentials. This is only about what one defined as 'mathematics' as
    opposed, I would argue, to 'engineering'.

    Nothing wrong with engineering or being an engineer, but
    I suspect that Newton, Euler, Lagrange, Gauss and others would be
    slightly surprised if you told them you considered them to be
    engineers for doing NA [as opposed to for designing telescopes
    and railways, inter alia].

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Ravel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Sun Oct 10 20:20:19 2021
    On 10/10/2021 18:19, Andy Walker wrote:
    On 10/10/2021 15:10, James Harris wrote [to DB]:
    But we've discussed this somewhat to death. If you still believe
    computers implement mathematics consider defining
       sin(x)
    as a computer might implement it. As you know, the result of that
    function is necessarily not the sine of x but an approximation
    thereof. And, yes, you could create a mathematical specification for
    what a computer would do in all kinds of situation but it would still
    not be the mathematical sine operation.

        Further to Dmitry's reply -- As I have pointed out more than
    once [recently in this thread], this whole topic is something that
    every mathematician and engineer ought to understand, viz [part of]
    numerical analysis.  The ancient Greeks knew that;  Newton was a
    major contributor, and so were most of the other best mathematicians
    of the 17th-19thC.  Sadly, NA is, IME, rarely taught to programmers
    these days, as it is thought to be too mathematical.  But that's the
    point -- it /is/ mathematics.  Bart may not have studied it,

    It didn't really stop me implementing functions like sin and atan.

    you may
    not, but shedloads of people did, esp in the early days of computing;
    eg for one of the major early applications of computing, namely the preparation of [error free!] mathematical tables.

        Further, whereas you might be slightly lucky if

      real pi = 4 * arctan (1);
      print (sin(pi/3)^2)

    printed an exact representation of 3/4 rather than [say] 0.7499...9,

    I had to try it:

    C:\qapps>type t.q
    println sqr sin(pi/3)

    C:\qapps>qq t
    0.750000

    How about that? Of course, it helps that it rounds to 6 decimals. I only
    get 0.749... at 16 decimal places or more.
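
    The same experiment in C behaves much the same way - a quick sketch:

      #include <math.h>
      #include <stdio.h>

      int main(void)
      {
          double pi = 4 * atan(1.0);
          double s  = sin(pi/3);
          printf("%.6f\n", s*s);     /* 0.750000 after rounding */
          printf("%.17f\n", s*s);    /* typically exposes the 0.7499... tail */
          return 0;
      }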

    you would be entitled to your money back if a symbolic algebra
    package gave anything other than 3/4 for the same expression.
    Modern symbolic packages know an amazing amount of maths, more
    than all but a handful of mathematicians in terms of being able
    to do calculus, algebra and numerical work generally.

    You said why: they're symbolic. They will keep an expression as an
    expression as much as possible, only applying reductions as needed, so
    that 'sqrt(a)**2' is just a; it won't evaluate sqrt(a) first.

    Presumably sin(pi/3) involves some sort of square root (sin 60° is
    0.866..., which is sqrt(3)/2), so it all cancels out.

    It's conceivable here that an ordinary compiler can do this kind of
    simple analysis. But probably not worthwhile, if 99% of arguments to
    sin() will be variables not constants.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to James Harris on Mon Oct 11 13:24:17 2021
    On 10/10/2021 18:18, James Harris wrote:
    On 10/10/2021 16:01, David Brown wrote:
    On 10/10/2021 16:10, James Harris wrote:
    On 09/10/2021 11:30, David Brown wrote:
    On 08/10/2021 17:52, James Harris wrote:
    On 07/10/2021 12:32, David Brown wrote:
    On 06/10/2021 17:26, James Harris wrote:
    On 06/10/2021 07:20, David Brown wrote:
    On 05/10/2021 21:13, James Harris wrote:
    On 05/10/2021 18:35, David Brown wrote:

    ...

             abs(x) = ⎧ x, if x >= 0
                      ⎨
                      ⎩ -x, if x < 0

    ...

    But something else stood out particularly in the context of this
    subthread. The definition of abs(x) would fail on a computer if it were
    using 2's complement representation and x was the most negative number.
    It's a classic case in point where the computer won't follow the
    accepted mathematical definition.


    These were examples of function definitions with conditionals in >>>>>>>> common
    mathematics using real numbers - they cannot be implemented
    directly in
    computer code.  If you want a mathematical definition of "abs" for >>>>>>>> fixed
    size integer types in a programming language, you must adapt it >>>>>>>> to a
    different mathematical definition that is suitable for the
    domain you
    are using (i.e., the input and output sets are the range of your >>>>>>>> computer type, rather than the real numbers).  It is, however, >>>>>>>> still
    maths.  Two possibilities for n-bit two's complement signed
    integers
    could be :

              abs(x) = ⎧ x, if x >= 0
                       ⎨ -x, if x < 0 and x > int_min >>>>>>>>                    ⎩ int_min, if x = int_min >>>>>>>
    Yes. I would consider that a valid and correct definition given the
    criteria. It describes what a programmer can expect from a computer's
    abs function (again, given the criteria).

    What criteria?

    Those above: 2's complement, etc.


    Again - you are mixing implementation and specification.

    On the contrary, part of the specification of the problem, at least as I
    had it in mind, was to define an abs which a computer might implement.

    The point of having a mathematical specification is that there is no
    such thing as what you had "in mind" - it is /exact/.

    Well, at the start of this subthread I said that your "definition of
    abs(x) would fail on a computer if it were using 2's complement representation and x was the most negative number." It's in the text,
    above. I would have thought that was clear enough that I was talking
    about 2's complement, at least.


    Yes, I realise that - so I gave two definitions that were designed to
    work with 2's complement numbers of fixed sizes. They handled int_min
    in different but equally reasonable manners. The bit that was just in
    your head, and not specified, was how you wanted to deal with int_min -
    I believe you assumed that for two's complement, abs(int_min) should be int_min, and that is not a correct assumption.
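
    A minimal C sketch of the definition quoted above (the function name
    and comments are mine; the post gives only the mathematical form):

      #include <limits.h>

      /* Total over the whole input range: returns INT_MIN for
         abs_total(INT_MIN) instead of overflowing.  Plain -x would be
         undefined behaviour for INT_MIN in C. */
      int abs_total(int x)
      {
          if (x >= 0)      return x;
          if (x > INT_MIN) return -x;
          return INT_MIN;  /* the one value with no positive counterpart */
      }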

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Tue Oct 12 00:06:19 2021
    On 10/10/2021 11:36, Bart wrote:
    The newsgroup is about languages (and language design I guess). Who
    knows what languages are used in those big applications.

    In the first place, anyone can look at [and indeed help with]
    the applications thus far mentioned; Firefox, Thunderbird and Gcc
    and others are open source or close relative thereto. Secondly, my
    main point was that with hardware support many of the commonest bugs
    could be detected much earlier, and in particular before they could
    be used by malware.

    Those big apps have endless problems which IMV are caused by being so
    large, complex and multi-layered.

    Indeed. But that's what happens when you let everyone pile
    in with their own ideas.

    (Eg. both Firefox and Opera [...])
    Writing simple software or keeping it small enough to keep on top of
    is another subject.

    Yes. But every user and his dog wants more features.

    [...]
    You really don't want the loop that turns "12345" into 12345 to cause
    a hardware interrupt or to require the language to create some
    exception when doing a*10+c.

    ??? I would if "a*10+c" exceeded "maxint" [assuming "int"
    to be the relevant type]. The arithmetic would overflow, which
    /should/ normally spring an overflow trap. It's up to you whether
    you program a preliminary check to prevent such overflows or don't
    bother and let the trap either terminate your program or divert to
    a user-defined trap routine.

    (Would you need language assistance also
    when encountering "123?5"? This is just lexing.)

    The point about /hardware/ support is that it's /not/
    language assistance, and requires no action on your part as a
    programmer or as a compiler writer. If you choose to take no
    action, then the program will simply terminate, presumably with
    some pre-defined message, if the overflow occurs. That's the
    way it used to be in the '60s!

    It's once those values have been loaded that the problems start. They
    might be within range of the type, but might not be valid for the application (eg. they might need to be even numbers). And then there
    are the calculations that might be done.

    Yes, there are lots of possible errors, but /some/ of
    them can be detected automatically, and doing so could eliminate
    /many/ of the actual ways in which malware infects our computers.

    [...]
    If you do the rest of the job properly, then checking those machine
    limits becomes less important.

    Machine limits are only a tiny part of debugging, and
    especially of fire-proofing major applications. Things like
    buffer overflow, use of storage after freeing it, writing
    outside an array, using uninitialised storage, dereferencing
    null pointers, ... are not to do with machine limits, but
    could all be detected by suitable hardware, esp [but not only]
    in symbiosis with compilers.

    Or to put it another way, I don't want to invest too much effort and sacrifice too much efficiency for that tiny benefit.

    Sadly, that attitude is all too common. Which is why
    doing as much as possible in hardware saves all of us from
    sacrificing /any/ efficiency.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Boccherini

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Tue Oct 12 00:37:34 2021
    On 10/10/2021 20:20, Bart wrote:
    [I wrote:]
         Further, whereas you might be slightly lucky if
       real pi = 4 * arctan (1);
       print (sin(pi/3)^2)
    printed an exact representation of 3/4 rather than [say] 0.7499...9,
    I had to try it:
      C:\qapps>type t.q
      println sqr sin(pi/3)
      C:\qapps>qq t
      0.750000
    How about that? Of course, it helps that it rounds to 6 decimals. I
    only get 0.749... at 16 decimal places or more.

    Well, yes; it would be a surprise if such a simple
    calculation wasn't correct to 6dp on your computer! On mine:

    $ a68g -p "sin(4*arctan(1)/3)^2 - 3/4"
    -1.11022302462516e -16

    But James's claim was that

    " If you still believe computers implement mathematics
    " consider defining
    " sin(x)
    " as a computer might implement it. As you know, the
    " result of that function is necessarily not the sine
    " of x but an approximation thereof. "

    and even the simplest of symbolic algebra packages should be
    well capable of handling "pi", and the sine of angles closely
    related to it, /exactly/, in the same way that a [proficient]
    human would, and not "merely" to 15dp [or 60dp or 10000dp].

    you would be entitled to your money back if a symbolic algebra
    package gave anything other than 3/4 for the same expression.
    Modern symbolic packages know an amazing amount of maths, more
    than all but a handful of mathematicians in terms of being able
    to do calculus, algebra and numerical work generally.
    You said why; they're symbolic.

    I didn't ask why. The point is that computers can be
    programmed to do mathematics exactly, not "necessarily ... an
    approximation", even when dealing with R or C, the "real" or
    "complex" numbers. We /choose/ to use floating point numbers
    instead for many/most purposes; that choice was controversial
    in the 1940s through to around 1960, which is largely forgotten
    these days. Similarly, we /choose/ to use binary f-p numbers,
    and other number formats have been largely forgotten, though
    some of them have interesting properties and uses.

    They will keep an expression as an
    expression as much as possible, only applying reductions as needed,
    so that 'sqrt(a)**2' is just a, it won't evaluate sqrt(a) first.

    Yes, that's the point of the package!

    Presumably sin(pi/3) involves some sort of square root (sin 60° is
    0.866... which is sqrt(3)/2), so it all cancels out.

    Yes; 180 degrees is pi radians.

    It's conceivable here that an ordinary compiler can do this kind of
    simple analysis.

    I wouldn't want "an ordinary compiler" to do that! It
    shouldn't be treating library functions differently from user-
    defined functions, eg in case of problems while debugging.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bucalossi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Tue Oct 12 17:41:36 2021
    On 12/10/2021 00:06, Andy Walker wrote:
    On 10/10/2021 11:36, Bart wrote:
    The newsgroup is about languages (and language design I guess). Who
    knows what languages are used in those big applications.

        In the first place, anyone can look at [and indeed help with]
    the applications thus far mentioned;  Firefox, Thunderbird and Gcc
    and others are open source or close relative thereto.

    They are so big and complex that they might as well be closed source.

    You really don't want the loop that turns "12345" into 12345 to cause
    a hardware interrupt or to require the language to create some
    exception when doing a*10+c.

        ???  I would if "a*10+c" exceeded "maxint" [assuming "int"
    to be the relevant type].  The arithmetic would overflow, which
    /should/ normally spring an overflow trap.  It's up to you whether
    you program a preliminary check to prevent such overflows or don't
    bother and let the trap either terminate your program or divert to
    a user-defined trap routine.

    No need. For turning text into binary, the necessary checks can be done
    on the string.
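
    A minimal sketch of that in C (the function name is mine): the range
    check is done against the digits before each step, so a*10+c can never
    overflow and no trap is ever reached.

      #include <limits.h>
      #include <stdbool.h>

      /* Parse a non-negative decimal integer with the overflow check
         done up front.  Returns false on a non-digit (the "123?5" case)
         or on a value that will not fit in an int. */
      bool parse_int(const char *s, int *out)
      {
          int a = 0;
          if (*s == '\0') return false;
          for (; *s != '\0'; s++) {
              int c = *s - '0';
              if (c < 0 || c > 9) return false;
              if (a > (INT_MAX - c) / 10) return false;  /* would overflow */
              a = a*10 + c;
          }
          *out = a;
          return true;
      }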


    (Would you need language assistance also when encountering "123?5"?
    This is just lexing.)

        The point about /hardware/ support is that it's /not/
    language assistance, and requires no action on your part as a
    programmer or as a compiler writer.  If you choose to take no
    action, then the program will simply terminate, presumably with
    some pre-defined message, if the overflow occurs.  That's the
    way it used to be in the '60s!

    What you seem to want in hardware support is to generate interrupts,
    which is a really heavy-duty way to deal with such matters, and quite
    difficult to use effectively from a language (eg. I've got no idea how
    to trap them from mine).

    And unless there's some way of selectively disabling it, it means
    programs are at risk of crashing, and losing customers' work and data, for something that might be harmless.

    On x86, the only thing that works like that for arithmetic ops is divide-by-zero, which I also think is over-the-top.

    (That I very rarely see it, despite not guarding every div op in the
    program, suggests it is a non-problem.)



        Machine limits are only a tiny part of debugging, and
    especially of fire-proofing major applications.  Things like
    buffer overflow, use of storage after freeing it, writing
    outside an array, using uninitialised storage, dereferencing
    null pointers, ... are not to do with machine limits, but
    could all be detected by suitable hardware, esp [but not only]
    in symbiosis with compilers.

    Or to put it another way, I don't want to invest too much effort and
    sacrifice too much efficiency for that tiny benefit.

        Sadly, that attitude is all too common.  Which is why
    doing as much as possible in hardware saves all of us from
    sacrificing /any/ efficiency.

    But you've just said that the hardware can only do a small part!

    Take array bounds: your runtime requests a pool of virtual memory
    from the OS. Part of that is used to allocate the array.

    Access outside the array bounds and it will still be inside the
    OS-allocated block, so the hardware check is no use here. Not unless
    you're well outside the array.

    There are a million things the hardware will not know about and can't check.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Wed Oct 13 00:19:33 2021
    On 12/10/2021 17:41, Bart wrote:
    The newsgroup is about languages (and language design I guess). Who
    knows what languages are used in those big applications.
         In the first place, anyone can look at [and indeed help with]
    the applications thus far mentioned;  Firefox, Thunderbird and Gcc
    and others are open source or close relative thereto.
    They are so big and complex that they might as well be closed source.

    Well, you asked "Who knows what languages are used in those
    big applications". If you don't want actually to look, then GIYF.
    Top answer for [eg] Firefox:
    " Rendering engine:
    " Gecko, C++, and in recent versions Rust language used too
    " JavaScript engine:
    " SpiderMonkey, C
    " UI:
    " Mostly XUL (a custom XML dialect), CSS, and JavaScript, with
    " some C++. "

    You really don't want the loop that turns "12345" into 12345 to cause
    a hardware interrupt or to require the language to create some
    exception when doing a*10+c.
         ???  I would if "a*10+c" exceeded "maxint" [assuming "int"
    to be the relevant type].  The arithmetic would overflow, which
    /should/ normally spring an overflow trap.  It's up to you whether
    you program a preliminary check to prevent such overflows or don't
    bother and let the trap either terminate your program or divert to
    a user-defined trap routine.
    No need. For turning text into binary, the necessary checks can be
    done on the string.

    In which case the trap won't be sprung. It costs you nothing,
    so you don't need even to think about it.
    [...]
    What you seem to want in hardware support is to generate interrupts,
    which is a really heavy-duty way to deal with such matters, and quite difficult to use effectively from a language (eg. I've got no idea
    how to trap them from mine).

    On the contrary, it's as lightweight as it is possible to be.
    You don't have to do anything at all to your program. If nothing
    goes wrong, your program will run exactly as if the relevant trap
    wasn't there. Otherwise, the check you forgot/declined to put in
    will effectively be activated on your behalf, with no action on
    your part.

    And unless there's some way of selectively disabling it, it means
    programs at risk of crashing, and losing customer's work and data,
    for something that might be harmless.

    It /might/ be harmless. How would anyone know? You asked
    the computer to work out, in your example, "a*10+c", and it failed
    to do so. If you don't care what "a*10+c" is, why is that code in
    there in the first place? If you do care, then shouldn't you be
    told that the computer is giving you the wrong answer? Are there
    /any/ plausible and competent programming circumstances in which
    production code [ie other than when programs are under development
    and you actively want to know about any bugs] should be doing
    important calculations where you don't care whether the answer is
    right or not? Do your customers know?

    [...]
         Machine limits are only a tiny part of debugging, and
    especially of fire-proofing major applications.  Things like
    buffer overflow, use of storage after freeing it, writing
    outside an array, using uninitialised storage, dereferencing
    null pointers, ... are not to do with machine limits, but
    could all be detected by suitable hardware, esp [but not only]
    in symbiosis with compilers.
    Or to put it another way, I don't want to invest too much effort and
    sacrifice too much efficiency for that tiny benefit.
         Sadly, that attitude is all too common.  Which is why
    doing as much as possible in hardware saves all of us from
    sacrificing /any/ efficiency.
    But you've just said that the hardware can only do a small part!

    No, I didn't. What I said is still there above.

    Take array bounds: your runtime requests a pool of virtual memory
    from the OS. Part of that is used to allocate the array.
    Access outside the array bounds and it will still be inside the
    OS-allocated block, so the hardware check is no use here. Not unless
    you're well outside the array.

    Wrong sort of hardware check. A more useful one is for a
    request for storage for an array to return a "fat" pointer, ie a
    pointer together with bounds. The hardware then checks whether
    any offset to that pointer stays within bounds. That's what some
    early computers did; and it gives you array bound checks "free".
    If you forget/decline to do the checking yourself, then a trap
    will be sprung if you access out of bounds. If you put in the
    appropriate checks anyway, then an optimising compiler could
    decide whether to leave them in [still no performance hit over
    not having such hardware] or, in suitable cases, to replace
    them by supplying a more appropriate trap action than "crash
    and burn". Either way, no more overflowed buffers.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bucalossi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Bart on Wed Oct 13 12:13:49 2021
    On 13/10/2021 12:01, Bart wrote:

    If people complain that 'print 2**3**4' displays 0 instead of 2417851639229258349412352, then I just play the UB card like C does.
    Users of the language need to take care around those limits.

    BTW, A68G displays 4096 for the result of 2**3**4. Now /that/ is
    unexpected! It uses the wrong precedence for **. With parentheses added,
    it does report an overflow.
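
    (Checking the arithmetic: left-to-right grouping gives (2**3)**4 =
    8**4 = 4096, which is what A68G computes; right-to-left gives
    2**(3**4) = 2**81 = 2417851639229258349412352, which overflows 64
    bits.)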

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Wed Oct 13 12:01:16 2021
    On 13/10/2021 00:19, Andy Walker wrote:
    On 12/10/2021 17:41, Bart wrote:

    No need. For turning text into binary, the necessary checks can be
    done on the string.

        In which case the trap won't be sprung.  It costs you nothing,
    so you don't need even to think about it.

    You do need to think about it. My apps used to consist of a compiled
    main application, with hot-loaded bytecode modules to execute user-commands.

    If a module went wrong (say, the interpreter detected out-of-bounds array accesses), it would then terminate that module and return to normal
    input, requesting the next command (actually, another interpreted module
    with an input loop, this in a GUI app).

    What you don't want is some silly overflow in a module (which may be
    written in this scripting language by a third party) bringing down
    the entire application, and losing the user's work for that day.

    Modern apps aren't just doing one big calculation, like feeding a list
    of cards into Multivac (Asimov's giant computer), and some time later,
    spitting out a card with the answer.

    And unless there's some way of selectively disabling it, it means
    programs at risk of crashing, and losing customer's work and data,
    for something that might be harmless.

        It /might/ be harmless.  How would anyone know?  You asked
    the computer to work out, in your example, "a*10+c", and it failed
    to do so.  If you don't care what "a*10+c" is, why is that code in
    there in the first place?

    So we're back to this. a*10+c isn't some theoretical mathematical term
    which is always going to have some exact numeric result, so long as you
    don't actually have to evaluate it.

    It's doing the calculation within the limitations of, say, the 64-bit
    ALU of a processor which represents integers using two's complement.

    Then how it behaves near the boundaries of those limitations, how much
    it insulates those realities from the user, is a choice involving both
    the language and application.

    (Most people have little control over the former; only over the latter. Unusually I'm in charge of both.)

    My approach is lax, but that's OK because I'm not writing programs to
    send rockets into space (and back).

    There are so many other things of greater impact on my programs.

    If people complain that 'print 2**3**4' displays 0 instead of 2417851639229258349412352, then I just play the UB card like C does.
    Users of the language need to take care around those limits.

    If they really need that result, I would suggest using the bignum
    feature of my other language.

    I understand that if /you/ were to implement a language, it would do things perfectly. It would display 2**3**4**5 exactly (even if it takes a
    while, or exhausts your machine's memory). I tend to cut a few corners.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Wed Oct 13 23:12:59 2021
    On 13/10/2021 12:13, Bart wrote:
    BTW, A68G displays 4096 for the result of 2**3**4. Now /that/ is
    unexpected!

    To you, perhaps. But it's in accord with the Report, and
    with the precedents of Algol 60 and IAL, so it's been "expected"
    for over 60 years. Traditional mathematics has no equivalent, as exponentiation is normally indicated by indexes, not by operators,
    and many languages don't implement exponentiation operators either
    [preferring to use functions instead] so there is no solid general
    prior expectation to guide us.

    It uses the wrong precedence for **.

    If you meant that it associates the same way around as
    other operators, that may not be your expectation but Algol took
    the view that consistency was more important. Operators of equal
    precedence always associate such that "a op b op c" means
    "(a op b) op c". You could make a case for it always to mean
    "a op (b op c)" [tho' then (eg) "a-b-c" would surprise most of
    us], but you would be hard pushed to make a coherent case for
    mixing them. You would have to add new syntax, and it would be
    quite hard to read/express.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Valentine

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Thu Oct 14 00:33:48 2021
    On 13/10/2021 23:12, Andy Walker wrote:
    On 13/10/2021 12:13, Bart wrote:
    BTW, A68G displays 4096 for the result of 2**3**4. Now /that/ is
    unexpected!

        To you, perhaps.  But it's in accord with the Report, and
    with the precedents of Algol 60 and IAL, so it's been "expected"
    for over 60 years.  Traditional mathematics has no equivalent, as exponentiation is normally indicated by indexes, not by operators,
    and many languages don't implement exponentiation operators either [preferring to use functions instead] so there is no solid general
    prior expectation to guide us.

    I think it has. The operators are implicit: 3x² has always been 3*(x**2)
    or 3*(x^2).


              It uses the wrong precedence for **.

        If you meant that it associates the same way around as
    other operators, that may not be your expectation but Algol took
    the view that consistency was more important.  Operators of equal
    precedence always associate such that "a op b op c" means
    "(a op b) op c".  You could make a case for it always to mean
    "a op (b op c)" [tho' then (eg) "a-b-c" would surprise most of
    us], but you would be hard pushed to make a coherent case for
    mixing them.  You would have to add new syntax, and it would be
    quite hard to read/express.

    If I type 2**3**4 or 2^3^4 into Google, it gives me the larger value (so
    using right-to-left precedence) rather than the smaller.

    So that appears to be the expectation these days.

    Also, if I do:

    print *, 2**3**4

    in Fortran (via rextester.com) it appears to parse as 2**(3**4) too.
    Fortran is even older than Algol (I don't know if at some point it did something different; but it's unlikely they would have changed it).

    BTW that Fortran displays 0; another language, and a very famous one,
    that treats overflow lightly.

    With Ada, it says the expression needs parentheses. With those in
    place, it reports a range error, both at compile time and at run time
    (constraint error), and it can warn about it.

    So I guess if you're worried about this stuff, then use Ada. My language
    isn't Ada.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to Andy Walker on Thu Oct 14 09:20:25 2021
    On 2021-10-14 00:12, Andy Walker wrote:

        If you meant that it associates the same way around as
    other operators, that may not be your expectation but Algol took
    the view that consistency was more important.  Operators of equal
    precedence always associate such that "a op b op c" means
    "(a op b) op c".  You could make a case for it always to mean
    "a op (b op c)" [tho' then (eg) "a-b-c" would surprise most of
    us], but you would be hard pushed to make a coherent case for
    mixing them.

    In Ada a**b**c is a syntax error, as is -a**b. You must disambiguate
    such stuff.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to Bart on Thu Oct 14 21:31:18 2021
    On 13/10/2021 12:01, Bart wrote:
    No need. For turning text into binary, the necessary checks can be
    done on the string.
         In which case the trap won't be sprung.  It costs you nothing,
    so you don't need even to think about it.
    You do need to think about it. My apps used to consist of a compiled
    main application, with hot-loaded bytecode modules to execute
    user-commands.
    If a module went wrong [...].
    What you don't want is some silly overflow in a module (which may be
    written in this scripting language by a third party), from bringing
    down the entire application, and losing the user's work for that
    day.

    Repeat -- /if/ you have done the checks, /then/ the trap
    won't be sprung [and it has cost you nothing and it doesn't matter
    what would have happened if the trap had been sprung]. If your
    users care what their work actually does, then an overflow is not
    "silly", it means they are getting the wrong answers. It must be
    a bizarre application if getting wrong answers doesn't matter.
    Otherwise, you either [again] build in the checks, and the trap
    won't be sprung, /or/ you spend a few moments /once/ writing a
    trap handler [which in this case would presumably be "print an
    error message and go to the end of this module"]. It's easier
    to write one simple trap handler than to add checks to all the
    modules as and when you import them. Shell scripts [eg] can use
    the "trap" command; C programs can use the stuff defined [eg]
    in N1570 section 7.14 and/or Annex H [on "language independent
    arithmetic", which might interest James].

    [...]
         It /might/ be harmless.  How would anyone know?  You asked
    the computer to work out, in your example, "a*10+c", and it failed
    to do so.  If you don't care what "a*10+c" is, why is that code in
    there in the first place?
    So we're back to this. a*10+c isn't some theoretical mathematical
    term which is always going to have some exact numeric result, so long
    as you don't actually have to evaluate it.
    It's doing the calculation within the limitations of, say, the 64-bit
    ALU of a processor which represents integers using two's complement.
    Then how it behaves near the boundaries of those limitations, how
    much it insulates those realities from the user, is a choice
    involving both the language and application.

    Yes, but when "a*10+c" overflows, you [apparently] aren't
    checking, nor are you trapping the operations, so your users have
    no idea that their answers are completely bogus. Yes, it's your
    choice; I hope your users know what your decision was, and know
    not to use your code to work out their payroll ....

    If people complain that 'print 2**3**4' displays 0 instead of 2417851639229258349412352, then I just play the UB card like C does.

    C doesn't. Some implementations of C do. Guess what,
    those are the implementations that David and I complain about.
    C can be made to behave properly [see the above-mentioned parts
    of N1570, for example]; but it's harder than it should be.

    I understand that if /you/ were to implement a language, it would do
    things perfectly.

    Thanks for your confidence.

    It would display 2**3**4**5 exactly (even if it
    takes a while, or exhausts your machine's memory).

    Doesn't need me to implement anything:

    $ a68g -p 2**3**4**5
    a68g: runtime error: 1: INT math error [...]
    $ a68g -p "LONG 2**3**4**5"
    +1152921504606846976

    I tend to cut a
    few corners.

    Yes, we'd noticed. Nothing wrong with cutting corners
    as long as everyone concerned knows that's how you operate and
    is happy that the computing work they are paying for is of that
    standard.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Forbes

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Thu Oct 14 23:10:40 2021
    On 14/10/2021 00:33, Bart wrote:
    [I wrote:]
    [...] Traditional mathematics has no equivalent, as
    exponentiation is normally indicated by indexes, not by operators,
    and many languages don't implement exponentiation operators either
    [preferring to use functions instead] so there is no solid general
    prior expectation to guide us.
    I think it has. The operators are implicit: 3x² has always been
    3*(x**2) or 3*(x^2).

    That doesn't help to determine what "a**b**c" ought to
    mean, as that expression is not traditional mathematics and has
    no equivalent therein. Maths instead uses a hierarchy of super-
    and sub-scripts using smaller or larger type, which supplies the disambiguation.

    If I type 2**3**4 or 2^3^4 into Google, it gives me the larger value
    (so using using right-to-left precedence) rather than the smaller.
    So that appears to be the expectation these days.
    Also, if I do:
    print *, 2**3**4
    in Fortran (via rextester.com) it appears to parse as 2**(3**4) too.
    Fortran is even older than Algol (I don't know if at some point it
    did something different; but it's unlikely they would have changed
    it).

    Very possibly, but irrelevant. In the early days of
    computing, there were no accepted standards, and language
    designers made their own decisions, so there are examples
    of L->R associativity and of R->L associativity. There are
    payoffs either way. Early Algols and derivatives chose to
    go for the simpler syntax. By Algol 68 it would have been
    very difficult [and error prone] to define operators with
    variable associativity, so "a**b**c" virtually has to
    associate the same way as "a-b-c" [which /does/ have a long-
    established traditional meaning], bearing in mind that all
    A68 operators are defined via a simple uniform syntax.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Forbes

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Thu Oct 14 23:50:24 2021
    On 14/10/2021 21:31, Andy Walker wrote:
    On 13/10/2021 12:01, Bart wrote:
    No need. For turning text into binary, the necessary checks can be
    done on the string.
         In which case the trap won't be sprung.  It costs you nothing,
    so you don't need even to think about it.
    You do need to think about it. My apps used to consist of a compiled
    main application, with hot-loaded bytecode modules to execute
    user-commands.
    If a module went wrong [...].
    What you don't want is some silly overflow in a module (which may be
    written in this scripting language by a third party), from bringing
    down the entire application, and losing the user's work for that
    day.

        Repeat -- /if/ you have done the checks, /then/ the trap
    won't be sprung [and it has cost you nothing and it doesn't matter
    what would have happened if the trap had been sprung].  If your
    users care what their work actually does, then an overflow is not
    "silly", it means they are getting the wrong answers.  It must be
    a bizarre application if getting wrong answers doesn't matter.
    Otherwise, you either [again] build in the checks, and the trap
    won't be sprung, /or/ you spend a few moments /once/ writing a
    trap handler [which in this case would presumably be "print an
    error message and go to the end of this module"].  It's easier
    to write one simple trap handler than to add checks to all the
    modules as and when you import them.  Shell scripts [eg] can use
    the "trap" command;  C programs can use the stuff defined [eg]
    in N1570 section 7.14 and/or Annex H [on "language independent
    arithmetic", which might interest James].

    [...]
         It /might/ be harmless.  How would anyone know?  You asked
    the computer to work out, in your example, "a*10+c", and it failed
    to do so.  If you don't care what "a*10+c" is, why is that code in
    there in the first place?
    So we're back to this. a*10+c isn't some theoretical mathematical
    term which is always going to have some exact numeric result, so long
    as you don't actually have to evaluate it.
    It's doing the calculation within the limitations of, say, the 64-bit
    ALU of a processor which represents integers using two's complement.
    Then how it behaves near the boundaries of those limitations, how
    much it insulates those realities from the user, is a choice
    involving both the language and application.

        Yes, but when "a*10+c" overflows, you [apparently] aren't
    checking, nor are you trapping the operations, so your users have
    no idea that their answers are completely bogus.  Yes, it's your
    choice;  I hope your users know what your decision was, and know
    not to use your code to work out their payroll ....

    If people complain that 'print 2**3**4' displays 0 instead of
    2417851639229258349412352, then I just play the UB card like C does.

        C doesn't.  Some implementations of C do.  Guess what,
    those are the implementations that David and I complain about.
    C can be made to behave properly [see the above-mentioned parts
    of N1570, for example];  but it's harder than it should be.

    I understand that if /you/ were to implement a language, it would do
    things perfectly.

        Thanks for your confidence.

                It would display 2**3**4**5 exactly (even if it
    takes a while, or exhausts your machine's memory).

        Doesn't need me to implement anything:

      $ a68g -p 2**3**4**5
      a68g: runtime error: 1: INT math error [...]
      $ a68g -p "LONG 2**3**4**5"
                      +1152921504606846976

                                 I tend to cut a
    few corners.

        Yes, we'd noticed.  Nothing wrong with cutting corners
    as long as everyone concerned knows that's how you operate and
    is happy that the computing work they are paying for is of that
    standard.

    Actually I mostly used floats in my main application (2D and 3D drafting
    and modelling). (That is, float64.)

    Those have a different set of characteristics, and are known to be
    generally approximations. So overflow might yield Infinity, which can be
    tested for.

    One of my programs was used for payroll too, where I used floats as
    well, although now I'd use my decimal float type. (Which doesn't have
    any limits to range or precision that would apply to a real-world
    calculation.)

    When it comes to integers, both languages offer int64 as minimum (many languages are still stuck on int32). The next step after that is int128
    on one, or the aforementioned big-decimal on the other. So plenty of
    capacity for real-world calculations without running into overflow, eg. counting the number of views of a youtube video...
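
    (For scale: int64 tops out at 9223372036854775807, about 9.2e18, so
    view counts in the billions leave roughly nine orders of magnitude of
    headroom.)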

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to Andy Walker on Fri Oct 15 00:32:32 2021
    On 14/10/2021 21:31, Andy Walker wrote:
    On 13/10/2021 12:01, Bart wrote:

    I understand that if /you/ were to implement a language, it would do
    things perfectly.

        Thanks for your confidence.

                It would display 2**3**4**5 exactly (even if it
    takes a while, or exhausts your machine's memory).

        Doesn't need me to implement anything:

      $ a68g -p 2**3**4**5
      a68g: runtime error: 1: INT math error [...]
      $ a68g -p "LONG 2**3**4**5"
                      +1152921504606846976

    With the correct grouping, I get an overflow error:

    a68 -p "LONG 2**(3**(4**5))"

    .... (result too large) ....

    If I try to work that out on Python, eventually it says:

    MemoryError

    This calculation is equivalent to:

    2**37339184874102004353295975418486658822540977678373400775063693172207904
    06172652512299936889388039772204687650654314751581087270545921608585813513
    36982809187314191748594262580938807019951956404285571818041046681288797402
    92551766801234061729839657473161915238672304623512593489605859058828465479
    35405059362023765478074427305821445270589887562514528177934133521419207446
    23027518729185432862375737063985485319476416926263819972887006907013899256
    524297198527698749274196276811060702333710356481

    I believe this needs 3.7e488 bits to represent; my machine contains only
    7e10 bits of memory.
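
    (Checking: 4**5 = 1024, and 3**1024 is about 3.7e488, so 2**(3**1024)
    does indeed need about 3.7e488 bits.)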

    But assume a calculation which is large, but fits into memory. If
    these are compile-time expressions in a language with the requisite
    types, how much time should a compiler spend evaluating them; how big
    should a binary be allowed to get to represent these reduced/expanded calculations?

    This is why I don't do bignum reduction in a compiler. At runtime, it
    will only spend time on the values it will actually encounter. Also why
    I don't evaluate things like "A"*1 billion at compile-time; each
    expression would take 6 seconds to work out, need 1GB of memory, and would
    add 1GB to the binary. And there might be hundreds.

    Yet at runtime, it might not even be used.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Andy Walker on Sat Oct 23 11:18:45 2021
    On 14/10/2021 21:31, Andy Walker wrote:

    ...

    C programs can use the stuff defined [eg]
    in N1570 section 7.14 and/or Annex H [on "language independent
    arithmetic", which might interest James].

    Yes, interesting stuff which I didn't know about. The wording is largely
    for 'language lawyers', though. I'd rather specify such operations in a
    more user-friendly way that would work for everyone including
    non-mathematician newbie programmers.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Sat Oct 23 12:05:17 2021
    On 22/08/2021 14:50, David Brown wrote:
    On 21/08/2021 21:31, James Harris wrote:

    ...

    Isn't it odd to have the maximum size of a file decided by a language's
    implementation (or an OS's implementation)?

    It is the OS's choice, not the language.

    What I am saying is that it should be neither. The max file size is
    nothing to do with either language or OS but a property of a filesystem.

    What you are thinking of is an OS /imposing/ a limit. That's fine;
    that's what OSes tend to do. And their ABIs expose that OS-decided limit
    to programs by including it in their API calls.

    But OSes do not need to do that. In fact, it's a bad design to do so.

    ...

    If an OS does not support file systems that can handle files bigger than
    4 GB, why should it make every file operation bigger and slower with a pointlessly large "off_t" size?

    It shouldn't. Making off_t larger and larger just makes the problem
    worse - like getting the largest and heaviest hammer possible and using
    that one hammer to knock in every nail.
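
    Worth noting that POSIX already exposes the per-filesystem property
    being argued for here: pathconf() with _PC_FILESIZEBITS reports how
    many bits a file size needs on the filesystem holding a given path.
    A minimal sketch (the printed wording is mine):

      #include <stdio.h>
      #include <unistd.h>

      int main(int argc, char **argv)
      {
          const char *path = argc > 1 ? argv[1] : ".";
          /* Minimum bits needed to hold, as a signed integer, the largest
             file size on the filesystem containing 'path'. */
          long bits = pathconf(path, _PC_FILESIZEBITS);
          if (bits == -1)
              perror("pathconf");
          else
              printf("%s: file sizes need up to %ld bits\n", path, bits);
          return 0;
      }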

    ...

    Put another way, the C/Posix or whatever concept of off_t seems to me to
    be broken.


    It only seems that way because you don't understand it.


    That's rich coming from you.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Bart on Sat Oct 23 11:37:58 2021
    On 09/09/2021 13:43, Bart wrote:
    On 09/09/2021 11:57, James Harris wrote:

    ...

    1. create a .c file with the required #includes
    2. run it through the target's preprocessor
    3. parse the output to extract the data I needed
    4. store the extracted data in a configuration file for the target
    5. use the configuration file to set up my own types and structures
    for the target environment.

    Further, since I may not even have access to a given target
    environment, if the above process was unable to parse anything it
    needed to I'd have the parser produce a report of what it could not
    handle for sending back to me so I could update the parser or take
    remedial steps.

    At the end of the day, I thought you were lamenting that there's no
    master config file and all info is in C headers. The above steps are
    intended to remedy that and create the master config file I thought
    you wanted.

    In general, the process is non-trivial, even if you have a C compiler
    that can successfully process the headers (which itself can be
    problematical as it can have extra dependencies).

    For example, at some point, something is defined in terms of C
    executable code, and not declarations. Now you are having to translate
    chunks of program code.

    The goal, AISI, is not to translate everything but to /extract/ info you
    need for your 'master config file' such as the stat table layout.


    It is also an unreasonable degree of effort. I think most people would
    rather buy fish in a supermarket than having to lease a North Sea
    trawler to find it themselves!

    That hardly applies. The reality is that you have systems for which
    there is no master config file. None. Nada. It does not exist.

    Yet if someone has created a C implementation for that environment then
    there is hope that with a bit of work you could use C to extract the
    info you need. Blaming C is like blaming the messenger.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Sat Oct 23 12:44:14 2021
    On 29/08/2021 22:24, David Brown wrote:
    On 29/08/2021 19:21, James Harris wrote:
    On 29/08/2021 15:50, David Brown wrote:
    On 29/08/2021 13:47, James Harris wrote:

    ...

    The limit of a file's size would naturally be defined by the filesystem
    on which it was stored or on which it was being written. Such a value
    would be known by, and a property of, the FS driver.


    "Proof by repetitive assertion" is not convincing.

    There's nothing to prove. It is simply factual (and well known) that
    different filesystems have different maximum file sizes. FAT12 has
    different limits from FAT32, for example. Ergo, the maximum permitted
    file size /is/ a natural property of the formatted filesystem. I guess
    that's repetitive again but I cannot imagine what you think would need
    to be added to that to establish the point.

    Of course different filesystems have different maximum file sizes - no
    one has disputed that! All that is in dispute is your silly idea that
    the /OS/ does not have limits here.

    I never said that many current OSes do not have limits. It might help if
    you understood a proposal before disagreeing with it!

    Many or most OSes may impose limits but they don't have to: such a
    design is not mandatory.

    Many of those OSes need to be shut down and restarted in order to apply updates. That may be what most people are familiar with but, again, OSes
    do not have to work that way. I would pejoratively call them toy OSes
    because they are not well designed. Toys can be commercially successful
    but that does not make them machines.

    To be clear about where I am coming from, AISI one should be able to
    start an OS today and leave it running for years, including adding
    future filesystems to it without shutting it down.

    More important, here, an /application/ which uses file offsets should
    not have to be recompiled - or even restarted - when new filesystems
    (which have larger max file sizes) are released.

    ...

    What systems use file sizes that are smaller than "types smaller than
    32-bit" ?

    I thought you were the microcontroller man!

    I am. What type would you use for file sizes here?

    I'd point at them. For example,

    file_seek(file, &offsetp, whence)

    ...

    Something Dmitry said makes, I think, this easier to explain. In an OO
    language you could think of the returns from your functions as objects.
    The objects would be correctly sized for the filesystems to which they
    related. They could have different classes but they would all respond
    polymorphically to the same methods. The class of each would know the
    maximum permitted offset.


    Right. Arbitrary precision integers. They are written in a nice OO
    language so that the language hides the ugly mechanics of allocations, deallocations, memory management, etc., and gives you operator overloads instead of ugly prefix function calls or macros for everything.

    But they are still arbitrary precision integers if you refuse to allow limits. And they are still massively less efficient than a simple large integer of fixed size.

    To be clear, AISI offsets would not be bignums where the size of each
    varies dynamically. Instead, each would have a fixed width and that
    width would not change.
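
    A purely illustrative C sketch of that (every name here is invented):

      #include <stdint.h>

      /* An offset whose width is fixed once, by the filesystem driver
         that creates it, rather than by a global off_t.  A FAT12 driver
         might hand out 4-byte offsets, a future filesystem 16-byte ones;
         the width of any given offset never changes. */
      typedef struct {
          uint8_t       width;      /* bytes in use, set by the driver */
          unsigned char body[16];   /* magnitude, low byte first       */
      } fs_offset;

      /* The "class" knows the maximum permitted offset for its
         filesystem, and the methods respond polymorphically. */
      typedef struct {
          fs_offset max_offset;
          int (*seek)(void *file, const fs_offset *off, int whence);
      } fs_offset_class;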

    ...

    Again, I have to disagree. The question is: What defines how large a
    file's offset can be?

    The answer is just as simple: Each filesystem has its own range of max >>>> sizes.


    Your concept of "basic engineering" is severely lacking here.

    Oh? What, specifically, is lacking?


    Supposing someone asked you to build a bridge for a two-lane road
    passing over a river. Basic engineering is to look at the traffic on
    the road, the size of the crossing, and figure out a reasonable maximum
    load weight that could realistically be on the bridge at any given time.
    Then you extrapolate for future growth based on the best available data
    and predictions. Then you multiply and add in safety factors. You tell
    the town planners that the bridge should have a total weight limit of,
    say, 100 tons and you tell them the price.

    That is basic engineering.

    :-)

    Your analogy is inapt. A closer fit to what we have been talking about
    is you building a 4-lane highway to every island regardless of how large
    or small the island is. Your bridges would be too small for some islands
    and too big for others.

    I am suggesting each bridge should suit the use case.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Sat Oct 23 14:20:15 2021
    On 23/10/2021 11:18, James Harris wrote:
    [I wrote, a propos traps:]
    C programs can use the stuff defined [eg]
    in N1570 section 7.14 and/or Annex H [on "language independent
    arithmetic", which might interest James].
    Yes, interesting stuff which I didn't know about. The wording is
    largely for 'language lawyers', though. I'd rather specify such
    operations in a more user-friendly way that would work for everyone
    including non-mathematician newbie programmers.

    You seem to be confusing a language specification with a
    language primer or even advanced text. The specification must be
    sufficiently precise that no two civilised readers can disagree
    about what it means; ordinary programmers who find that hard to
    understand can look instead at books that explain what is going
    on. You don't learn C from N1570; but you might write compilers
    from it.

    That's the mistake people made with Algol 68; they took
    one look at the Report, and instead of looking at the examples
    in Chapter 11 [which were largely the same as the examples in
    the Algol 60 Reports, but looked nicer and more straightforward]
    they dived straight in to hypernotions and paranotions and such-
    like. Those are completely impenetrable to almost everyone, and
    it was assumed that the impenetrability was part of the language.

    Sadly, since then, few people have even tried to define
    languages formally. The result is all too predictable. C is
    defined better than most, and has gone through several revisions;
    but still there are debates in "c.s.c" about what various things
    mean, the most recent being what N2596 means for initialising
    anonymous structures and unions. Most languages are defined
    largely by "this is what loops [or whatever] look like, and
    use your common sense to work out what it all means".

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Boccherini

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Andy Walker on Sat Oct 23 16:06:09 2021
    On 23/10/2021 14:20, Andy Walker wrote:
    On 23/10/2021 11:18, James Harris wrote:
    [I wrote, a propos traps:]

    C programs can use the stuff defined [eg]
    in N1570 section 7.14 and/or Annex H [on "language independent
    arithmetic", which might interest James].

    Yes, interesting stuff which I didn't know about. The wording is
    largely for 'language lawyers', though. I'd rather specify such
    operations in a more user-friendly way that would work for everyone
    including non-mathematician newbie programmers.

        You seem to be confusing a language specification with a
    language primer or even advanced text.  The specification must be sufficiently precise that no two civilised readers can disagree
    about what it means;  ordinary programmers who find that hard to
    understand can look instead at books that explain what is going
    on.  You don't learn C from N1570;  but you might write compilers
    from it.

    Don't get me wrong. I'm a fan of what is probably now an old-fashioned
    idea of there being an instruction manual and a reference manual. It's
    just that IMO the reference manual should be for programmers rather than
    for compiler writers.

    Whether a reference manual could be good enough for compiler writers as
    well as programmers is, for me, an open question.

    ...

        Sadly, since then, few people have even tried to define
    languages formally.

    That's surprising. I would have thought that all languages needed a
    formal definition.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to All on Sat Oct 23 22:18:38 2021
    On 23/10/2021 16:06, James Harris wrote:
    [I wrote:]
         You seem to be confusing a language specification with a
    language primer or even advanced text.  The specification must be
    sufficiently precise that no two civilised readers can disagree
    about what it means;  ordinary programmers who find that hard to
    understand can look instead at books that explain what is going
    on.  You don't learn C from N1570;  but you might write compilers
    from it.
    Don't get me wrong. I'm a fan of what is probably now an
    old-fashioned idea of there being an instruction manual and a
    reference manual. It's just that IMO the reference manual should be
    for programmers rather than for compiler writers.

    I didn't say "reference manual", but "specification". If a
    language is not precisely defined, then different compilers will
    either disagree despite conforming to the spec, or else [and just
    as bad] there will be a consensus among compilers that is merely
    "folklore" about what the language means and that cannot be found
    out by programmers who want/need to know. It's that definition
    that compiler writers need to have access to.

    Whether a reference manual could be good enough for compiler writers
    as well as programmers is, for me, an open question.

    If the reference manual is /not/ good enough, then a better
    one is needed, otherwise what are compiler writers supposed to use?
    I don't see why that's "open", other than for toy/private languages.
    IOW, you can do what you like with your own language, but languages
    like C, Fortran, Algol, ... that aspire to a degree of universality
    need a reliable definition. [To some extent, "C is what runs on
    DMR's computer" is such a definition, but it leaves ambiguous
    whether some features are as they are because DMR made arbitrary
    choices for his own compiler or because they are the essence of C.]

         Sadly, since then, few people have even tried to define
    languages formally.
    That's surprising. I would have thought that all languages needed a
    formal definition.

    Well, as above, toy/private languages don't; but yes, it
    should be a "sine qua non" of serious languages that there be a
    formal definition. Up to 1968, that was the way we were headed;
    but reaction to Algol 68 was sufficiently negative in this area
    [unfairly] that no major language since has [AFAIK] gone down a
    similar route. The C and C++ standards are about as close as we
    have come. VDL was touted for this purpose at one stage, but it
    never really caught on, and in any case it's nowhere near as
    precise as a two-level grammar*. Perhaps worth noting that a
    formal definition need not imply that everything is tied down
    with all the t's and i's crossed and dotted; rather, that the
    definition is clear about what is left to the implementation
    [eg, the order of evaluation of operands].

    ____
    * Other formal techniques are available.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Boccherini

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Andy Walker on Mon Oct 25 14:22:02 2021
    On 23/10/2021 22:18, Andy Walker wrote:
    On 23/10/2021 16:06, James Harris wrote:
    [I wrote:]
         You seem to be confusing a language specification with a
    language primer or even advanced text.  The specification must be
    sufficiently precise that no two civilised readers can disagree
    about what it means;  ordinary programmers who find that hard to
    understand can look instead at books that explain what is going
    on.  You don't learn C from N1570;  but you might write compilers
    from it.
    Don't get me wrong. I'm a fan of what is probably now an
    old-fashioned idea of there being an instruction manual and a
    reference manual. It's just that IMO the reference manual should be
    for programmers rather than for compiler writers.

        I didn't say "reference manual", but "specification".

    I know. I brought up the topic of a reference manual.

    If a
    language is not precisely defined, then different compilers will
    either disagree despite conforming to the spec, or else [and just
    as bad] there will be a consensus among compilers that is merely
    "folklore" about what the language means and that cannot be found
    out by programmers who want/need to know.  It's that definition
    that compiler writers need to have access to.

    Whether a reference manual could be good enough for compiler writers
    as well as programmers is, for me, an open question.

        If the reference manual is /not/ good enough, then a better
    one is needed, otherwise what are compiler writers supposed to use?

    Are you arguing for there being a reference manual and a specification,
    or just one text which is good enough for both purposes (programmer's
    reference and compiler-writer's specification)? By 'one text' I don't
    mean one book with both sections but a single exposition which serves
    both purposes.

    One thing I am convinced of is that it helps a human if a text follows
    the pattern: principle, examples and maybe rationale.


    I don't see why that's "open", other than for toy/private languages.
    IOW, you can do what you like with your own language, but languages
    like C, Fortran, Algol, ... that aspire to a degree of universality
    need a reliable definition.  [To some extent, "C is what runs on
    DMR's computer" is such a definition, but it leaves ambiguous
    whether some features are as they are because DMR made arbitrary
    choices for his own compiler or because they are the essence of C.]

    Have to say the language specifications I've seen are not easy to read.
    I do wonder if the same information could be written in an easier form.
    AISI a compiler writer wants to know what's required at each point in
    his code but I am not sure that specifications are written that way.
    I've never tried to write a compiler from a spec, but I get the
    impression that to write any one piece of compiler code one would
    have to take into account points scattered across the specification
    rather than just those in the one obviously relevant place. That's
    not much help to a compiler writer.
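
    A one-line C example (hypothetical, not from the thread) shows
    the kind of cross-referencing meant here: to compile the addition
    below correctly, a compiler writer must combine the integer
    promotions, the usual arithmetic conversions and the additive-
    operator rules, which sit in three separate places in N1570
    (6.3.1.1, 6.3.1.8 and 6.5.6 respectively).

        #include <stdio.h>

        int main(void)
        {
            unsigned char a = 200, b = 100;
            /* Both operands are promoted to int before the
               addition, so this prints 300, not the wrapped
               8-bit value 44.                                */
            printf("%d\n", a + b);
            return 0;
        }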


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to James Harris on Mon Oct 25 21:37:00 2021
    On 25/10/2021 14:22, James Harris wrote:
    [...] It's just that IMO the reference manual should be
    for programmers rather than for compiler writers.
         I didn't say "reference manual", but "specification".
    I know. I brought up the topic of a reference manual.

    Yes, but the point is that [for a professional/portable
    language] there absolutely needs to be a specification. If that
    can also serve as a reference manual [it often can], so much the
    better. But, eg, a reference manual might describe only some
    particular implementation, which might be enough for users of
    that implementation but not for compiler writers for different
    implementations.

    [...]
    Whether a reference manual could be good enough for compiler writers
    as well as programmers is, for me, an open question.
         If the reference manual is /not/ good enough, then a better
    one is needed, otherwise what are compiler writers supposed to use?
    Are you arguing for there being a reference manual and a
    specification, or just one text which is good enough for both
    purposes (programmer's reference and compiler-writer's
    specification)? By 'one text' I don't mean one book with both
    sections but a single exposition which serves both purposes.

    Well, there is in general no compelling reason why a
    specification should not also be a reference manual. Perhaps
    worth noting, though, that if I want to know what some code
    does, and don't understand what the reference manual says, I
    can run the code and find out; the compiler writer has no
    such recourse.
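
    For instance (a trivial sketch of that "run it and see" recourse):
    a programmer unsure how C rounds integer division with a negative
    operand can just run the program below and read the output; the
    compiler writer must instead extract the answer from the standard
    (C99 and later require truncation toward zero).

        #include <stdio.h>

        int main(void)
        {
            /* Running this answers the question empirically:
               -7 / 2 is -3 and -7 % 2 is -1 under the
               truncate-toward-zero rule.                     */
            printf("-7 / 2 = %d\n", -7 / 2);
            printf("-7 %% 2 = %d\n", -7 % 2);
            return 0;
        }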

    One thing I am convinced of is that it helps a human if a text
    follows the pattern: principle, examples and maybe rationale.

    Sure. You're simply saying that some textbooks are
    not very good; about which there is no dispute.

    Have to say the language specifications I've seen are not easy to
    read. I do wonder if the same information could be written in an
    easier form.

    Very possibly. But a compiler writer has to put some
    effort in. If you want to write [eg] C, you only need to know
    how to write the bits you actually want to use; if you want to
    write a C compiler, you need to understand the whole language.
    [As per my PP, if you want to write your own language, all bets
    are off; I'm talking about manuals and compilers for languages
    intended for widespread use.]

    AISI a compiler writer wants to know what's required at
    each point in his code but I am not sure that specifications are
    written that way. I've never tried to write a compiler from a spec
    but I get the impression that to write any piece of compiler code one
    would have to take into account points from across the specification
    rather than those in the one relevant place. That's not much help to
    a compiler writer.

    If you want to write a C compiler, then you ultimately
    need to understand the whole of the C standard. The C standard
    is not particularly hard to read. OTOH, it is also not as good
    a definition of C as it really ought to be -- witness the
    discussions in "c.s.c". For C, substitute any major language
    of your choice.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Soler

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)