• Re: Type qualifiers, declaration aliases and namespaces

    From James Harris@21:1/5 to David Brown on Fri Nov 26 18:41:45 2021
    On 21/08/2021 19:11, David Brown wrote:
    On 21/08/2021 11:32, James Harris wrote:
    On 20/08/2021 19:50, David Brown wrote:

    ...

    I'd recommend looking at C++ templates.  You might not want to follow
    all the details of the syntax, and you want to look at the newer and
    better techniques rather than the old ones.  But pick a way to give
    compile-time parameters to types, and then use that - don't faff around
    with special cases and limited options.  Pick one good method, then you
    could have something like this :

        builtin::int<32> x;
        using int32 = builtin::int<32>;
        int32 y;

    That is (IMHO) much better than your version because it will be
    unambiguous, flexible, and follows a syntax that you can use for all
    sorts of features.

    My version of that would be

      typedef i32 = int 32

      int 32 x
      i32 y


    Punctuation here is not /necessary/, but it would make the code far
    easier to read, and far safer (in that mistakes are more likely to be
    seen by the compiler rather than being valid code with unintended meaning).

    Noted.


    Your C++ version doesn't seem to be any more precise or flexible.

    What happens when you have a type that should have two parameters - size
    and alignment, for example? Or additional non-integer parameters such
    as signedness or overflow behaviour? Or for container types with other
    types as parameters? C++ has that all covered in a clear and accurate
    manner - your system does not.

    That's not wholly true. Specific terms and syntax are not yet decided
    but I do have the concept of qualifiers. For example,

    int 32 x
    int 32 alignbits 3 y

    In that, y would be required to be aligned such that the bottom 3 bits
    of its address were zero.
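
    For comparison, C11 can express the same constraint, though it counts
    bytes rather than address bits; a minimal sketch, assuming int32_t as
    the stand-in for the int 32 above:

        #include <stdalign.h>   /* C11 alignas */
        #include <stdint.h>
        #include <assert.h>

        int main(void) {
            /* 8-byte alignment, i.e. the bottom 3 bits of &y are zero */
            alignas(8) int32_t y;
            assert(((uintptr_t)&y & 7) == 0);
            return 0;
        }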

    However, the syntax is not yet chosen and if, as you suggest, the use
    of punctuation would not be onerous, I would prefer the addition of the
    colon, as in

    int 32: x
    int 32 alignbits 3: y

    The additional colon would make parsing by compiler and by human easier.
    I have omitted it up until now as I could imagine that programmers would
    not want to have to type it in simple declarations such as

    int: i

    but maybe that doesn't look too bad.


    My intention here is to encourage you to think bigger. Stop thinking
    "how do I make integer types?" - think wider and with greater generality
    and ambition. Make a good general, flexible system of types, and then
    let your integer types fall naturally out of that.

    The goal is that range specifications would apply anywhere relevant, not
    just for integers. For example,

    array (5..15) floati32: v

    would declare an array of between 5 and 15 elements of type floati32.
    One might use that as a parameter declaration to require that what gets
    passed in has to be an array which matches within certain size limits.
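
    C99 has only a partial analogue; a hedged sketch, with float standing
    in for floati32, since C can state the lower bound in the parameter's
    type but not the upper one:

        #include <assert.h>
        #include <stddef.h>

        /* [static 5] promises the caller passes at least 5 valid
           elements; the upper bound of 15 cannot be stated in the type,
           so it is checked at run time instead. */
        float sum(float v[static 5], size_t n) {
            assert(n >= 5 && n <= 15);
            float total = 0.0f;
            for (size_t i = 0; i < n; i++)
                total += v[i];
            return total;
        }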

    I take your point on board but I don't know that my syntax can be made
    more general without getting in to either variants or generics.

    ...

    Perhaps even look at the metaclasses proposal <https://www.fluentcpp.com/2018/03/09/c-metaclasses-proposal-less-5-minutes/>.
    This will not be in C++ before C++26, maybe even later, but it gives a whole new way of building code. If metaclasses had been part of C++
    from the beginning, there would be no struct, class, enum, or union in
    the language - these would have been standard library metaclasses. They
    are /that/ flexible.

    I've been looking at some material by one of the proposers, Herb Sutter.
    A proper discussion about such things needs a topic of its own but I
    should say here that my fundamental reaction is not favourable. He
    effectively acknowledges that the more such proposals complicate a
    language the more a programmer has to depend on tools to help understand program code and that's a step too far for me at present.

    One of Sutter's justifications for metaclasses is that they help remove
    a lot of boilerplate from C++ code and, on that, ISTM that the problem
    may be the presence of the boilerplate in the first place. I don't know
    enough C++ but ATM I am not sure I'd have the boilerplate to remove.

    As a wider point, I've previously seen Python become more 'clever'. And
    it has become less comprehensible as a result. C++ is already alarmingly complex. Sutter's proposals would make it more so. In both cases ISTM
    that people who are unhealthily immersed in the language (Python, C++)
    see ways to be even cleverer and that makes a language worse. I believe
    similar has happened with Ada and JavaScript. C, by contrast, despite
    some enhancements has remained pleasantly small and focussed.

    I'll keep metaclasses in mind but ATM they are in the bucket with other
    ideas for code customisation such as those you list here:

    Ultimately, things like macros, templates, generics, metafunctions,
    etc., are just names for high-level compile-time coding constructs.


    --
    James Harris

  • From David Brown@21:1/5 to James Harris on Sat Nov 27 16:17:41 2021
    On 26/11/2021 19:41, James Harris wrote:
    On 21/08/2021 19:11, David Brown wrote:
    On 21/08/2021 11:32, James Harris wrote:
    On 20/08/2021 19:50, David Brown wrote:

    ...

    I'd recommend looking at C++ templates.  You might not want to follow
    all the details of the syntax, and you want to look at the newer and
    better techniques rather than the old ones.  But pick a way to give
    compile-time parameters to types, and then use that - don't faff around
    with special cases and limited options.  Pick one good method, then you
    could have something like this :

         builtin::int<32> x;
         using int32 = builtin::int<32>;
         int32 y;

    That is (IMHO) much better than your version because it will be
    unambiguous, flexible, and follows a syntax that you can use for all
    sorts of features.

    My version of that would be

       typedef i32 = int 32

       int 32 x
       i32 y


    Punctuation here is not /necessary/, but it would make the code far
    easier to read, and far safer (in that mistakes are more likely to be
    seen by the compiler rather than being valid code with unintended
    meaning).

    Noted.


    Your C++ version doesn't seem to be any more precise or flexible.

    What happens when you have a type that should have two parameters - size
    and alignment, for example?  Or additional non-integer parameters such
    as signedness or overflow behaviour?  Or for container types with other
    types as parameters?  C++ has that all covered in a clear and accurate
    manner - your system does not.

    That's not wholly true. Specific terms and syntax are not yet decided
    but I do have the concept of qualifiers. For example,

      int 32 x
      int 32 alignbits 3 y

    In that, y would be required to be aligned such that the bottom 3 bits
    of its address were zero.


    Before you try to think out a syntax here, ask yourself /why/ someone
    would want this feature. What use is it? What are the circumstances
    when you might need non-standard alignment? What are the consequences
    of it? If this is something that is only very rarely useful (and I
    believe that is the case here - but /you/ have to figure that out for
    your language), there is no point going out of your way to make it easy
    to write. Common things should be easy to write - rare things can be
    hard to write. What you certainly don't want is a short but cryptic way
    to write it.

    So for me, your "alignbits 3" is just wrong - it makes no sense. You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment. Obviously I can figure out what you meant - there
    really isn't any other possibility for "alignbits 3". But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned declaration, not one with 256-byte alignment.

    However, the syntax is not yet chosen and if, as you suggest, the use
    of punctuation would not be onerous, I would prefer the addition of the
    colon, as in

      int 32: x
      int 32 alignbits 3: y


    The details here are a matter of taste, but you get my point about
    improving readability.

    The additional colon would make parsing by compiler and by human easier.
    I have omitted it up until now as I could imagine that programmers would
    not want to have to type it in simple declarations such as

      int: i

    but maybe that doesn't look too bad.


    I only know one person who regularly complains about having to use
    punctuation and finds it inconvenient to type symbols. But even he uses punctuation at times in his languages.

    (On the other hand, too much punctuation makes code harder to read and
    write. As with most things, you want a happy medium.)


    My intention here is to encourage you to think bigger.  Stop thinking
    "how do I make integer types?" - think wider and with greater generality
    and ambition.  Make a good general, flexible system of types, and then
    let your integer types fall naturally out of that.

    The goal is that range specifications would apply anywhere relevant, not
    just for integers. For example,

      array (5..15) floati32: v

    would declare an array of between 5 and 15 elements of type floati32.
    One might use that as a parameter declaration to require that what gets passed in has to be an array which matches within certain size limits.

    I take your point on board but I don't know that my syntax can be made
    more general without getting in to either variants or generics.


    Unless a language was just a simple, limited scripting tool, I would not
    bother making (or learning) a new language that did not have features
    such as variants or generics (noting that these terms are vague and mean different things to different people). There are perfectly good
    languages without such features. Given the vast benefits of C in terms
    of existing implementation, code, experience and information, why would
    anyone bother with a different compiled language unless it let them do
    things that you cannot easily do in C? Being able to make your own
    types, with their rules, invariants, methods, operators, etc., is pretty
    much a basic level feature for modern languages. Generic programming is standard. I would no longer consider these as advanced or complex
    features of a modern language, I'd consider them foundational.

    Note that I am /not/ saying you should copy C++'s templates, or Ada's
    classes. Your best plan is to learn from these languages - see what
    they can do. And then find a better, nicer, clearer and simpler way to
    get the same (or more) power. When you are starting a new language, you
    don't have to keep compatibility and build step by step over many years,
    you can jump straight to a better syntax.

    ...

    Perhaps even look at the metaclasses proposal
    <https://www.fluentcpp.com/2018/03/09/c-metaclasses-proposal-less-5-minutes/>.

      This will not be in C++ before C++26, maybe even later, but it gives a
    whole new way of building code.  If metaclasses had been part of C++
    from the beginning, there would be no struct, class, enum, or union in
    the language - these would have been standard library metaclasses.  They
    are /that/ flexible.

    I've been looking at some material by one of the proposers, Herb Sutter.
    A proper discussion about such things needs a topic of its own but I
    should say here that my fundamental reaction is not favourable. He effectively acknowledges that the more such proposals complicate a
    language the more a programmer has to depend on tools to help understand program code and that's a step too far for me at present.

    Remember that metaclasses are not for the "average" programmer. They
    are for the library builders and the language builders. A good
    proportion of modern C++ features are never seen or used by the majority
    of programmers, but they are used underneath to implement the features
    that /are/ used. Few C++ programmers really understand rvalue
    references and move semantics, but they are happy to see that the
    standard library container classes are now more efficient - without
    caring about the underlying language changes that allow those efficiency
    gains. Probably something like 99% of Python programmers have never
    even heard of metaclasses, yet they use libraries that make use of them.


    One of Sutter's justifications for metaclasses is that they help remove
    a lot of boilerplate from C++ code and, on that, ISTM that the problem
    may be the presence of the boilerplate in the first place. I don't know enough C++ but ATM I am not sure I'd have the boilerplate to remove.

    When starting with a language from scratch, you can avoid a fair amount
    of boilerplate that is necessary when features have evolved over time.
    But your language will either develop idioms that need boilerplate, or
    it will die out because no one uses it. (There are only two kinds of programming languages - the ones that people complain about, and the
    ones no one uses.)

    Metaprogramming and metaclasses do not /remove/ boilerplate code - they
    push it one level higher, so that fewer people need to make the
    boilerplate code and they need to make less of it.

    (Again, I am not saying that you should copy C++'s way of doing things,
    or Sutter's proposals here - just that you could learn from it and be
    inspired by it.)


    As a wider point, I've previously seen Python become more 'clever'. And
    it has become less comprehensible as a result. C++ is already alarmingly complex. Sutter's proposals would make it more so. In both cases ISTM
    that people who are unhealthily immersed in the language (Python, C++)
    see ways to be even cleverer and that makes a language worse. I believe similar has happened with Ada and JavaScript. C, by contrast, despite
    some enhancements has remained pleasantly small and focussed.


    C basically has not changed - the new features since C99 have been quite
    minor.

    I'll keep metaclasses in mind but ATM they are in the bucket with other
    ideas for code customisation such as those you list here:

    Ultimately, things like macros, templates, generics, metafunctions,
    etc., are just names for high-level compile-time coding constructs.



  • From Bart@21:1/5 to David Brown on Sat Nov 27 18:55:39 2021
    On 27/11/2021 15:17, David Brown wrote:
    On 26/11/2021 19:41, James Harris wrote:

    That's not wholly true. Specific terms and syntax are not yet decided
    but I do have the concept of qualifiers. For example,

      int 32 x
      int 32 alignbits 3 y

    In that, y would be required to be aligned such that the bottom 3 bits
    of its address were zero.


    Before you try to think out a syntax here, ask yourself /why/ someone
    would want this feature. What use is it? What are the circumstances
    when you might need non-standard alignment? What are the consequences
    of it? If this is something that is only very rarely useful (and I
    believe that is the case here - but /you/ have to figure that out for
    your language), there is no point going out of your way to make it easy
    to write. Common things should be easy to write - rare things can be
    hard to write.

    Yet C breaks that rule all the time. Just today I needed to type:

       unsigned char            to mean byte or u8
       unsigned long lont int   to mean u64 [typo left in]
       printf("....\n", ...)    to mean println ...

    Yes you could use uint8_t and uint64_t, but that still needs:

    #include <stdint.h>

    to be remembered and added at the top of every module


    So for me, your "alignbits 3" is just wrong - it makes no sense. You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment. Obviously I can figure out what you meant - there really isn't any other possibility for "alignbits 3". But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned declaration, not one with 256-byte alignment.

    I don't get it either, but I guess you're not complaining about having
    a way to control the alignment of a type, just that this one is not
    intuitive?

    In my assembler I use:

    align N

    to force alignment of next data/code byte at a multiple of N bytes,
    usually a power-of-two.

    My HLL doesn't have that, except that I once used @@ N to control the
    alignment of record fields (now I use a $caligned attribute for the
    whole record, as that was the only use for @@, to emulate C struct layout).

    The additional colon would make parsing by compiler and by human easier.
    I have omitted it up until now as I could imagine that programmers would
    not want to have to type it in simple declarations such as

      int: i

    but maybe that doesn't look too bad.


    I only know one person who regularly complains about having to use punctuation and finds it inconvenient to type symbols. But even he uses punctuation at times in his languages.

    Shifted punctuation is worse.

    (On the other hand, too much punctuation makes code harder to read and
    write. As with most things, you want a happy medium.)


    My intention here is to encourage you to think bigger.  Stop thinking
    "how do I make integer types?" - think wider and with greater generality >>> and ambition.  Make a good general, flexible system of types, and then
    let your integer types fall naturally out of that.

    The goal is that range specifications would apply anywhere relevant, not
    just for integers. For example,

      array (5..15) floati32: v

    would declare an array of between 5 and 15 elements of type floati32.

    (No. That's just not what anyone would guess that to mean. It looks like
    an array of length 11 indexed from 5 to 15 inclusive.

    It's not clear what the purpose of this is, or what a compiler is
    supposed to do with that info.)

    Unless a language was just a simple, limited scripting tool, I would not bother making (or learning) a new language that did not have features
    such as variants or generics (noting that these terms are vague and mean different things to different people). There are perfectly good
    languages without such features. Given the vast benefits of C in terms
    of existing implementation, code, experience and information, why would anyone bother with a different compiled language unless it let them do
    things that you cannot easily do in C? Being able to make your own
    types, with their rules, invariants, methods, operators, etc., is pretty
    much a basic level feature for modern languages. Generic programming is standard. I would no longer consider these as advanced or complex
    features of a modern language, I'd consider them foundational.

    That still leaves a big gap between C, and a language with all those
    advanced features, which probably cannot offer the benefits of small
    footprint, transparency, and the potential for a fast build process.

    Plus there are plenty of things at the level of C that some people (me,
    for a start) want but it cannot offer:

    * An alternative to that god-forsaken, error prone syntax
    * Freedom from case-sensitivity
    * 1-based arrays!
    * An ACTUAL byte/u8 type without all the behind-the-scenes
    nonsense, and the need for stdint/inttypes etc
    * 64-bit integer types as standard
    * A grown-up Print feature
    * etc etc

    What /are/ the actual alternatives available as the next C replacement;
    Rust and Zig? You're welcome to them!

  • From James Harris@21:1/5 to Bart on Sun Nov 28 10:11:21 2021
    On 27/11/2021 18:55, Bart wrote:
    On 27/11/2021 15:17, David Brown wrote:
    On 26/11/2021 19:41, James Harris wrote:

    ...

    My intention here is to encourage you to think bigger.  Stop thinking
    "how do I make integer types?" - think wider and with greater generality
    and ambition.  Make a good general, flexible system of types, and then
    let your integer types fall naturally out of that.

    The goal is that range specifications would apply anywhere relevant,
    not just for integers. For example,

       array (5..15) floati32: v

    would declare an array of between 5 and 15 elements of type floati32.

    (No. That's just not what anyone would guess that to mean. It looks like
    an array of length 11 indexed from 5 to 15 inclusive.

    It's not clear what the purpose of this is, or what a compiler is
    supposed to do with that info.)

    It's not meant to be a feature. It's the consequence of trying to be consistent: allowing the parameters of parameters (if you see what I
    mean) to be qualified whether the parameters are integers or arrays or
    whatever else. I was pointing out to David that I didn't have a special
    syntax just for integers.

    I may eventually limit what a programmer could do (for
    comprehensibility, perhaps!) but for now ISTM best to keep features
    orthogonal and universal, even if the combination thereof looks strange
    at first.


    --
    James Harris

  • From James Harris@21:1/5 to David Brown on Sun Nov 28 09:33:14 2021
    On 27/11/2021 15:17, David Brown wrote:
    On 26/11/2021 19:41, James Harris wrote:
    On 21/08/2021 19:11, David Brown wrote:

    ...

    What happens when you have a type that should have two parameters - size
    and alignment, for example?  Or additional non-integer parameters such
    as signedness or overflow behaviour?  Or for container types with other
    types as parameters?  C++ has that all covered in a clear and accurate
    manner - your system does not.

    That's not wholly true. Specific terms and syntax are not yet decided
    but I do have the concept of qualifiers. For example,

      int 32 x
      int 32 alignbits 3 y

    In that, y would be required to be aligned such that the bottom 3 bits
    of its address were zero.


    Before you try to think out a syntax here, ask yourself /why/ someone
    would want this feature. What use is it? What are the circumstances
    when you might need non-standard alignment? What are the consequences
    of it? If this is something that is only very rarely useful (and I
    believe that is the case here - but /you/ have to figure that out for
    your language), there is no point going out of your way to make it easy
    to write. Common things should be easy to write - rare things can be
    hard to write. What you certainly don't want is a short but cryptic way
    to write it.

    Agreed, and it may change. It's just that for now I have qualifiers
    after the base type and if there were a need to align a type then that's
    where that particular qualifier would be put. In practice, alignment is
    more likely to apply to structures/records than to integers.


    So for me, your "alignbits 3" is just wrong - it makes no sense. You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment. Obviously I can figure out what you meant - there really isn't any other possibility for "alignbits 3". But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned declaration, not one with 256-byte alignment.

    On that, I wonder if I could persuade you to think in terms of the
    number of bits. AISI there are two ways one can specify alignment: as a
    power-of-two number of bytes that the address has to be a multiple of,
    or as the number of zero bits on the RHS of the address.

    When specifying constants it's easier to begin with the number of bits
    and convert from that. Consider the opposite. Given

    constant ALIGN_BYTES = 8

    there are these two ways one might convert that to alignment bits.

    constant ALIGN_BITS = Log2RoundedUp(ALIGN_BYTES)
    constant ALIGN_BITS = Log2ButErrorIfNotPowerOfTwo(ALIGN_BYTES)

    IOW (1) there are two possible interpretations of the conversion and,
    perhaps worse, (2) either would need a special function to implement it.

    By contrast, if we begin with alignment bits then there's a standard
    conversion which needs no special function.

    constant ALIGN_BYTES = 1 << ALIGN_BITS

    Hence I prefer to use bit alignment (number of zero bits on RHS) as the
    base constant. Other constants and values can easily be derived from there.
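
    To make that concrete, here is a sketch in C of what the two special
    functions would look like, assuming power-of-two alignments:

        #include <assert.h>

        /* Log2RoundedUp: smallest n such that (1u << n) >= bytes */
        static unsigned log2_rounded_up(unsigned bytes) {
            unsigned n = 0;
            while ((1u << n) < bytes)
                n++;
            return n;
        }

        /* Log2ButErrorIfNotPowerOfTwo: exact log2, rejecting non-powers */
        static unsigned log2_exact(unsigned bytes) {
            assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
            return log2_rounded_up(bytes);
        }

        int main(void) {
            assert(log2_rounded_up(8) == 3 && log2_exact(8) == 3);
            assert((1u << 3) == 8);   /* the reverse needs no helper */
            return 0;
        }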

    ...

      int 32 alignbits 3: y

    ...

    (On the other hand, too much punctuation makes code harder to read and
    write. As with most things, you want a happy medium.)

    Yes. I could require

    int(32)<alignbits 3> y

    which would perhaps be more convenient for the compiler (and more
    familiar for a new reader) but in the long term I suspect it would be
    more work for a human to write and read.

    ...

    Remember that metaclasses are not for the "average" programmer. They
    are for the library builders and the language builders. A good
    proportion of modern C++ features are never seen or used by the majority
    of programmers, but they are used underneath to implement the features
    that /are/ used. Few C++ programmers really understand rvalue
    references and move semantics, but they are happy to see that the
    standard library container classes are now more efficient - without
    caring about the underlying language changes that allow those efficiency gains. Probably something like 99% of Python programmers have never
    even heard of metaclasses, yet they use libraries that make use of them.

    When it comes to language design I have a problem with the conceptual
    division of programmers into average and expert. The issue is that it
    assumes that each can write their own programs and the two don't mix. In reality, code is about communication. I see a programming language as a
    lingua franca. Part of its value is that everyone can understand it. If
    the 'experts' start writing code which the rest of the world cannot
    decipher then a significant part of that value is lost.

    Hence, AISI, it's better for a language to avoid special features for
    experts, if possible.

    ...

    it will die out because no one uses it. (There are only two kinds of programming languages - the ones that people complain about, and the
    ones no one uses.)

    :-)


    --
    James Harris

  • From David Brown@21:1/5 to James Harris on Sun Nov 28 14:46:22 2021
    On 28/11/2021 10:33, James Harris wrote:
    On 27/11/2021 15:17, David Brown wrote:
    On 26/11/2021 19:41, James Harris wrote:
    On 21/08/2021 19:11, David Brown wrote:

    ...



    So for me, your "alignbits 3" is just wrong - it makes no sense.  You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment.  Obviously I can figure out what you meant - there
    really isn't any other possibility for "alignbits 3".  But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned
    declaration, not one with 256-byte alignment.

    On that, I wonder if I could persuade you to think in terms of the
    number of bits. AISI there are two ways one can specify alignment: a
    power of two number of bytes that the alignment has to be a multiple of
    and the number of zero bits on the RHS.

    Those are equivalent in terms of the actual implementation, but not in
    the way a programmer is likely to think (or want to think). The whole
    point of a programming language above the level of assembly is that the programmer doesn't think in terms of underlying representations in bits
    and bytes, but at a higher level, in terms of values and the meanings of
    the values. If I write "int * p = &x;", I think of "p" as a pointer to
    the variable "x". I don't think about whether it is 64-bit or 32-bit,
    or whether it is an absolute address or relative to a base pointer, or
    how it is translated via page tables. Considering the number of zero
    bits in the representation of the address is at a completely different
    level of abstraction from what I would see as relevant in a programming language.


    When specifying constants it's easier to begin with the number of bits
    and convert from that. Consider the opposite. Given

      constant ALIGN_BYTES = 8

    there are these two ways one might convert that to alignment bits.

      constant ALIGN_BITS = Log2RoundedUp(ALIGN_BYTES)
      constant ALIGN_BITS = Log2ButErrorIfNotPowerOfTwo(ALIGN_BYTES)

    IOW (1) there are two possible interpretations of the conversion and,
    perhaps worse, (2) either would need a special function to implement it.


    This is all completely trivial to implement in your
    compiler/interpreter. Users are not interested in the number of zero
    bits in addresses - and they are not interested in the effort it takes
    to implement a feature. If you want a programming language that is more
    than a toy, a learning experiment, or a one-man show, then you must
    prioritise the effort of the user by many orders of magnitude over the convenience of the implementer.

    By contrast, if we begin with alignment bits then there's a standard conversion which needs no special function.

      constant ALIGN_BYTES = 1 << ALIGN_BITS

    Hence I prefer to use bit alignment (number of zero bits on RHS) as the
    base constant. Other constants and values can easily be derived from there.

    ...

       int 32 alignbits 3: y

    ...

    (On the other hand, too much punctuation makes code harder to read and
    write.  As with most things, you want a happy medium.)

    Yes. I could require

      int(32)<alignbits 3> y

    which would perhaps be more convenient for the compiler (and more
    familiar for a new reader) but in the long term I suspect it would be
    more work for a human to write and read.

    ...

    Remember that metaclasses are not for the "average" programmer.  They
    are for the library builders and the language builders.  A good
    proportion of modern C++ features are never seen or used by the majority
    of programmers, but they are used underneath to implement the features
    that /are/ used.  Few C++ programmers really understand rvalue
    references and move semantics, but they are happy to see that the
    standard library container classes are now more efficient - without
    caring about the underlying language changes that allow those efficiency
    gains.  Probably something like 99% of Python programmers have never
    even heard of metaclasses, yet they use libraries that make use of them.

    When it comes to language design I have a problem with the conceptual division of programmers into average and expert. The issue is that it
    assumes that each can write their own programs and the two don't mix.

    That would be an incorrect assumption.

    Prioritise readability over writeability - you write a piece of code
    once, but it can be read many times. It is entirely to be expected that
    there is code that people will read and understand, but know they could
    not have written it themselves.

    There is always going to be a huge spread between beginners (including
    those that never get beyond beginner stages no matter how long they
    spend), average and expert programmers. This is perhaps an unusual
    aspect of programming as a profession and hobby. Imagine there were
    such a spread amongst professional football ("soccer", for those living
    in the ex-colonies) players. On the same team as Maradona you'd have
    someone who insists on picking up and carrying the ball, since it
    clearly works, and someone who could be outrun by an asthmatic snail.

    Unless you are designing a language to compete with Bart for the record
    of fewest users, expect this difference in competences. Embrace it and
    make use of it, rather than futilely fighting it.

    In reality, code is about communication. I see a programming language as a
    lingua franca. Part of its value is that everyone can understand it. If
    the 'experts' start writing code which the rest of the world cannot
    decipher then a significant part of that value is lost.

    Hence, AISI, it's better for a language to avoid special features for experts, if possible.

    ...

    it will die out because no one uses it.  (There are only two kinds of
    programming languages - the ones that people complain about, and the
    ones no one uses.)

    :-)



  • From David Brown@21:1/5 to Bart on Sun Nov 28 14:24:38 2021
    On 27/11/2021 19:55, Bart wrote:
    On 27/11/2021 15:17, David Brown wrote:
    On 26/11/2021 19:41, James Harris wrote:

    That's not wholly true. Specific terms and syntax are not yet decided
    but I do have the concept of qualifiers. For example,

       int 32 x
       int 32 alignbits 3 y

    In that, y would be required to be aligned such that the bottom 3 bits
    of its address were zero.


    Before you try to think out a syntax here, ask yourself /why/ someone
    would want this feature.  What use is it?  What are the circumstances
    when you might need non-standard alignment?  What are the consequences
    of it?  If this is something that is only very rarely useful (and I
    believe that is the case here - but /you/ have to figure that out for
    your language), there is no point going out of your way to make it easy
    to write.  Common things should be easy to write - rare things can be
    hard to write.

    Yet C breaks that rule all the time. Just today I needed to type:

       unsigned char            to mean byte or u8
       unsigned long lont int   to mean u64 [typo left in]
       printf("....\n", ...)    to mean println ...

    If you needed to type those to mean something else, that is /purely/ a
    problem with the programmer, not with the language. The language and
    its standard libraries provide everything you need in order to write
    these things in the way you want. As long as the language makes that practically possible (and in the first two examples at least, extremely simple), the language does all it needs.

    No language will /ever/ mean you can trivially write all the code you
    want to write! Obviously when you are doing a one-man language with one designer, one user, and one type of code, and are happy to modify the
    language to suit the program you are writing, you can come quite close.
    But for real languages developed and implemented by large numbers of
    people and used by huge numbers of people, that does not happen.

    As so often happens, in your manic obsession to rail against C, you
    completely missed the point. Oh, and you also missed that in this
    newsgroup I have repeatedly said that people should study popular and successful languages like C and C++ (and others) in order to learn from
    them and take inspiration from them, and to aim to make something
    /better/ for their particular purposes and requirements.


    Yes you could use uint8_t and uint64_t, but that still needs:

       #include <stdint.h>

    to be remembered and added at the top of every module


    Do you have any idea how pathetic and childish that sounds? Presumably
    not, or you wouldn't have written it. So let me inform you, again, that continually whining and crying about totally insignificant
    inconveniences does nothing to help your "down with C" campaign.


    So for me, your "alignbits 3" is just wrong - it makes no sense.  You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment.  Obviously I can figure out what you meant - there
    really isn't any other possibility for "alignbits 3".  But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned
    declaration, not one with 256-byte alignment.

    I don't get it either, but I guess you're not complaining about having
    a way to control the alignment of a type, just that this one is not
    intuitive?


    Yes. There are occasions when controlling alignment can be important,
    but they are really quite rare in practice. The most common cases I see
    in my line of work are use of "packed" structures to go lower than
    standard alignments, and the majority (but not all) of such cases I see
    are counter-productive and a really bad idea. On bigger systems,
    picking higher alignments can sometimes be helpful for controlling the efficiency of caching.
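
    Both cases might be sketched like this in C11 (the packed attribute
    is a GCC/Clang extension, and the 64-byte cache line is an assumption,
    not a universal constant):

        #include <stdalign.h>
        #include <stdint.h>

        /* Lower than standard alignment: a packed layout, e.g. for a
           wire format; the length field becomes misaligned on most ABIs. */
        struct __attribute__((packed)) wire_header {
            uint8_t  type;
            uint32_t length;
        };

        /* Higher than standard alignment: one counter per cache line,
           to avoid false sharing between threads. */
        struct counters {
            alignas(64) uint64_t hits;
            alignas(64) uint64_t misses;
        };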

    So being able to control alignment is a good feature of a relatively
    low-level language. But it is something that you only need on rare
    occasions, so it doesn't have to be a simple or convenient thing to
    write and it /does/ have to be something that is obvious to understand
    when you see it written. "alignbits 3" does not qualify, IMHO.

    In my assembler I use:

        align N

    to force alignment of next data/code byte at a multiple of N bytes,
    usually a power-of-two.


    That's a perfectly reasonable choice - matching pretty much every
    assembler I've ever used (barring details such as "align" vs. ".align").

    My HLL doesn't have that, except that I once used @@ N to control the
    alignment of record fields (now I use a $caligned attribute for the
    whole record, as that was the only use for @@, to emulate C struct layout).


    Using "@@" for the purpose is an example of cryptic syntax that is
    unhelpful for a feature that is rarely needed. A word such as a
    "caligned" attribute will be clearer when people read the code.


    The additional colon would make parsing by compiler and by human easier.
    I have omitted it up until now as I could imagine that programmers would
    not want to have to type it in simple declarations such as

       int: i

    but maybe that doesn't look too bad.


    I only know one person who regularly complains about having to use
    punctuation and finds it inconvenient to type symbols.  But even he uses
    punctuation at times in his languages.

    Shifted punctuation is worse.


    So you'd rather write "x - -y" than "x + y", because it avoids the shift
    key? That seems like a somewhat questionable choice of priorities.

    (On the other hand, too much punctuation makes code harder to read and
    write.  As with most things, you want a happy medium.)


    My intention here is to encourage you to think bigger.  Stop thinking
    "how do I make integer types?" - think wider and with greater generality
    and ambition.  Make a good general, flexible system of types, and then
    let your integer types fall naturally out of that.

    The goal is that range specifications would apply anywhere relevant,
    not just for integers. For example,

       array (5..15) floati32: v

    would declare an array of between 5 and 15 elements of type floati32.

    (No. That's just not what anyone would guess that to mean. It looks like
    an array of length 11 indexed from 5 to 15 inclusive.

    It's not clear what the purpose of this is, or what a compiler is
    supposed to do with that info.)

    Unless a language was just a simple, limited scripting tool, I would not
    bother making (or learning) a new language that did not have features
    such as variants or generics (noting that these terms are vague and mean
    different things to different people).  There are perfectly good
    languages without such features.  Given the vast benefits of C in terms
    of existing implementation, code, experience and information, why would
    anyone bother with a different compiled language unless it let them do
    things that you cannot easily do in C?  Being able to make your own
    types, with their rules, invariants, methods, operators, etc., is pretty
    much a basic level feature for modern languages.  Generic programming is
    standard.  I would no longer consider these as advanced or complex
    features of a modern language, I'd consider them foundational.

    That still leaves a big gap between C, and a language with all those
    advanced features, which probably cannot offer the benefits of small footprint, transparency, and the potential for a fast build process.

    Plus there are plenty of things at the level of C that some people (me,
    for a start) want but it cannot offer:

        * An alternative to that god-forsaken, error prone syntax
        * Freedom from case-sensitivity
        * 1-based arrays!
        * An ACTUAL byte/u8 type without all the behind-the-scenes
          nonsense, and the need for stdint/inttypes etc
        * 64-bit integer types as standard
        * A grown-up Print feature
        * etc etc

    What /are/ the actual alternatives available as the next C replacement;
    Rust and Zig? You're welcome to them!


    If there were a significant number of people who wanted a language with
    these features, there would be one.

  • From James Harris@21:1/5 to David Brown on Sun Nov 28 14:54:04 2021
    On 28/11/2021 13:46, David Brown wrote:
    On 28/11/2021 10:33, James Harris wrote:
    On 27/11/2021 15:17, David Brown wrote:


    So for me, your "alignbits 3" is just wrong - it makes no sense.  You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment.  Obviously I can figure out what you meant - there
    really isn't any other possibility for "alignbits 3".  But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned
    declaration, not one with 256-byte alignment.

    BTW, you may be assuming octet addressing. That's not always the case.


    On that, I wonder if I could persuade you to think in terms of the
    number of bits. AISI there are two ways one can specify alignment: a
    power of two number of bytes that the alignment has to be a multiple of
    and the number of zero bits on the RHS.

    Those are equivalent in terms of the actual implementation, but not in
    the way a programmer is likely to think (or want to think). The whole
    point of a programming language above the level of assembly is that the programmer doesn't think in terms of underlying representations in bits
    and bytes, but at a higher level, in terms of values and the meanings of
    the values.

    That's all very well but if you are thinking about alignment of values
    or structures you are already working at a low level.

    Further, say you had your alignment in the way you prefer, i.e. as a
    number of bytes such as 8. What would you write if you wanted to apply a commensurate shift? To get from 8 to 3 you'd need some sort of
    log-base-2 function of the type I showed earlier. Which would you want a language to provide?

    All in all, I put it to you that going from 3 to 8 is easier. :-)


    If I write "int * p = &x;", I think of "p" as a pointer to
    the variable "x". I don't think about whether it is 64-bit or 32-bit,
    or whether it is an absolute address or relative to a base pointer, or
    how it is translated via page tables. Considering the number of zero
    bits in the representation of the address is at a completely different
    level of abstraction from what I would see as relevant in a programming language.

    Well, if an address is guaranteed to be at a certain alignment then asm programmers may store flags in the lower bits. AFAICS that's not too
    easy to do in C so C programmers don't think in those terms - perhaps
    putting the flags in a separate integer on their own. But there can be
    value in using such bits and some hardware structures already include
    them. Specifying an address as aligned can make low bits available.
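
    A sketch of that trick in C, round-tripping through uintptr_t (legal
    but implementation-defined, which is part of why it is rarely done):

        #include <assert.h>
        #include <stdalign.h>
        #include <stdint.h>

        /* With 8-byte alignment the bottom 3 address bits are zero, so
           they can carry a small flag. */
        static void *tag_ptr(void *p, unsigned flag) {
            assert(flag < 8);   /* must fit in the 3 spare bits */
            return (void *)((uintptr_t)p | flag);
        }

        static void *untag_ptr(void *p) {
            return (void *)((uintptr_t)p & ~(uintptr_t)7);
        }

        int main(void) {
            alignas(8) int x = 42;
            void *t = tag_ptr(&x, 5);
            assert(((uintptr_t)t & 7) == 5);
            assert(*(int *)untag_ptr(t) == 42);
            return 0;
        }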

    At the end of the day, confining layout details to /declarations/ means
    that the body of an algorithm can still work with the elements which
    make sense for the task in hand, leaving the compiler to handle the
    details of accessing them.




    When specifying constants it's easier to begin with the number of bits
    and convert from that. Consider the opposite. Given

      constant ALIGN_BYTES = 8

    there are these two ways one might convert that to alignment bits.

      constant ALIGN_BITS = Log2RoundedUp(ALIGN_BYTES)
      constant ALIGN_BITS = Log2ButErrorIfNotPowerOfTwo(ALIGN_BYTES)

    IOW (1) there are two possible interpretations of the conversion and,
    perhaps worse, (2) either would need a special function to implement it.


    This is all completely trivial to implement in your
    compiler/interpreter. Users are not interested in the number of zero
    bits in addresses - and they are not interested in the effort it takes
    to implement a feature. If you want a programming language that is more
    than a toy, a learning experiment, or a one-man show, then you must prioritise the effort of the user by many orders of magnitude over the convenience of the implementer.

    I was talking about the facilities being made available to programmers,
    not just those I would use internally!


    --
    James Harris

  • From Bart@21:1/5 to David Brown on Sun Nov 28 16:26:12 2021
    On 28/11/2021 13:24, David Brown wrote:
    On 27/11/2021 19:55, Bart wrote:

    Yet C breaks that rule all the time. Just today I needed to type:

       unsigned char            to mean byte or u8
       unsigned long lont int   to mean u64 [typo left in]
       printf("....\n", ...)    to mean println ...

    If you needed to type those to mean something else, that is /purely/ a problem with the programmer, not with the language. The language and
    its standard libraries provide everything you need in order to write
    these things in the way you want. As long as the language makes that practically possible (and in the first two examples at least, extremely simple), the language does all it needs.

    No it doesn't, not in a way I'd consider acceptable.

    You don't get a 'byte' type, which was commonly used 40+ years ago but
    is still missing from the world's most popular lower-level language, by insisting users define it themselves!

    Until C99 that wasn't even possible in a reliable manner. Even now, it
    means mucking about with typedefs and special headers and maybe
    conditional code (a typical shim is sketched after this list), and it
    is still full of complications:

    * You still need to interact with other people's code that uses unsigned
    char or uint8_t or sometimes plain char

    * You may need to interact with code from other people who have all
    created their alternate solutions (_byte, Byte, _Byte, ubyte, u8 etc).
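
    The shim in question typically looks something like this (the
    spelling and the guard are illustrative, not from any particular
    project):

        /* byte.h - yet another project-local byte type */
        #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
          #include <stdint.h>
          typedef uint8_t byte;
        #else
          typedef unsigned char byte;   /* pre-C99 fallback */
        #endif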

    To get back to /your/ point, which was:

    "Common things should be easy to write - rare things can be
    hard to write."

    I'm saying that you want to express a u8 type as simply as possible:

    byte x;

    and in a way that is compatible with the u8 type of anyone else who's
    using the same language.

    I managed this in 1981 in my very first language.

    Your comments anyway don't address my printf example. In 1981 I could
    also write:

    print x

    That, you still can't do in C. You can't even do it in C++ (another
    favourite of yours):

    std::cout << x; // or some such thing

    C++ however may provide means to emulate such a feature, which brings us
    back to my comments above.


    No language will /ever/ mean you can trivially write all the code you
    want to write! Obviously when you are doing a one-man language with one designer, one user, and one type of code, and are happy to modify the language to suit the program you are writing, you can come quite close.
    But for real languages developed and implemented by large numbers of
    people and used by huge numbers of people, that does not happen.

    I'm talking about the basics, the things you /commonly/ want to do.


    As so often happens, in your manic obsession to rail against C, you completely missed the point. Oh, and you also missed that in this
    newsgroup I have repeatedly said that people should study popular and successful languages like C and C++ (and others) in order to learn from
    them and take inspiration from them, and to aim to make something
    /better/ for their particular purposes and requirements.

    What do you think /I/ should take away from C?

    I have actually taken some things from it, but it's not many:

    * Use f() for function calls with no arguments, not f
    * Allow f to mean a function pointer, as well as &f
    * Switched to 0xABC for hex literals, not 0ABCH

    Otherwise C has little to teach me about devising system languages.


    Yes you could use uint8_t and uint64_t, but that still needs:

       #include <stdint.h>

    to be remembered and added at the top of every module


    Do you have any idea how pathetic and childish that sounds? Presumably
    not, or you wouldn't have written it. So let me inform you, again, that continually whining and crying about totally insignificant
    inconveniences does nothing to help your "down with C" campaign.

    Do /you/ have any idea how incredibly crass it is to have to explicitly incorporate headers to enable the most fundamental language features?

    Suppose you needed:

    #include <upper.h>   // to allow upper case in identifiers
    #include <lower.h>   // to allow lower case in identifiers

    So every program now needs lower.h. You're working away, and at some
    point it fails because you've used an upper case macro name, and the
    compiler throws an error.

    No problem! Just edit the file to add the include for upper.h at the top.

    Ridiculous, yes? That's exactly how I see most of C's standard headers.
    Just enable the lot by default, and increase the productivity of the
    world's C programmers by a couple of percentage points.


    Shifted punctuation is worse.


    So you'd rather write "x - -y" than "x + y", because it avoids the shift
    key? That seems like a somewhat questionable choice of priorities.

    Some of it is unavoidable. But a lot of it is avoidable.

    for (i = 1; i<=N; ++i) {
        printf("%d %f\n", i, sqrt(i));    // 16 shifted keys
    }

    for i to n do                         # 0 shifted keys
        println i, sqrt i
    od

    In real code, C may use 30-50% more shifted keys than the equivalent in
    my syntax, not counting shifted alphabetics because of mixed or upper
    case (mine is case-insensitive, so I can choose to write 'messagebox',
    not 'MessageBox').

    But I tend to use them most often in temporary debug code, which does
    use lots of prints and loops like my example.


        * An alternative to that god-forsaken, error prone syntax
        * Freedom from case-sensitivity
        * 1-based arrays!
        * An ACTUAL byte/u8 type without all the behind-the-scenes
          nonsense, and the need for stdint/inttypes etc
        * 64-bit integer types as standard
        * A grown-up Print feature
        * etc etc

    What /are/ the actual alternatives available as the next C replacement;
    Rust and Zig? You're welcome to them!


    If there were a significant number of people who wanted a language with
    these features, there would be one.

    There are a few with such features, unfortunately not all in the same
    language! (Ada has the first 3 of my list, but also has an impossible
    type system.)

  • From David Brown@21:1/5 to James Harris on Sun Nov 28 18:18:35 2021
    On 28/11/2021 15:54, James Harris wrote:
    On 28/11/2021 13:46, David Brown wrote:
    On 28/11/2021 10:33, James Harris wrote:
    On 27/11/2021 15:17, David Brown wrote:


    So for me, your "alignbits 3" is just wrong - it makes no sense.  You
    are trying to say it should be aligned with 8-byte alignment, also known
    as 64-bit alignment.  Obviously I can figure out what you meant - there
    really isn't any other possibility for "alignbits 3".  But if you had
    written "alignbits 8", I would take that to mean a packed or unaligned
    declaration, not one with 256-byte alignment.

    BTW, you may be assuming octet addressing. That's not always the case.

    It /is/ the case on any system where your language will be used.
    Non-octet addressing is only used on legacy mainframes for which almost
    no new code is written, and on a few niche devices such as some DSPs.
    Don't kid yourself - the chances of your language being used on any of
    these is not low, it is zero. Trying to add any kind of flexibility or
    support for anything other than 8-bit bytes would be a disservice to
    your potential users.



    On that, I wonder if I could persuade you to think in terms of the
    number of bits. AISI there are two ways one can specify alignment: a
    power of two number of bytes that the alignment has to be a multiple of
    and the number of zero bits on the RHS.

    Those are equivalent in terms of the actual implementation, but not in
    the way a programmer is likely to think (or want to think).  The whole
    point of a programming language above the level of assembly is that the
    programmer doesn't think in terms of underlying representations in bits
    and bytes, but at a higher level, in terms of values and the meanings of
    the values.

    That's all very well but if you are thinking about alignment of values
    or structures you are already working at a low level.


    That is somewhat true. But there is no need to go lower than necessary.

    Further, say you had your alignment in the way you prefer, i.e. as a
    number of bytes such as 8. What would you write if you wanted to apply a commensurate shift? To get from 8 to 3 you'd need some sort of
    log-base-2 function of the type I showed earlier. Which would you want a language to provide?


    Again, I want /you/ to think about what /your/ users will actually need
    and use. When would they need this? Is it really something people will
    need? I believe you are trying to optimise for non-existent use-cases,
    instead of realistic ones. If you believe otherwise, please say so -
    perhaps with examples or justification. (It's your language, you don't
    /have/ to justify your choice of features, but it makes it easier to
    give helpful suggestions.)

    All in all, I put it to you that going from 3 to 8 is easier. :-)


    I agree it is easier to go that way. But since I don't think that is
    something that will often be needed, I don't see its ease as being
    important.

    And of course there is nothing to stop you doing the equivalent of

    #define struct_align_needed 3
    alignas(1 << struct_align_needed) struct S s;

    or whatever. In other words, if you really need to go from 3 to 8, then
    you can happily do that even if your "align" method takes an 8 rather
    than a 3.


    If I write "int * p = &x;", I think of "p" as a pointer to
    the variable "x".  I don't think about whether it is 64-bit or 32-bit,
    or whether it is an absolute address or relative to a base pointer, or
    how it is translated via page tables.  Considering the number of zero
    bits in the representation of the address is at a completely different
    level of abstraction from what I would see as relevant in a programming
    language.

    Well, if an address is guaranteed to be at a certain alignment then asm programmers may store flags in the lower bits. AFAICS that's not too
    easy to do in C so C programmers don't think in those terms - perhaps
    putting the flags in a separate integer on their own. But there can be
    value in using such bits and some hardware structures already include
    them. Specifying an address as aligned can make low bits available.

    It is quite rare that it makes sense to use those extra bits like that.
    And no, it is not particularly hard to do so in C - you just need to be
    a little careful (and it's likely to be somewhat non-portable).


    At the end of the day, confining layout details to /declarations/ means
    that the body of an algorithm can still work with the elements which
    make sense for the task in hand, leaving the compiler to handle the
    details of accessing them.


    Yes - that is why you very rarely need to specify alignment. The
    compiler should know the rules for what makes sense on the platform and
    the ABI in use.




    When specifying constants it's easier to begin with, and convert from, the number of bits. Consider the opposite. Given

       constant ALIGN_BYTES = 8

    there are these two ways one might convert that to alignment bits.

       constant ALIGN_BITS = Log2RoundedUp(ALIGN_BYTES)
       constant ALIGN_BITS = Log2ButErrorIfNotPowerOfTwo(ALIGN_BYTES)

    IOW (1) there are two possible interpretations of the conversion and,
    perhaps worse, (2) either would need a special function to implement it.
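
    To make that concrete, here is roughly what those two functions would
    have to look like (a sketch in C; the function names just mirror the
    ones above):

        #include <assert.h>

        /* Log2RoundedUp: smallest b with (1 << b) >= n, e.g. 8 -> 3, 9 -> 4 */
        static unsigned log2_rounded_up(unsigned n)
        {
            unsigned b = 0;
            while ((1u << b) < n)
                b++;
            return b;
        }

        /* Log2ButErrorIfNotPowerOfTwo: as above, but reject non-powers of two */
        static unsigned log2_exact(unsigned n)
        {
            assert(n != 0 && (n & (n - 1)) == 0);   /* e.g. 5 or 12 trip this */
            unsigned b = 0;
            while ((1u << b) < n)
                b++;
            return b;
        }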

    This is all completely trivial to implement in your
    compiler/interpreter.  Users are not interested in the number of zero
    bits in addresses - and they are not interested in the effort it takes
    to implement a feature.  If you want a programming language that is more
    than a toy, a learning experiment, or a one-man show, then you must
    prioritise the effort of the user by many orders of magnitude over the
    convenience of the implementer.

    I was talking about the facilities being made available to programmers,
    not just those I would use internally!


    Programmers don't need these - it's not something they have to do. And
    if they /do/, then they can do so with :

    constant ALIGN_BITS = 3
    constant ALIGN_BYTES = 1 << ALIGN_BITS

    or maybe:

    constant ALIGN_BYTES = 8
    constant ALIGN_BITS = 3
    static_assert(ALIGN_BYTES == 1 << ALIGN_BITS,
    "Failed alignment sanity check")

    You are inventing non-existent problems here.
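
    In C11 the same check can be written directly (a sketch, for comparison):

        #define ALIGN_BYTES 8
        #define ALIGN_BITS  3

        /* Reject any edit that leaves the two constants inconsistent. */
        _Static_assert(ALIGN_BYTES == (1 << ALIGN_BITS),
                       "Failed alignment sanity check");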

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Sun Nov 28 18:24:33 2021
    On 28/11/2021 17:26, Bart wrote:

    There are a few with such features, unfortunately not all in the same language! (Ada has the first 3 of my list, but also has an impossible
    type system.)


    You really do think the world should revolve around /you/, don't you?
    You probably also write letters to your local newspaper complaining that
    the breakfast cereals you personally prefer are on the top shelf rather
    than at a more convenient height.

    Most people would be very happy to be in the position where the most
    difficult part of their job was having to press the shift key several
    times per day.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Sun Nov 28 19:25:35 2021
    On 28/11/2021 17:24, David Brown wrote:
    On 28/11/2021 17:26, Bart wrote:

    There are a few with such features, unfortunately not all in the same
    language! (Ada has the first 3 of my list, but also has an impossible
    type system.)


    You really do think the world should revolve around /you/, don't you?
    You probably also write letters to your local newspaper complaining that
    the breakfast cereals you personally prefer are on the top shelf rather
    than at a more convenient height.

    Most people would be very happy to be in the position where the most difficult part of their job was having to press the shift key several
    times per day.



    Look: when I first started programming, then these characteristics were
    common:

    * Case insensitive (in code, file system and CLIs)

    * 1-based indexing, with A[i,j] for 2D accesses

    * Keyword-based block delimiters (do...end, not {...})

    * Proper Read A, B, C and Print A, B, C features ...

    * ... and line-based processing of text files

    * Linear, left-to-right type specifiers

    I liked those, they worked well, and I incorporated them into my own
    stuff (I used N-based indexing, which defaulted to 1-based)

    But I didn't think them remarkable, until years later when the
    combination of Unix+C started to take over the world, and I first came
    across the alternatives that that combo was trying to inflict on
    everyone else:

    * Case sensitive (in code, file system and CLIs)

    * 0-based indexing, with A[i][j] for 2D accesses

    * Brace-based delimiters for everything (all statements, all data)

    * Off-language, library-based I/O with 'format specifiers' ...

    * ... and character-based processing of text files

    * For C, convoluted inside-out type specifiers that even the designers
    admitted were a mistake (with everyone else pretending they were a
    good idea)

    In every case, for reasons I won't go into here, I found those inferior.

    Why SHOULDN'T I be allowed to have my own preferences, and why SHOULDN'T
    I complain when those have been marginalised in favour of inferior
    practices?

    Getting back to what this is about, which was your suggestion that C is
    so perfect, it is pointless to create something new unless it comes with
    a raft of advanced, heavy features, then why SHOULDN'T there be an
    alternative systems language with its OWN set of characteristics?

    Yes, C has pretty much won the war for ubiquitous systems language,
    although I don't remember there being any viable /mainstream/ contenders.

    It doesn't mean it's great; it means it's what most are stuck with.

    My own is a private language created years before I encountered C, and
    now it does have plenty of significant features that are not in C.

    (Like: proper value-array types, modules, keyword/optional parameters,
    proper for-loops, 64-bit default types, a true 'char' type, an actual
    BYTE type, strinclude, tabledata, proper switch...

    All ones I used every single day.)

    Why would I use C when I can use mine? Why shouldn't anyone be able to
    do the same?

    You of course just want to be patronising and insist any such project
    can only ever be a toy that someone does for fun until they come to
    their senses.

    This group is also about discussing aspects of language design. If you
    want to talk about some of your own ideas, regardless of whether you're
    going to implement them, then that would be great.

    However you seem intent on trashing everyone's ideas and aspirations.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andy Walker@21:1/5 to David Brown on Sun Nov 28 21:27:17 2021
    On 28/11/2021 13:46, David Brown wrote:
    There is always going to be a huge spread between beginners (including
    those that never get beyond beginner stages no matter how long they
    spend), average and expert programmers. This is perhaps an unusual
    aspect of programming as a profession and hobby. Imagine there were
    such a spread amongst professional football ("soccer", for those living
    in the ex-colonies) players. On the same team as Maradona you'd have someone who insists on picking up and carrying the ball, since it
    clearly works, and someone who could be outrun by an asthmatic snail.

    Hm. I'm not sure this is the best analogy you've ever
    produced! Every sport and hobby has beginners and experts;
    and it is at least as unusual in programming as in most other
    spheres to find world-class experts and novices in the same
    team [to the extent to which programming is even a team game!].

    Somewhat OTOH, in two of my principal interests, it is
    not unusual for rank amateurs to encounter top players. A few
    years back, I found myself playing against one of the world's
    top chess grandmasters; I did at least last longer than our
    board 2, whose opponent was a mere international master. We
    have had GMs and IMs playing in our local league. In music,
    it is quite common for student/amateur orchestras and other
    ensembles to engage top pianists/violinists/singers to play
    concertos and perhaps give lessons.

    In a third main interest, a friend who was organising
    a lower-league local cricket match was rather surprised to be
    contacted by the manager of the New Zealand tourists: "I've
    been told you have a match this afternoon?" "Yes." "Well,
    we have [top Test player] recovering from injury, and we'd be
    extremely grateful if he could play for you." "Um, you do
    realise we're not very good?" "That's fine, he just needs the
    exercise in a real match."  So the top Test player did play, and apparently he
    was a really great guy, very friendly, joined in at the bar,
    and gave lots of top tips.

    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Ravel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Sun Nov 28 22:05:23 2021
    On 28/11/2021 13:46, David Brown wrote:
    On 28/11/2021 10:33, James Harris wrote:

    Unless you are designing a language to compete with Bart for the record
    of fewest users,

    The smallest number would be zero; there must be 1000s of dead languages used
    by nobody.

    And there must also be a number of personal or in-house or just rare
    instances of languages that only happen to be used by one person.

    That doesn't mean they are worthless, or not any good.

    Some of them may be implemented on top of more mainstream languages and
    tools, so they are not completely insular.

    I dare say you've also chosen and configured your tools so that
    you're effectively working with a personal dialect of C or whatever;
    some of us just take it a bit further.

    Having a language become popular and in widespread use is simply not one
    of my aims and never has been. I know how fanatical people are about
    languages, and I'm not interested in persuading anyone to switch.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Nov 29 10:05:06 2021
    On 28/11/2021 20:25, Bart wrote:
    On 28/11/2021 17:24, David Brown wrote:
    On 28/11/2021 17:26, Bart wrote:

    There are a few with such features, unfortunately not all in the same
    language! (Ada has the first 3 of my list, but also has an impossible
    type system.)


    You really do think the world should revolve around /you/, don't you?
    You probably also write letters to your local newspaper complaining that
    the breakfast cereals you personally prefer are on the top shelf rather
    than at a more convenient height.

    Most people would be very happy to be in the position where the most
    difficult part of their job was having to press the shift key several
    times per day.



    Look: when I first started programming, then these characteristics were common:

    Why should anyone care?

    When I started watching TV, there were three channels, and most TVs
    were black and white and about as deep as they were high. Does that
    mean I think everyone should go back to such boxes? Do I moan and whine
    that manufacturers are breaking some magical unwritten contract because
    now you can't put a pot-plant on top of the TV?

    When I started programming, I used BASIC. It was a great language for a
    8-year old kid to learn programming. It was a shite language for
    serious code and useful programs, and is totally unsuitable for anything
    but a glorified "Word" macro in comparison to today's languages and
    tools. I learned from it, and moved on.

    When I started assembly programming, I had to hand-assemble everything
    into hex using tables of opcodes. I debugged using the sound of the
    power supply on my Spectrum. I learned a lot from those days too, and
    moved on.

    Do I think anyone here cares what /I/ used when /I/ learned to program?
    No. Why should someone making new languages today care what /you/ used?

    The past is gone. We can learn from it - look at what worked back then,
    and what did not work. Look at what people kept, look at what changed.
    Look at what concepts remained constant, and what has not. Look at
    which fashions came and went, look at which went away then came back.
    Ask why.

    But only a fool would want to go back to the past.


      * Case insensitive (in code, file system and CLIs)

    That stems from a time when computers had six bits for a character
    because 8 bits would cost too much, and people used teletype instead of
    screens and keyboards. If you have trouble getting your cases right,
    you are in the wrong job.


      * 1-based indexing, with A[i,j] for 2D accesses

    1-based counting is good for everyday counting, not for programming.
    You want to program? Learn some maths. (For higher level languages,
    arrays that are indexable by different ranges, types or tuples is good.)


      * Keyword-based block delimiters (do...end, not {...})

    That comes from a time when keyboards with symbols such as { } were
    considered advanced and modern. (Hence those monstrosities, the
    trigraph and digraph.) Oh, I forgot - you find it such an effort to
    press the "shift" key on your keyboard.


      * Proper Read A, B, C and Print A, B, C features ...

    What a pointless and meaningless statement. There are a hundred and one different ways to do "proper" read and print, with everyone having their
    own ideas about what is best. Most people, of course, realise that
    programming languages are designed for more than one programmer and thus
    such features are invariably a compromise.

    Why SHOULDN'T I be allowed to have my own preferences, and why SHOULDN'T
    I complain when those have been marginalised in favour of inferior
    practices?

    Because you are wrong.

    And you are a margin of /one/, because you believe that languages should
    follow exactly what /you/ want in all aspects, with a total disregard
    for anyone else.

    Because - and I really can't emphasise this enough - we've heard it all
    before. Many, /many/ times. Endlessly, repeatedly. You think the
    gates of hell opened up the day C was conceived and the first release of
    Unix was the start of Ragnarök. We know. Get over yourself.


    Yes, you can have your opinion. Yes, you can make your own language the
    way /you/ want it. Yes, you can give suggestions and ideas based on
    these in a discussion about languages. No, you can't tell people that
    you alone are right, and the rest of the world is wrong, and expect to
    be taken seriously.



    Getting back to what this is about, which was your suggestion that C is
    so perfect, it is pointless to create something new unless it comes with
    a raft of advanced, heavy features, then why SHOULDN'T there be an alternative systems language with its OWN set of characteristics?


    You consistently demonstrate that you have no clue as to what threads
    here are about. You have such a fanatic and unreasoned loathing of C
    that you are unable to understand what people write - you make totally unwarranted assumptions and then fly off the handle to attack the
    mirages of your mind.

    I /could/ explain what I had written earlier. But what would be the
    point? It would just repeat the same things I wrote before. You didn't
    read them then, why should I think you'll read them now?

    This group is also about discussing aspects of language design. If you
    want to talk about some of your own ideas, regardless of whether you're
    going to implement them, then that would be great.

    However you seem intent on trashing everyone's ideas and aspirations.

    I /have/ been discussing ideas and suggestions here. Some have been of interest to James, others not - which is fine. And I write comments
    aimed at making him (or anyone else interested in languages) think about
    how the language might be used - because that's what's important. I am
    not the one who thinks every thread is an opportunity for a new anti-C rant.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Andy Walker on Mon Nov 29 12:09:34 2021
    On 28/11/2021 22:27, Andy Walker wrote:
    On 28/11/2021 13:46, David Brown wrote:
    There is always going to be a huge spread between beginners (including
    those that never get beyond beginner stages no matter how long they
    spend), average and expert programmers.  This is perhaps an unusual
    aspect of programming as a profession and hobby.  Imagine there were
    such a spread amongst professional football ("soccer", for those living
    in the ex-colonies) players.  On the same team as Maradona you'd have
    someone who insists on picking up and carrying the ball, since it
    clearly works, and someone who could be outrun by an asthmatic snail.

        Hm.  I'm not sure this is the best analogy you've ever
    produced!  Every sport and hobby has beginners and experts;
    and it is at least as unusual in programming as in most other
    spheres to find world-class experts and novices in the same
    team [to the extent to which programming is even a team game!].


    The difference, I think, is that in programming you /do/ get a wide
    range even within teams. And you certainly get a very wide range of
    people working as programmers in different places.

    In particular, it is not just a mix of amateurs and professionals - as
    your experience shows, you can get that in many fields. In programming,
    you get people making a living as programmers despite being completely incompetent. And even amongst people who do a reasonable job, you can
    get an order of magnitude difference in productivity.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to David Brown on Mon Nov 29 12:40:50 2021
    On 2021-11-29 12:09, David Brown wrote:

    In programming,
    you get people making a living as programmers despite being completely incompetent.

    Reminds me of politicians, pop musicians, journalists, economists, environmentalists... (put quotation marks as appropriate)

    And even amongst people who do a reasonable job, you can
    get an order of magnitude difference in productivity.

    That is the 80/20 law.

    But I agree with you, incompetence is strong with programmers...

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Nov 29 13:06:04 2021
    On 29/11/2021 09:05, David Brown wrote:
    On 28/11/2021 20:25, Bart wrote:

    Why should anyone care?

    When I started watching TV, there were three channels, and most TVs
    were black and white and about as deep as they were high.

    In 1962 we had a TV with only 1 channel, in black and white and with 405
    lines.

    In the same year, Lawrence of Arabia was released in cinemas in 70mm
    format (somewhat beyond 4K quality). The screen was pretty flat too!

    Not sure how such things are relevant; my remarks are about
    characteristics of programming languages, not hardware.

      * Case insensitive (in code, file system and CLIs)

    That stems from a time when computers had six bits for a character
    because 8 bits would cost too much, and people used teletype instead of screens and keyboards.
    All languages and all OSes were using the same hardware. Yet Unix+C went
    for case-sensitivity, others made a different choice.

    A /choice/. That doesn't make it right and the others wrong.

    I used machines with both upper and lower case capability from 1982; I
    still prefered case-insensitivity because it was generally better and
    more user-friendly.

    If you have trouble getting your cases right,
    you are in the wrong job.

    If you have trouble thinking up distinct identifiers in examples like this:

    Abc abc = ABC;

    then /you're/ in the wrong job!


      * 1-based indexing, with A[i,j] for 2D accesses

    1-based counting is good for everyday counting, not for programming.

    Bollocks. In any case, you snipped my remark that I implement N-based
    arrays, so that I can use 0-based /as needed/, and have always done.

    You haven't explained why A[i][j] is better than A[i,j].

      * Keyword-based block delimiters (do...end, not {...})

    That comes from a time when keyboards with symbols such as { } were

    So you see it as progress that {,} with their innumerable issues were introduced. Because this:

      } else {

    or:

      }
      else {

    or:

      } else
      {

    or:

      }
      else
      }        # (oops!)

    etc. is SO much better than just:

      else

    You must be delusional.


    considered advanced and modern. (Hence those monstrosities, the
    trigraph and digraph.) Oh, I forgot - you find it such an effort to
    press the "shift" key on your keyboard.


      * Proper Read A, B, C and Print A, B, C features ...

    What a pointless and meaningless statement. There are a hundred and one different ways to do "proper" read and print, with everyone having their
    own ideas about what is best.

    This is just pure jealousy. Show me the C code needed to do the
    equivalent of this (without knowing the types of a, b, c other than they
    are numeric):

       print "?"
       readln a, b, c
       println a, b, c

    Here the language provides informal line-based i/o, as might be useful
    for interactive programs, or reading/writing files, while still allowing
    more precise control as needed.

    Because you are wrong.

    And you are a margin of /one/, because you believe that languages should follow exactly what /you/ want in all aspects, with a total disregard
    for anyone else.

    What exactly are the choices for someone in 2021 who wants to use (or is required to use) a language like C, but favours even one of my
    characteristics?

    Yes, you can have your opinion. Yes, you can make your own language the
    way /you/ want it. Yes, you can give suggestions and ideas based on
    these in a discussion about languages. No, you can't tell people that
    you alone are right, and the rest of the world is wrong, and expect to
    be taken seriously.

    But YOU are allowed to say that:

    * Case-insensitivity is wrong
    * 1-based is wrong
    * A[i,j] is wrong
    * Anything other than {...} blocks is wrong
    * Easy read/print statements in a language are wrong
    * Line-based i/o is wrong
    * Left-to-right type syntax is wrong. (Did you say that, or decided not
    to mention that one?!)

    All things that C doesn't have.

    Actually, Ada and Fortran are still around, are case-insensitive, are
    N-based, and don't use brace syntax.

    Lua doesn't use braces for blocks. It is also 1-based.

    Also 1-based are Julia, Mathematica, and Matlab ("Learn some maths"? Sure!)

    Julia doesn't use braces either.

    These are all characteristics that still exist across languages, but not necessarily within one systems language, which is where I need them.




    Getting back to what this is about, which was your suggestion that C is
    so perfect, it is pointless to create something new unless it comes with
    a raft of advanced, heavy features, then why SHOULDN'T there be an
    alternative systems language with its OWN set of characteristics?


    I /could/ explain what I had written earlier. But what would be the
    point? It would just repeat the same things I wrote before. You didn't
    read them then, why should I think you'll read them now?

    I'll repeat what you said:

    "Given the vast benefits of C in terms
    of existing implementation, code, experience and information, why would
    anyone bother with a different compiled language unless it let them do
    things that you cannot easily do in C?"

    You are clearly saying, don't bother creating an alternative to C unless
    it actually does something different.

    I disagreed: you CAN have an alternative that, while it does the same
    things, can achieve that differently.

    I listed some things out of many dozens. You of course will disagree
    with every one of them.

    Doesn't matter what it is:

     C does X            C's way is perfect
     Bart does Y         Bart is WRONG, and in a minority of one

    Even when I show that other languages, old or new, also do Y. Or when I
    give an example of Y clearly being better than X.

    My dislike of C is rational. Your loyalty to it, and hatred of anyone
    who dares to badmouth it, is irrational.
    I /have/ been discussing ideas and suggestions here. Some have been of interest to James, others not - which is fine. And I write comments
    aimed at making him (or anyone else interested in languages) think about
    how the language might be used - because that's what's important. I am
    not the one who thinks every thread is an opportunity for a new anti-C rant.

    No. Your message is 'Just don't bother trying to rewrite C', presumably
    because it is perfect.

    C is still VERY widely used, but you don't believe in an alternative.
    You want people to continue driving a Model T, unless the new car can
    also fly!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Mon Nov 29 16:19:48 2021
    On 29/11/2021 14:06, Bart wrote:
    On 29/11/2021 09:05, David Brown wrote:
    On 28/11/2021 20:25, Bart wrote:

    Why should anyone care?

    When I started watching TV, there were three channels, and most TVs
    were black and white and about as deep as they were high.

    In 1962 we had a TV with only 1 channel, in black and white and with 405 lines.

    In the same year, Lawrence of Arabia was released in cinemas in 70mm
    format (somewhat beyond 4K quality). The screen was pretty flat too!

    Not sure how such things are relevant; my remarks are about
    characteristics of programming languages, not hardware.

       * Case insensitive (in code, file system and CLIs)

    That stems from a time when computers had six bits for a character
    because 8 bits would cost too much, and people used teletype instead of
    screens and keyboards.
    All languages and all OSes were using the same hardware. Yet Unix+C went
    for case-sensitivity, others made a different choice.

    A /choice/. That doesn't make it right and the others wrong.


    Most programming languages in use today are case-sensitive. Those that
    are not are mostly leftovers from the days when computers SHOUTED at you because they didn't support lower case letters.

    Most filesystems in use today are case-sensitive. Those that are not
    are mostly leftovers from those same days. Even NTFS on Windows is a
    fully case-sensitive filesystem, and can happily support "readme.txt"
    and "Readme.txt" as different files in the same directory. The OS has a
    layer in its API to make the filesystem appear case-preserving but case-insensitive.

    Case insensitive doesn't work when you go beyond the UK/US alphabet.
    The complications for various languages are immense. In German, the
    letter ß traditionally capitalises as SS - one letter turns into two.
    In Turkish, "i" and "I" are two completely different letters, with their opposite cases being "İ" and "ı". It quickly becomes ridiculous when
    you need to support multiple languages. On the other hand,
    case-sensitive naming is usually just done as binary comparison.

    So unless you think that everyone should be forced to write a limited
    form of UK or US English and that ASCII is good enough for everyone, case-sensitive is the only sane choice for file systems.


    You can reasonably argue that the majority choice is not necessarily
    right. But you have a much harder time trying to argue that an outdated minority choice is right.


    I used machines with both upper and lower case capability from 1982; I
    still prefered case-insensitivity because it was generally better and
    more user-friendly.

    If you have trouble getting your cases right,
    you are in the wrong job.

    If you have trouble thinking up distinct identifiers in examples like this:

       Abc abc = ABC;

    then /you're/ in the wrong job!


    That's a strawman, and you know it. Or do you think it's fine to write:

    OO0O1I II1IIlI1 = OIOII1IIl0I;

    The ability to write sensible identifiers - or confusing ones - is not dependent on case sensitivity. (And please don't give us the tired old bullshit about having seen poor coding in some C code you found online
    that left you confused. It would merely show that you prefer
    cherry-picking to rational arguments, or that you are easily confused.)


       * 1-based indexing, with A[i,j] for 2D accesses

    1-based counting is good for everyday counting, not for programming.

    Bollocks. In any case, you snipped my remark that I implement N-based
    arrays, so that I can use 0-based /as needed/, and have always done.


    As I said, it can be good to have more flexible array indexes in a
    higher level language.

    But if you have just one starting point, 0 is the sensible one. You
    might not like the way C handles arrays (and I'm not going to argue
    about it - it certainly has its cons as well as its pros), but even you
    would have to agree that defining "A[i]" to be the element at "address
    of A + i * the size of the elements" is neater and clearer than
    one-based indexing. Again, 0 is the common choice, especially amongst
    lower level languages. (The worst possible choice, of course, is to
    have a configurable default starting number.)
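
    That definition is visible directly in the language itself (a sketch):

        #include <assert.h>

        int main(void)
        {
            int A[4] = {10, 20, 30, 40};
            int i = 2;

            /* With 0-based indexing these are identities, by definition. */
            assert(A[i] == *(A + i));
            assert(&A[i] == A + i);
            return 0;
        }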

    You haven't explained why A[i][j] is better than A[i,j].


    I didn't "explain" it because I don't agree - the two choices have their
    pros and cons. One views arrays as purely linear - so A is a linear
    array of elements, each of which is a linear array. The other views
    arrays like A as being a single object with multiple dimensions.
    Sometimes one viewpoint is better than the other.

    I can, however, note that I dislike C's comma operator. One of its disadvantages is that it means "A[i, j]" is interpreted as "evaluate i
    for its side-effects, then treat as A[j]", which is not remotely helpful.
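
    A sketch of how quietly that goes wrong:

        int main(void)
        {
            int A[10] = {0, 10, 20, 30};
            int i = 1, j = 2;

            /* Comma operator: i is evaluated and discarded, so this is
               A[j], i.e. A[2] == 20 -- not a two-dimensional access.
               Good compilers at least warn about the dead operand. */
            int v = A[i, j];

            return v == 20 ? 0 : 1;
        }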

       * Keyword-based block delimiters (do...end, not {...})

    That comes from a time when keyboards with symbols such as { } were

    So you see it as progress that {,} with their innumerable issues were introduced. Because this:

      } else {

    or:

      }
      else {

    or:

      } else
      {

    or:

      }
      else
      }        # (oops!)

    etc. is SO much better than just:

      else

    You must be delusional.


    No, I am not delusional - the use of brackets is hugely better than
    relying on line-endings or spacing for block structuring. (And yes, I
    am fully aware that I use Python that uses indentation for structuring.)
    Mistakes like the one you made there are easily diagnosed by tools -
    unlike mistakes for when you don't have delimiting symbols.

    However, the choice you gave was not between brackets and nothing, but
    between brackets and keywords for delimiters. I find brackets
    convenient and light-weight, and very easy to see and use correctly when combined with a reasonable indentation strategy. I don't see it as a particularly big issue - "begin"/"end", or whatever, work fine too. But
    I see no advantage in them.

    (I /do/ see advantage in /requiring/ block delimiters in, for example, conditionals and loops. Making them optional is a source of errors,
    regardless of how they are spelt.)



    considered advanced and modern.  (Hence those monstrosities, the
    trigraph and digraph.)  Oh, I forgot - you find it such an effort to
    press the "shift" key on your keyboard.


       * Proper Read A, B, C and Print A, B, C features ...

    What a pointless and meaningless statement.  There are a hundred and one
    different ways to do "proper" read and print, with everyone having their
    own ideas about what is best.

    This is just pure jealousy. Show me the C code needed to do the
    equivalent of this (without knowing the types of a, b, c other than they
    are numeric):

       print "?"
       readln a, b, c
       println a, b, c

    In C, you don't work with variables whose types are unknown.
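
    Once concrete types are chosen - say, doubles - the closest C
    equivalent is a sketch like this:

        #include <stdio.h>

        int main(void)
        {
            double a, b, c;

            printf("?");
            if (scanf("%lf %lf %lf", &a, &b, &c) != 3)
                return 1;               /* input was not three numbers */
            printf("%g %g %g\n", a, b, c);
            return 0;
        }

    Every formatting decision in it is explicit - and each one is an
    answer to one of the questions below.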

    You are under the delusion that there is one "correct" interpretation
    here. You think that /your/ ideas are the only "obvious" or "proper"
    way to handle things. In reality, there are dozens of questions that
    could be asked here, including:

    Does there have to be a delimiter between the inputs? Does it have to
    be comma, or space, or newline? Are these ignored if there are more
    than one? Are numbers treated differently in the input? Would an input
    of "true" be treated as a string or a boolean? Are there limits to the
    sizes? How are errors in the input, such as end-of-file or ctrl-C
    treated? How do you handle non-ASCII strings?

    Should there be spaces between the outputs? Newlines? Should the
    newline be a CR, an LF, CR+LF, or platform specific? What resolution or
    format should be used for the numbers? If someone had entered "0x2c"
    for one of the inputs, is that a string or a number - and if it is a
    number, should it be printed in hex or in decimal?

    Should the output go to the "standard out" stream, assuming that is
    supported by the language and the OS? The "standard error" stream? A
    printer? A debug port? A text box in a gui? Should it be determined
    by a wider context in some way, such as via functions that redirect the
    output of "println" statements?


    No matter how you implement such things, it will not be the right choice
    for some people in some cases. A language (and/or standard library) can
    make a reasonable starting point that is appropriate for a variety of
    uses of the language. And it can fully /document/ and /specify/ the
    behaviour. That's all - that's the best that can be done.


    (And note that I am /not/ saying that C is "perfect" here. C's "printf" solution has a lot of advantages, which is why it has often been copied
    in other languages, but it has a lot of disadvantages too. The same
    applies to your language's print statements.)


    Here the language provides informal line-based i/o, as might be useful
    for interactive programs, or reading/writing files, while still allowing
    more precise control as needed.

    Because you are wrong.

    And you are a margin of /one/, because you believe that languages should
    follow exactly what /you/ want in all aspects, with a total disregard
    for anyone else.

    What exactly are the choices for someone in 2021 who wants to use (or is required to use) a language like C, but favours even one of my characteristics?


    The same choices /everybody/ makes in /every/ aspect of their lives -
    you find the most suitable compromise. No one programs in C because
    they think it is a perfect language - they program in C because it is
    the best choice for their needs at the time, weighing up the advantages
    and disadvantages.

    You don't go to the bakers and say "I'd like a loaf of bread just like
    that one, except 30% longer". You choose between a smaller loaf than
    you wanted, or buying two and having too much bread, or buying a
    different loaf that is the right size but a different texture.

    If you are really keen on getting exactly the loaf you want but the
    bakers don't stock it, then you can learn to make bread yourself and
    make your own loaves that are exactly what /you/ want. But you don't
    expect them to be popular with other people.

    If you think that lots of people would like loaves that are 30% longer,
    then you can try and start a business making and selling them. That's
    fine too - though not easy.

    What you don't do, however, is go to the butcher's shop and complain to
    the butcher that the baker's loaves are so terrible.


    Picking a programming language is not really any different from any
    other kind of choice in life.


    Yes, you can have your opinion.  Yes, you can make your own language the
    way /you/ want it.  Yes, you can give suggestions and ideas based on
    these in a discussion about languages.  No, you can't tell people that
    you alone are right, and the rest of the world is wrong, and expect to
    be taken seriously.

    But YOU are allowed to say that:

    * Case-insensitivity is wrong
    * 1-based is wrong
    * A[i,j] is wrong
    * Anything other than {...} blocks is wrong
    * Easy read/print statements in a language are wrong
    * Line-based i/o is wrong
    * Left-to-right type syntax is wrong. (Did you say that, or decided not
    to mention that one?!)


    Yes, I am allowed to say that (though I most certainly did /not/ say
    that). But I am not allowed to expect everyone to agree with me just
    because I say so. See the difference? If I want anyone to take my
    opinions seriously (and I don't always expect that), I have to be able
    to justify them. "Case insensitivity is clearly better because I like
    it" is not a justification.

    All things that C doesn't have.

    Only you are arguing about C here - only you seem to imagine people
    think it is perfect. It is far and away the most successful programming language, massively used and massively popular, so it makes a good
    yardstick for comparisons and discussions. But nobody suggests it is an
    ideal language (I certainly have not done so).


    Actually, Ada and Fortran are still around, are case-insensitive, are N-based, and don't use brace syntax.


    If you take a sample of a thousand programmers, you can count on one
    hand the number that have any concept of those languages beyond "Ada is
    used by the US DoD" and "Fortran was used in the early days of
    programming". (Usenet is not a good sample, given its demographics.)

    Lua doesn't use braces for blocks. It is also 1-based.

    Also 1-based are Julia, Mathematica, and Matlab ("Learn some maths"? Sure!)

    Julia doesn't use braces either.

    These are all characteristics that still exist across languages, but not necessarily within one systems language, which is where I need them.




    Getting back to what this is about, which was your suggestion that C is
    so perfect, it is pointless to create something new unless it comes with a raft of advanced, heavy features, then why SHOULDN'T there be an
    alternative systems language with its OWN set of characteristics?


    I /could/ explain what I had written earlier.  But what would be the
    point?  It would just repeat the same things I wrote before.  You didn't read them then, why should I think you'll read them now?

    I'll repeat what you said:

    "Given the vast benefits of C in terms
    of existing implementation, code, experience and information, why would anyone bother with a different compiled language unless it let them do
    things that you cannot easily do in C?"

    You are clearly saying, don't bother creating an alternative to C unless
    it actually does something different.

    Yes. Surely that is obvious? There is no point in re-inventing the
    same wheel everyone else already uses - you have to bring something new
    to the table. (Or you are doing this all for fun and education.) And
    given how many people already use C, how many tools there are, how much
    code there is, you need /serious/ advantages over it in order for anyone
    to choose your language over C.


    I disagreed: you CAN have an alternative that, while it does the same
    things, can achieve that differently.

    No one will use it. So what's the point?

    It would not be impossible to design a new programming language that is
    of a similar level to C but has a fair number of technical improvements.
    (It is certainly possible to have lots of technical /differences/, but
    being different does not make it better just because one person prefers
    the change.)

    But can you make one that has enough technical improvements to gain any
    kind of following?

    Let's say that I agree that your language's "println" system is the
    bee's knees, that I have always found writing "int * p" confusing, and
    that I'd be much happier if I was able to write my identifiers in small
    letters when I am in a good mood and in capitals when I am feeling
    angry. Would that persuade me to throw away my existing compilers,
    debuggers, editors and change to your language? Should I change the
    tiny, cheap microcontrollers we use to embedded Windows systems as that
    is the only target you support? For C, I have the standards documents
    and reference sites, and compilers and libraries that follow these specifications, and an endless supply of knowledgeable users for help,
    advice, or hire - for your language, we have one guy off the internet
    who regularly fails to answer simple questions about the language he
    wrote without trying it to see the result.


    So, again, what is the point of a language that is roughly like C but
    with a few technical improvements and perhaps a nicer syntax (in some
    people's opinion) ?

    There is plenty of scope for making a good new programming language, but
    if it is going to be used, it needs to let people do what they are
    already doing, things they can do by moving to other established
    languages, /and/ something new.

    That means it doesn't just have to be a massive technical improvement
    over C. It also has to beat C++, Ada, Rust, D, Go, C#, OCaml, and even
    oldies like Forth and FORTRAN and "weird" choices like Haskell or Eiffel.


    I listed some things out of many dozens. You of course will disagree
    with every one of them.


    Again, you merely demonstrate your clouded prejudice that hinders you
    from reading anything people write.

    Doesn't matter what it is:

     C does X            C's way is perfect
     Bart does Y         Bart is WRONG, and in a minority of one

    Even when I show that other languages, old or new, also do Y. Or when I
    give an example of Y clearly being better than X.

    My dislike of C is rational. Your loyalty to it, and hatred of anyone
    who dares to badmouth it, is irrational.

    If someone thinks that C is perfect, or that your language was always
    wrong, or was blindly loyal to C, then I agree that would be irrational
    (or at the very least, ignorant). But I have expressed none of these
    things. I most certainly have not expressed hatred of you or anyone
    else. (I have accused you of hating /C/, not of hating any person.)

    Your problem here is that you cannot appreciate that someone can explain
    how C works, or why it is the way it is, or how to use it. You cannot
    grasp that people can find C useful, practical and enjoyable without
    treating them as though they view C as "perfect" and the paradigm of programming languages.

    I use C a lot - I know the language well and find it very useful for the programming tasks I have. I'll drop it in a heartbeat when I have a
    better alternative. (Indeed I have done so - when it is practical for a project, taking into account a wide range of factors, I use C++. And
    for PC programming I almost never use C.)

    These threads would be a lot more pleasant if you could wrap your head
    around that.

    Oh, and you should get over your delusion that your dislike of C is
    rational. Your dislike of /some/ aspects of C are rational (again, no
    one who has used C significantly likes it all - and the same applies to
    all programming languages). Some is purely a matter of taste (and
    that's fine). Much of it, however, is due to your wilful and stubborn insistence on making life hard for yourself. Such martyrdom is not
    becoming.

    I /have/ been discussing ideas and suggestions here.  Some have been of
    interest to James, others not - which is fine.  And I write comments
    aimed at making him (or anyone else interested in languages) think about
    how the language might be used - because that's what's important.  I am
    not the one who thinks every thread is an opportunity for a new anti-C
    rant.

    No. Your message is 'Just don't bother trying to rewrite C', presumably because it is perfect.


    You presume incorrectly.

    I have written a great many things in James' threads - few of which were
    about C. (And most of those were of the form "C does it this way - you
    might want to do it differently".)

    But this particular message was "Don't bother trying to rewrite C" -
    because C is already here. If you want to design a language, make a new
    one.

    C is still VERY widely used, but you don't believe in an alternative.
    You want people to continue driving a Model T, unless the new car can
    also fly!

    I hope you were not suggesting that /your/ language is somehow more
    modern than C! But perhaps you just wanted to end on a joke.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Mon Nov 29 22:15:10 2021
    On 29/11/2021 11:40, Dmitry A. Kazakov wrote:
    On 2021-11-29 12:09, David Brown wrote:

    In programming,
    you get people making a living as programmers despite being completely
    incompetent.

    Reminds me of politicians, pop musicians, journalists, economists, environmentalists... (put quotation marks as appropriate)

    And even amongst people who do a reasonable job, you can
    get an order of magnitude difference in productivity.

    That is the 80/20 law.

    But I agree with you, incompetence is strong with programmers...

    That's no good. We cannot have agreement on Usenet. ;-) So let me
    suggest that both of you have gone off the point (fine and permissible
    but a deviation nonetheless).

    What we were talking about was David espousing a language feature which
    was "not for the average programmer" and saying (AIUI) that it was fine
    to have average and expert programmers use different features. I
    disagree with that premise.

    I'd suggest to you that it's OK to have average and expert programmers
    but that that should relate to the quality of their output and how
    quickly it is produced. Different programmers should not, however, use different parts of the same language. Instead, a language should
    (ideally) be simple enough that both average and expert programmers can
    work with the same code.

    This is, again, about a language being a medium in which a programmer communicates. That communication can be with other programmers, not just
    with a compiler. (A lofty goal and perhaps unachievable but a very
    important goal to keep in mind, IMO.)


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to David Brown on Mon Nov 29 21:55:18 2021
    On 28/11/2021 17:18, David Brown wrote:
    On 28/11/2021 15:54, James Harris wrote:
    On 28/11/2021 13:46, David Brown wrote:
    On 28/11/2021 10:33, James Harris wrote:
    On 27/11/2021 15:17, David Brown wrote:

    ...

    Further, say you had your alignment in the way you prefer, i.e. as a
    number of bytes such as 8. What would you write if you wanted to apply a
    commensurate shift? To get from 8 to 3 you'd need some sort of
    log-base-2 function of the type I showed earlier. Which would you want a
    language to provide?


    Again, I want /you/ to think about what /your/ users will actually need
    and use. When would they need this? Is it really something people will need? I believe you are trying to optimise for non-existent use-cases, instead of realistic ones. If you believe otherwise, please say so -
    perhaps with examples or justification. (It's your language, you don't /have/ to justify your choice of features, but it makes it easier to
    give helpful suggestions.)

    Maybe we are talking about slightly different things but the question
    above wasn't asking for a suggestion but was an attempt to point out
    that your position (AIUI) wasn't pragmatic. I think you are focussing
    only on alignment but I am saying that whatever the use case (alignment
    or otherwise) it is more flexible if the defining value for a power of
    two is a bit offset rather than an integer. That's because the former
    can always be converted to the latter but not vice versa - unless you
    are going to invent some special log2 functions. My point was that I'd recommend you don't do that but define all powers of two via bit
    offsets, instead.

    I work that way all the time. A master file will have lines such as

    constant blocksizebits = 9

    then, later, possibly somewhere separate, there may be the definition of
    a /derived/ constant,

    constant blocksize = 1 << blocksizebits

    Then I only have to change one constant to adjust the system.
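
    (The C rendering of the same pattern, for comparison - a sketch, with
    names mirroring the constants above:

        #define BLOCKSIZE_BITS 9
        #define BLOCKSIZE      (1u << BLOCKSIZE_BITS)   /* 512 */

    Changing BLOCKSIZE_BITS adjusts everything derived from it.)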

    ...

    And of course there is nothing to stop you doing the equivalent of

    #define struct_align_needed 3
    alignas(1 << struct_align_needed) struct S s;

    or whatever. In other words, if you really need to go from 3 to 8, then
    you can happily do that even if your "align" method takes an 8 rather
    than a 3.

    That illustrates my point: it's better to make the bit offset the master
    piece of info.

    ...

    When specifying constants it's easier to begin with, and convert from, the number of bits. Consider the opposite. Given

       constant ALIGN_BYTES = 8

    there are these two ways one might convert that to alignment bits.

       constant ALIGN_BITS = Log2RoundedUp(ALIGN_BYTES)
       constant ALIGN_BITS = Log2ButErrorIfNotPowerOfTwo(ALIGN_BYTES)

    ...

    Programmers don't need these - it's not something they have to do. And
    if they /do/, then they can do so with :

    constant ALIGN_BITS = 3
    constant ALIGN_BYTES = 1 << ALIGN_BITS

    or maybe:

    constant ALIGN_BYTES = 8
    constant ALIGN_BITS = 3
    static_assert(ALIGN_BYTES == 1 << ALIGN_BITS,
    "Failed alignment sanity check")

    You are inventing non-existent problems here.

    I see only a different opinion. I don't see anyone inventing a problem.


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Mon Nov 29 22:40:44 2021
    On 29/11/2021 15:19, David Brown wrote:
    On 29/11/2021 14:06, Bart wrote:

    A /choice/. That doesn't make it right and the others wrong.

    Case insensitive doesn't work when you go beyond the UK/US alphabet.
    The complications for various languages are immense. In German, the
    letter ß traditionally capitalises as SS - one letter turns into two.
    In Turkish, "i" and "I" are two completely different letters, with their opposite cases being "İ" and "ı". It quickly becomes ridiculous when
    you need to support multiple languages. On the other hand,
    case-sensitive naming is usually just done as binary comparison.

    So unless you think that everyone should be forced to write a limited
    form of UK or US English and that ASCII is good enough for everyone, case-sensitive is the only sane choice for file systems.

    URLs are case-insensitive for the first part. So are email addresses and usernames. And usually, people's names when stored in a computer system.
    And addresses and postcodes. And movie and book titles. Etc.

    Those I guess are immune to the problems of Unicode.

    I feel that file names, which could be used to represent all those
    examples, and the commands of CLIs, should be the same.


    If you have trouble thinking up distinct identifiers in examples like this:
       Abc abc = ABC;

    then /you're/ in the wrong job!


    That's a strawman, and you know it.

    I see it all the time in C. Example from raylib.h:

    typedef struct CharInfo {
        int value;        // Character value (Unicode)
        int offsetX;      // Character offset X when drawing
        int offsetY;      // Character offset Y when drawing
        int advanceX;     // Character advance position X
        Image image;      // Character image data
    } CharInfo;

    'Image image'; just try saying it out loud!

    But if you have just one starting point, 0 is the sensible one. You
    might not like the way C handles arrays (and I'm not going to argue
    about it - it certainly has its cons as well as its pros), but even you
    would have to agree that defining "A[i]" to be the element at "address
    of A + i * the size of the elements" is neater and clearer than
    one-based indexing.

    That's a crude way of defining arrays. A[i] is simply the i'th element
    of N slots, you don't need to bring offsets into it.

    With 0-based, there's a disconnect between the ordinal number of the
    element you want, and the index that needs to be used. So A[2] for the
    3rd element.

       print "?"
       readln a, b, c
       println a, b, c

    In C, you don't work with variables whose types are unknown.

    You may know the types, but they shouldn't affect how you write Read and
    Print. In C they do, and that needs extra maintenance.

    You are under the delusion that there is one "correct" interpretation
    here. You think that /your/ ideas are the only "obvious" or "proper"
    way to handle things. In reality, there are dozens of questions that
    could be asked here, including:

    Print is one of the most diverse features among languages. Your
    objections would apply to every language. Many have Print similar to
    mine, for example there might be 'println'; so what newline should be
    used (one of your questions)?

    What I'm showing is a sensible set of defaults.

    Does there have to be a delimiter between the inputs? Does it have to
    be comma, or space, or newline?

    Think about it: this is for user input, it needs to be fairly forgiving.
    Using character-based input as C prefers is another problem when trying
    to do interactive input. Programs can appear to hang as they silently
    wait for that missing number on the line, while extra numbers screw up
    the following line.


    Are these ignored if there are more
    than one? Are numbers treated differently in the input? Would an input
    of "true" be treated as a string or a boolean? Are there limits to the sizes? How are errors in the input, such as end-of-file or ctrl-C
    treated? How do you handle non-ASCII strings?

    Yeah, carry on listing so many objections that, in the end, the language provides nothing, and requires a million programmers to each reinvent line-based i/o from first principles.

    You are just making excuses why your favourite languages don't provide
    such features.

    Should there be spaces between the outputs? Newlines? Should the
    newline be a CR, an LF, CR+LF, or platform specific? What resolution or format should be used for the numbers? If someone had entered "0x2c"
    for one of the inputs, is that a string or a number - and if it is a
    number, should it be printed in hex or in decimal?

    Usually such input is not language source code, it might be something
    like a config file or maybe a log or transaction file.

    If your requirements demand a full-blown language tokeniser, then you're
    doing it wrong; you don't parse source code using a Read statement!

    Should the output go to the "standard out" stream, assuming that is
    supported by the language and the OS?

    It goes to the console unless the program specifies otherwise; that's
    pretty standard.

    The "standard error" stream?

    stderr is an invention of C (or Unix, one of those) and is actually
    quite difficult to make use of outside that language.

    All things that C doesn't have.

    Only you are arguing about C here - only you seem to imagine people
    think it is perfect. It is far and away the most successful programming language, massively used and massively popular,

    Great. That means there is still a place for a systems language at this
    crude, lower level.

    It also means there is room for alternatives. Even if it means the
    alternative is something that is implemented on top of C because all the
    tools are in place.

    You are clearly saying, don't bother creating an alternative to C unless
    it actually does something different.

    Yes. Surely that is obvious? There is no point in re-inventing the
    same wheel everyone else already uses - you have to bring something new
    to the table.

    I invented /my/ first wheel because I didn't have any!

    Then I found that my wheels were smaller, simpler, faster and generally
    a better fit than other solutions.

    I disagreed: you CAN have an alternative that, while it does the same
    things, can achieve that differently.

    No one will use it. So what's the point?

    /I/ will use it. And I will get a kick out of using it. After all not
    many get to use their own languages for 100% of their work.

    Should I change the
    tiny, cheap microcontrollers we use to embedded Windows systems as that
    is the only target you support?

    Mine aren't general purpose in the sense that they are for my own use,
    and they target whatever hardware I happen to be using, which currently
    is Win64. So I'm not here to flog my language. Only discussing portable
    ideas.

    However previous versions have targeted:

        CPU      Size   OS

      PDP10    36-bit   TOPS10(?) (Not my lang, but first time self-hosted)
        Z80     8-bit   None
        Z80              OS/M (CP/M ripoff)
        Z80              (PCW)
      8088/86  16-bit   MSDOS (plus None for some projects)
      80386    32-bit   MSDOS/Windows
        x64    64-bit   Windows (current)
       (C32)   32-bit   Windows/Linux, x86/ARM32 (Versions with C targets)
       (C64)   64-bit   Windows/Linux, x64/ARM64

    If I wanted, I could adapt my language to a small device, but it would
    have to work as a cross-compiler, and I'd need a minimum spec.

    BTW, would any of C#, Java, D, Rust, Zig, Odin, Go (or Algol68) work on
    those microcontrollers of yours?


    For C, I have the standards documents
    and reference sites, and compilers and libraries that follow these specifications, and an endless supply of knowledgeable users for help, advice, or hire - for your language, we have one guy off the internet
    who regularly fails to answer simple questions about the language he
    wrote without trying it to see the result.

    Yeah, and on Reddit, there's an endless stream of the same questions
    about C due to all its quirks!

    And for you, I bet I could find a choice macro whose output you probably wouldn't be able to guess without trying it out.


    So, again, what is the point of a language that is roughly like C but
    with a few technical improvements and perhaps a nicer syntax (in some people's opinion) ?

    Well, the language exists. It is a joy to use. It is easy to navigate.
    It is easy to type. It has modules. You don't need declarations, just definitions. It has out-of-order everything. It fixes 100 annoyances of
    C. It provides a whole-program compiler. It builds programs very
    quickly. It has a self-contained one-file implementation. It has a
    companion scripting language in the same syntax and with higher level types.

    So, I should just delete a language I've used for 40 years and just code
    in C with all those brackets, braces and semicolons like every other
    palooka who needs to use an off-the-shelf language?

    I should sell my familiar, comfortable car and drive that Model T? It
    would be more like getting the bus; I'd rather walk!

    I hope you were not suggesting that /your/ language is somehow more
    modern than C!

    Actually it is. Why, in what way is C (the language, not all those shiny
    new IDEs), more modern than the language I have?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Tue Nov 30 08:46:28 2021
    On 2021-11-29 23:15, James Harris wrote:
    On 29/11/2021 11:40, Dmitry A. Kazakov wrote:
    On 2021-11-29 12:09, David Brown wrote:

    In programming,
    you get people making a living as programmers despite being completely
    incompetent.

    Reminds me of politicians, pop musicians, journalists, economists,
    environmentalists... (put quotation marks as appropriate)

    And even amongst people who do a reasonable job, you can
    get an order of magnitude difference in productivity.

    That is the 80/20 law.

    But I agree with you, incompetence is strong with programmers...

    That's no good. We cannot have agreement on Usenet. ;-) So let me
    suggest that both of you have gone off the point (fine and permissible
    but a deviation nonetheless).

    What we were talking about was David espousing a language feature which
    was "not for the average programmer" and saying (AIUI) that it was fine
    to have average and expert programmers use different features. I
    disagree with that premise.

    It is OK even for experts to use different language features. It depends
    on the task. Furthermore there are SW development roles like SW
    architect etc requiring higher qualification and the corresponding
    language parts for these.

    Instead, a language should
    (ideally) be simple enough that both average and expert programmers can
    work with the same code.

    If the underlying concepts are inherently complex the language cannot
    simplify them enough.

    This is, again, about a language being a medium in which a programmer communicates. That communication can be with other programmers, not just
    with a compiler. (A lofty goal and perhaps unachievable but a very
    important goal to keep in mind, IMO.)

    Nobody ever objected to that.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From James Harris@21:1/5 to Dmitry A. Kazakov on Tue Nov 30 08:14:33 2021
    On 30/11/2021 07:46, Dmitry A. Kazakov wrote:
    On 2021-11-29 23:15, James Harris wrote:

    ...

    What we were talking about was David espousing a language feature
    which was "not for the average programmer" and saying (AIUI) that it
    was fine to have average and expert programmers use different
    features. I disagree with that premise.

    It is OK even for experts to use different language features. It depends
    on the task. Furthermore there are SW development roles like SW
    architect etc requiring higher qualification and the corresponding
    language parts for these.

    AISI that's only true if the language is complex enough to have parts
    which are not needed in normal programming. Are those parts really needed?


    Instead, a language should (ideally) be simple enough that both
    average and expert programmers can work with the same code.

    If the underlying concepts are inherently complex the language cannot simplify them enough.

    Interesting comment. Which concepts cannot be simplified?


    --
    James Harris

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dmitry A. Kazakov@21:1/5 to James Harris on Tue Nov 30 10:22:25 2021
    On 2021-11-30 09:14, James Harris wrote:
    On 30/11/2021 07:46, Dmitry A. Kazakov wrote:
    On 2021-11-29 23:15, James Harris wrote:

    ...

    What we were talking about was David espousing a language feature
    which was "not for the average programmer" and saying (AIUI) that it
    was fine to have average and expert programmers use different
    features. I disagree with that premise.

    It is OK even for experts to use different language features. It
    depends on the task. Furthermore there are SW development roles like
    SW architect etc requiring higher qualification and the corresponding
    language parts for these.

    AISI that's only true if the language is complex enough to have parts
    which are not needed in normal programming. Are those parts really needed?

    Normal programming for you, abnormal for others.

    Instead, a language should (ideally) be simple enough that both
    average and expert programmers can work with the same code.

    If the underlying concepts are inherently complex the language cannot
    simplify them enough.

    Interesting comment. Which concepts cannot be simplified?

    - Concurrency, synchronization, tasking, active objects, protected
    objects, rendezvous, barriers, scheduling

    - Memory management, pools, collectors

    - Generic programming, classes, polymorphism

    - Interfacing with other languages and the OS

    - Representation control, memory layout, alignment, packing,
    volatile/atomic access and operations

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Nov 30 14:58:56 2021
    On 29/11/2021 23:40, Bart wrote:
    On 29/11/2021 15:19, David Brown wrote:
    On 29/11/2021 14:06, Bart wrote:

    A /choice/. That doesn't make it right and the others wrong.

    Case insensitivity doesn't work when you go beyond the UK/US alphabet.
    The complications for various languages are immense.  In German, the
    letter ß traditionally capitalises as SS - one letter turns into two.
    In Turkish, "i" and "I" are two completely different letters, with their
    opposite cases being "İ" and "ı".  It quickly becomes ridiculous when
    you need to support multiple languages.  On the other hand,
    case-sensitive naming is usually just done as binary comparison.

    So unless you think that everyone should be forced to write a limited
    form of UK or US English and that ASCII is good enough for everyone,
    case-sensitive is the only sane choice for file systems.
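
    (As a concrete illustration of the trade-off above - a minimal C
    sketch, with a made-up name ascii_icmp: an ASCII-only case-insensitive
    comparison is trivial to write, but it hard-codes exactly the
    UK/US-alphabet assumption being criticised.)

        #include <ctype.h>
        #include <stdio.h>

        /* Folds only the 26 ASCII letters; fine for A-Z/a-z, wrong beyond. */
        static int ascii_icmp(const char *a, const char *b)
        {
            while (*a && *b) {
                int ca = tolower((unsigned char)*a++);
                int cb = tolower((unsigned char)*b++);
                if (ca != cb)
                    return ca - cb;
            }
            return (unsigned char)*a - (unsigned char)*b;
        }

        int main(void)
        {
            printf("%d\n", ascii_icmp("FILE", "file"));  /* 0: a match */
            /* Under Turkish rules "I" folds to dotless i, so this "match"
               is wrong there; and German ss/ß cannot be folded one
               character at a time at all. */
            return 0;
        }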

    URLs are case-insensitive for the first part. So are email addresses and usernames. And usually, people's names when stored in a computer system.
    And addresses and postcodes. And movie and book titles. Etc.

    You are basing this all on your limited experience of English language
    usage. Usernames in English language Windows logins are
    case-insensitive - that does not apply to all kinds of usernames, all
    kinds of systems, all kinds of languages. The first part of email
    addresses were originally ASCII only, and could be case-insensitive.
    Now some email servers can support different characters with UTF-8
    (encoded in some way over ASCII, I believe) - these could be
    case-sensitive or case-insensitive. Most English-language names, words, titles, etc., are case-insensitive, but not all.

    You are generalising too much here. I fully agree that on many things
    in daily life, we don't care much about letter case when distinguishing
    things - and we very rarely choose to have two things that are
    distinguished only by capitalisation. (Primarily because it is hard to
    hear the difference.) And some computer-related things are also case-insensitive, and sometimes that is convenient.

    But lots of things are case-sensitive, even in normal usage. "bart",
    "ADA" and "Fortran" are all misspelt. If I refer to you as "BART",
    you'd feel differently than if I use "Bart".

    And in the computer world, it is just vastly easier to have
    case-sensitive (and in general to distinguish based on the underlying
    bytes - so that visually identical characters, say an ordinary space
    and a no-break space, are considered different).  If you try to
    be case-insensitive, you are going to get things wrong sooner or later, especially if you go outside English.


    Those I guess are immune to the problems of Unicode.

    Unicode has nothing to do with it. Non-Latin alphabets are an important
    factor here, but they are not the only thing, and the encoding of the characters is irrelevant.

    For other joys to consider outside of Latin - some languages have
    multiple different writing systems or alphabets. Some languages have
    different versions of letters depending on their position in words as
    well as their "capitalisation". The very concept of "letter case" is
    based entirely around the way we write Latin alphabet characters from
    the middle ages onwards (the etymology of the term is from Gutenberg
    printing presses).


    I feel that file names, which could be used to represent all those
    examples, and the commands of CLIs, should be the same.


    You are a product of your environment (this is not a criticism, it is
    human nature). Your whole experience, especially in anything
    computer-related, has been in ASCII. A large proportion of it has been
    from a time when computers couldn't do anything else, and all the rest
    has been using English-language systems with keyboards that are simple
    ASCII only. You've never been able to type "naïve" or "café" with the correct English-language spelling, without jumping through hoops with a "character map" program. You've (almost) never used a command line
    terminal that can work with non-ASCII characters.

    This means that for you, working with case-insensitive strings is easy.
    It's a bit of extra effort in the code, but not much. And it doesn't
    cause confusion or annoyance. In the wider world, however, it is a very different matter - it is either language-specific and potentially
    complicated, or if it is supposed to support multiple languages it is /extremely/ complicated. And that means it is usually wrong.

    So having established that case-sensitivity is the only sensible
    choice for a lot of computer-related uses, given that it is the only
    choice that gets things right, what are the disadvantages?  What are the
    advantages of being case /insensitive/ ?

    Sometimes it is nicer to see sorted lists as case-insensitive. But
    that's a whole new can of worms - sorting is again highly
    language-specific, and is another topic on its own. Even regardless of
    case, sorting is complicated. A good example here is the letter "Å" in Norwegian. It is regarded as a separate letter, and comes at the end of
    the alphabet. But sometimes it is transliterated to "AA". So in an
    alphabetic list of names, "Aaron" might come first, while "Aase" is
    sorted beside "Åse" at the end of the list. Even in English, in a list
    of names "MacDonald" is sorted beside "McDonald".
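
    (A small C sketch of how language-specific this gets; the locale names
    below are assumptions and their availability varies by system.)

        #include <locale.h>
        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            const char *a = "Åse", *b = "Zebra";   /* UTF-8 source assumed */

            /* Byte order: the UTF-8 bytes of "Å" sort after "Z". */
            printf("bytes: %s\n", strcmp(a, b) < 0 ? "before" : "after");

            /* English collation typically folds Å in next to A... */
            if (setlocale(LC_COLLATE, "en_US.UTF-8"))
                printf("en_US: %s\n", strcoll(a, b) < 0 ? "before" : "after");

            /* ...while Norwegian puts it at the end of the alphabet. */
            if (setlocale(LC_COLLATE, "nb_NO.UTF-8"))
                printf("nb_NO: %s\n", strcoll(a, b) < 0 ? "before" : "after");

            return 0;
        }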


    But for commands, file names, program identifiers? Why would you want
    them to be case insensitive? I mean, I agree that you don't want
    commands "Copy", "copy" and "COPY" that all do different things. But
    given a command "copy", why would you ever want to type "COPY" ? Given
    a variable "noOfWhatsits", what is the benefit of letting "NOofwhatSitS"
    mean the same? A language tool could easily have a warning about having
    two identifiers that differ only in their letter case - that's no reason
    to want case-insensitive identifiers.


    If you have trouble thinking up distinct identifiers in examples like
    this:

        Abc abc = ABC;

    then /you're/ in the wrong job!


    That's a strawman, and you know it.

    I see it all the time in C. Example from raylib.h:

      typedef struct CharInfo {
        int value;              // Character value (Unicode)
        int offsetX;            // Character offset X when drawing
        int offsetY;            // Character offset Y when drawing
        int advanceX;           // Character advance position X
        Image image;            // Character image data
      } CharInfo;

    'Image image'; just try saying it out loud!


    If a language (or project) uses a convention that types start with a
    capital and variables start with a small letter, then this is perfectly
    clear.

    But if you have just one starting point, 0 is the sensible one.  You
    might not like the way C handles arrays (and I'm not going to argue
    about it - it certainly has its cons as well as its pros), but even you
    would have to agree that defining "A[i]" to be the element at "address
    of A + i * the size of the elements" is neater and clearer than
    one-based indexing.

    That's a crude way of defining arrays. A[i] is simply the i'th element
    of N slots, you don't need to bring offsets into it.


    Zero-based indexing is simple, clear, consistent and easy to build on
    for something more advanced (either in the language, or in user code).
    It is certainly low-level, but that's where you want to start.
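
    (In C terms, a minimal sketch of that definition; the 1-based accessor
    nth() is a made-up illustration, not a standard function.)

        #include <stdio.h>

        /* A[i] is defined as *(A + i): the element i slots past the start. */
        static int nth(const int *A, int i)
        {
            return A[i - 1];   /* a "1-based" view is a thin layer on top */
        }

        int main(void)
        {
            int A[5] = {10, 20, 30, 40, 50};

            printf("%d\n", A[2]);       /* 3rd element: 30 */
            printf("%d\n", *(A + 2));   /* identical by definition */
            printf("%d\n", nth(A, 3));  /* ordinal view: also 30 */
            return 0;
        }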

    With 0-based, there's a disconnect between the ordinal number of the
    element you want, and the index that needs to be used. So A[2] for the
    3rd element.

    That can seem a little odd at first, depending on where you are starting
    - if you are used to lower level work then 0 is the obvious and natural starting point. (Bit 0 is the least significant bit in almost all
    bit-level work, except for PowerPC and related architectures.) People
    quickly get used to it.

    Some of the things you complain about in C are issues that seem to
    bother a number of C programmers - some of them even bug me! But I
    don't feel zero-based arrays are one of them - it really is not a
    problem for people, and it makes life simpler when you want to do
    something less common. (In contrast, the decay of array expressions to
    pointer expressions is something that is often surprising to beginners
    of C.)


        print "?"
        readln a, b, c
        println a, b, c

    In C, you don't work with variables whose types are unknown.

    You may know the types, but they shouldn't affect how you write Read and Print. In C it does, and needs extra maintenance.


    C is not a language with overloads, OOP or generics (beyond the limited _Generic expression). You /always/ need to track your types, and you
    /always/ need to write things in different ways for different types.
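
    (For the record, a minimal sketch of that limited _Generic mechanism;
    the macro name print1 is invented for the example.  The dispatch is
    resolved at compile time, and every supported type must be listed by
    hand - which is exactly the maintenance cost under discussion.)

        #include <stdio.h>

        static void print_int(int v)         { printf("%d ", v); }
        static void print_dbl(double v)      { printf("%g ", v); }
        static void print_str(const char *v) { printf("%s ", v); }

        /* C11 type-generic dispatch; an unlisted type is a compile error. */
        #define print1(x) _Generic((x),      \
                int:          print_int,     \
                double:       print_dbl,     \
                char *:       print_str,     \
                const char *: print_str)(x)

        int main(void)
        {
            print1(123);
            print1(45.67);
            print1("abc");
            printf("\n");
            return 0;
        }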

    You are under the delusion that there is one "correct" interpretation
    here.  You think that /your/ ideas are the only "obvious" or "proper"
    way to handle things.  In reality, there are dozens of questions that
    could be asked here, including:

    Print is one of the most diverse features among languages. Your
    objections would apply to every language. Many have Print similar to
    mine, for example there might be 'println'; so what newline should be
    used (one of your questions)?

    What I'm showing is a sensible set of defaults.

    No, you are not. What you are showing is what /you/ think are useful
    defaults for /your/ use in /your/ language and /your/ programs. That's different.


    Does there have to be a delimiter between the inputs?  Does it have to
    be comma, or space, or newline?

    Think about it: this is for user input, it needs to be fairly forgiving.
    Using character-based input as C prefers is another problem when trying
    to do interactive input. Programs can appear to hang as they silently
    wait for that missing number on the line, while extra numbers screw up
    the following line.
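
    (A short C sketch of the two behaviours being contrasted: token-based
    scanf() blocks until the missing number arrives, and leaves extras
    behind for the next read; reading a whole line first keeps each prompt
    self-contained.)

        #include <stdio.h>

        int main(void)
        {
            int a = 0, b = 0, c = 0;
            char line[256];

            /* Token-based scanf("%d %d %d", ...) on a line with only two
               numbers would sit waiting for the third.  Line-based: */
            printf("? ");
            if (fgets(line, sizeof line, stdin)) {
                int n = sscanf(line, "%d %d %d", &a, &b, &c);
                /* Missing items keep their defaults; extra input dies
                   with the line instead of corrupting the next read. */
                printf("read %d item(s): %d %d %d\n", n, a, b, c);
            }
            return 0;
        }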


      Are these ignored if there are more
    than one?  Are numbers treated differently in the input?  Would an input
    of "true" be treated as a string or a boolean?  Are there limits to the
    sizes?  How are errors in the input, such as end-of-file or ctrl-C
    treated?  How do you handle non-ASCII strings?

    Yeah, carry on listing so many objections, that in the end the language provides nothing. And requires a million programmers to each reinvent line-based i/o from first principles.

    You are just making excuses why your favourite languages don't provide
    such features.

    What language would that be? And why would you think I'd be making
    excuses? Are you are so utterly obsessed with hating C that you think
    the world is split into those like you that hate it, and those that
    think it is perfect and use nothing else? Your continued
    misunderstanding and misrepresentation of me is getting quite tedious.

    We all know that /you/ have a favourite language (or two) that you think
    is perfect - you wrote the bloody thing, for your own use according to
    your own needs and preferences. Like any other serious programmer, I
    use different languages at different times according to a range of requirements. And like any other programmer, I know the languages I use
    have their strengths and weaknesses, as well as things that I personally
    like or dislike (without expecting everyone else to agree on those points).

    I don't write "What is your name? Hello <name>" programs - I haven't
    done that since I was ten. But if I did, I wouldn't write it in C - as
    C is a terrible language for handling general input. I guess a rough equivalent to your program, written in my favourite language for such
    tasks, might be :

    a, b, c = input("? ").split()
    print(a, b, c)

    But usually when I have a program that takes input, it's a bit more sophisticated and has string handling (and thus less likely to be in C).


    Should there be spaces between the outputs?  Newlines?  Should the
    newline be a CR, an LF, CR+LF, or platform specific?  What resolution or
    format should be used for the numbers?  If someone had entered "0x2c"
    for one of the inputs, is that a string or a number - and if it is a
    number, should it be printed in hex or in decimal?

    Usually such input is not language source code, it might be something
    like a config file or maybe a log or transaction file.

    If your requirements demand a full-blown language tokeniser, then you're doing it wrong; you don't parse source code using a Read statement!

    Should the output go to the "standard out" stream, assuming that is
    supported by the language and the OS?

    It goes to the console unless the program specifies otherwise; that's
    pretty standard.

    The "standard error" stream?

    stderr is an invention of C (or Unix, one of those) and is actually
    quite difficult to make use of outside that language.
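
    (What that amounts to in C itself - a two-line sketch.  The point of
    the separate stream is that a shell can split the two, e.g.
    "prog >out.txt 2>err.txt" in POSIX-style shells.)

        #include <stdio.h>

        int main(void)
        {
            printf("result: 42\n");                    /* stdout */
            fprintf(stderr, "warning: defaulted\n");   /* stderr */
            return 0;
        }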

    All things that C doesn't have.

    Only you are arguing about C here - only you seem to imagine people
    think it is perfect.  It is far and away the most successful programming
    language, massively used and massively popular,

    Great. That means there is still a place for a systems language at this crude, lower level.

    It also means there is room for alternatives. Even if it means the alternative is something that is implemented on top of C because all the tools are in place.


    There certainly is scope for low-level systems languages. Yours is not
    going to win many hearts - nor is any alternative whose only claim to
    fame is that it matches one person's personal preferences. If someone
    wants to make a low-level systems language that people will want to use,
    it has to do things they can't do in existing languages. It really is
    not that difficult to understand.

    You are clearly saying, don't bother creating an alternative to C unless
    it actually does something different.

    Yes.  Surely that is obvious?  There is no point in re-inventing the
    same wheel everyone else already uses - you have to bring something new
    to the table.

    I invented /my/ first wheel because I didn't have any!

    Then I found that my wheels were smaller, simpler, faster and generally
    a better fit than other solutions.

    I disagreed: you CAN have an alternative that, while it does the same
    things, can achieve that differently.

    No one will use it.  So what's the point?

    /I/ will use it. And I will get a kick out of using it. After all not
    many get to use their own languages for 100% of their work.

    Should I change the
    tiny, cheap microcontrollers we use to embedded Windows systems as that
    is the only target you support?

    Mine aren't general purpose in the sense that they are for my own use,
    and they target whatever hardware I happen to be using, which currently
    is Win64. So I'm not here to flog my language. Only discussing portable ideas.

    However previous versions have targeted:

        CPU      Size   OS

      PDP10    36-bit   TOPS10(?) (Not my lang, but first time self-hosted)
        Z80     8-bit   None
        Z80              OS/M (CP/M ripoff)
        Z80              (PCW)
      8088/86  16-bit   MSDOS (plus None for some projects)
      80386    32-bit   MSDOS/Windows
        x64    64-bit   Windows (current)
       (C32)   32-bit   Windows/Linux, x86/ARM32 (Versions with C targets)
       (C64)   64-bit   Windows/Linux, x64/ARM64

    If I wanted, I could adapt my language to a small device, but it would
    have to work as a cross-compiler, and I'd need a minimum spec.

    BTW, would any of C#, Java, D, Rust, Zig, Odin, Go (or Algol68) work on
    those microcontrollers of yours?


    D will work on many, and perhaps Rust or Go - I haven't checked. C++,
    Ada and Forth are definitely fine. Micropython and Lua can work on
    bigger microcontrollers. But the point is simply that your language is
    not a contender - not what other languages are contenders.


    For C, I have the standards documents
    and reference sites, and compilers and libraries that follow these
    specifications, and an endless supply of knowledgeable users for help,
    advice, or hire - for your language, we have one guy off the internet
    who regularly fails to answer simple questions about the language he
    wrote without trying it to see the result.

    Yeah, and on Reddit, there's an endless stream of the same questions
    about C due to all its quirks!


    And there is a lack of questions about /your/ language. That must mean
    your language is obvious, natural and fault-free. Or perhaps there is something else going on here?

    And for you, I bet I could find a choice macro whose output you probably wouldn't be able to guess without trying it out.


    So what? If I knew the syntax of your language, it would probably take
    only a few minutes to write code that you couldn't figure out. Writing incomprehensible code is not difficult. Writing /comprehensible/ code
    is usually not that difficult either - most people who write
    incomprehensible C macros would write illegible code no matter what the language. (This is a different thing from writing code that uses
    advanced features of a language that are hard for newcomers to understand.)


    So, again, what is the point of a language that is roughly like C but
    with a few technical improvements and perhaps a nicer syntax (in some
    people's opinion) ?

    Well, the language exists. It is a joy to use. It is easy to navigate.
    It is easy to type. It has modules. You don't need declarations, just definitions. It has out-of-order everything. It fixes 100 annoyances of
    C. It provides a whole-program compiler. It builds programs very
    quickly. It has a self-contained one-file implementation. It has a
    companion scripting language in the same syntax and with higher level
    types.

    So, I should just delete a language I've used for 40 years and just code
    in C with all those brackets, braces and semicolons like every other
    palooka who needs to use an off-the-shelf language?

    I should sell my familiar, comfortable car and drive that Model T? It
    would be more like getting the bus; I'd rather walk!

    I hope you were not suggesting that /your/ language is somehow more
    modern than C!

    Actually it is. Why, in what way is C (the language, not all those shiny
    new IDEs), more modern than the language I have?



    A language is its ecosystem - specifications, references,
    implementations, tools, code, users, knowledge. C has its history
    stretching way back, but the way modern C is written is not the way it
    was written long ago. But as someone who insists on throwing out much
    of the language - the bits designed to make it easier to write clear,
    flexible and maintainable code - you might have missed that.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Nov 30 15:22:42 2021
    On 30/11/2021 13:58, David Brown wrote:
    On 29/11/2021 23:40, Bart wrote:

    You are a product of your environment (this is not a criticism, it is
    human nature). Your whole experience, especially in anything
    computer-related, has been in ASCII.

    That's not true. My commercial apps worked in Dutch, German and French
    as well as English. Some users worked also from a digitizer [2D input
    device] and I had to devise a keyboard layout that could cope with the
    needs of those languages.

    (My first product that was sold, did quite well in Norway in the
    mid-80s. It wasn't internationalised then, but Norwegians seemed to
    manage with English.)

    A large proportion of it has been
    from a time when computers couldn't do anything else, and all the rest
    has been using English-language systems with keyboards that are simple
    ASCII only. You've never been able to type "naïve" or "café" with the correct English-language spelling, without jumping through hoops with a "character map" program. You've (almost) never used a command line
    terminal that can work with non-ASCII characters.

    See above. Why do you make these silly assumptions? I happen to live in
    the UK, largely read and write English, and write software for my own
    use, so have little need for internationalisation, which is also now
    much harder with Unicode compared with dedicated 8-bit character sets,
    so I don't bother about it.

    This means that for you, working with case-insensitive strings is easy.
    It's a bit of extra effort in the code, but not much. And it doesn't
    cause confusion or annoyance. In the wider world, however, it is a very different matter - it is either language-specific and potentially complicated, or if it is supposed to support multiple languages it is /extremely/ complicated. And that means it is usually wrong.

    Yet as you explained elsewhere, case-insensitivity IS used in many
    situations. Eg. in Google searches, which would make life difficult
    otherwise.

    For the billion or two of us who use the Roman alphabet (and some others
    like Greek) there presumably are ways to normalise case. If so, why
    can't I employ that in a programming language where keywords and
    identifiers are anyway limited to A-Z and a-z?

    Instead people try to argue that I mustn't have a conversion between A-Z
    and a-z because one letter in the Turkish alphabet doesn't have an
    equivalent in the opposite case!

    But for commands, file names, program identifiers? Why would you want
    them to be case insensitive? I mean, I agree that you don't want
    commands "Copy", "copy" and "COPY" that all do different things. But
    given a command "copy", why would you ever want to type "COPY"?

    Why? Because you may have forgotten the caps lock on!

    In Unix, if you want to do 'cp ABC def', isn't it a nuisance having to
    keep switching caps? And suppose you forget the second switch and type:

    cp ABC DEF

    but don't notice. Now you try and do something with file def, and it
    doesn't work; what the hell happened to that copy!

    It's ridiculous. Windows retains the case of your filenames, but will
    match regardless of case. So you can still use 'def', but you might
    still have to rename if it's that important that it's DEF.

    (I've spent a year or two doing telephone support for non-technical users.

    This involved walking them through typing things on their machine, which
    of course I couldn't see. Imagine if what they typed had to be exactly
    the right case. Or they'd created a file as ABC and were trying to use
    it as abc, but I wouldn't know that.)


    Given
    a variable "noOfWhatsits", what is the benefit of letting "NOofwhatSitS"
    mean the same?

    You've got the wrong end of the stick. The purpose of case-insensitivity
    isn't so you use a different version of noofwhatsits at each instance (I
    think there are 4096 variants); it's so that IT DOESN'T MATTER.

    Was it this camelCase or that CamelCase or CamelCase or Camelcase? It
    doesn't matter; choose your own preferences and stick with them.

    Mine is to use all-lower-case, with ALL-CAPS used for temporary debug
    code.

    For example I may import GetStdHandle, but I will use getstdhandle
    without needing to remember the exact capitalisation.

    While other people who've used my languages liked to capitalise the
    first letter of keywords, or apply camelcase to my flat function names.

    If a language (or project) uses a convention that types start with a
    capital and variables start with a small letter, then this is perfectly clear.

    Sure. Except when the name clashes.

    C is not a language with overloads, OOP or generics (beyond the limited _Generic expression). You /always/ need to track your types, and you /always/ need to write things in different ways for different types.

    Not true. You can write a=b, a+b, a==b without needing to know exact
    types, other than they are valid for those ops.

    I just extend it to tostr(a) (used by Print).

    I don't write "What is your name? Hello <name>" programs - I haven't
    done that since I was ten.

    You don't write programs that read or write text files?


    But if I did, I wouldn't write it in C - as
    C is a terrible language for handling general input.

    OK, you agree with me then! My example works fine on my static language.

    I guess a rough
    equivalent to your program, written in my favourite language for such
    tasks, might be :

    a, b, c = input("? ").split()
    print(a, b, c)

    Not a bad attempt, but it's not great. If I try this version:

    a, b, c, d = input("? ").split()

    for x in (a,b,c,d):
        print(x, type(x))

    Then for input of '123 45.67 abc "def"' it displays

    123 <class 'str'>
    45.67 <class 'str'>
    abc <class 'str'>
    "def" <class 'str'>

    They're all strings! And the last still has its quotes.

    If I do '123,45.67,abc,"def"', it goes wrong.

    If I do '123, 45.67, abc, "def"', then the first 3 items retain that
    trailing comma!

    If I do "def ghi" instead of "def" it goes wrong (it reads a 5th item,
    'ghi"').

    If I do just '123' it goes wrong; it must be exactly 4 items.

    So it's fragile. It needs a lot of work to make robust.

    The equivalent program in my dynamic language is this:

    print "?"
    readln a,b,c,d

    for x in (a,b,c,d) do
        fprintln "# <type #>", x, x.type
    end

    Then input of '123 45.67 abc "def"' shows:

    123 <type int>
    45.670000 <type real>
    abc <type string>
    def <type string>

    The numbers are actual numbers! "def" has lost its quote.

    If I do '123,45.67,abc,"def"' it still works.

    If I do '123, 45.67, abc, "def"' it still works.

    If I do "def ghi" for the last, it shows:

    def ghi <type string>

    (So I can have embedded spaces, commas etc in one item)

    If I do only '123', it shows:

    123 <type int>
    <type string>
    <type string>
    <type string>

    Missing items are read as "", unless I tell it what type:

    readln a:"i", b:"i", c:"i", d:"i"

    Then '123' returns:

    123 <type int>
    0 <type int>
    0 <type int>
    0 <type int>

    Extra items are ignored; on Python it would go wrong.

    But, let me guess, this cuts no ice at all.

    All I can say is, I find it jolly useful, and I have used this kind of
    thing for years.

    If any language doesn't have it, or doesn't want it, then it's their loss.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Nov 30 20:36:55 2021
    On 30/11/2021 13:58, David Brown wrote:
    On 29/11/2021 23:40, Bart wrote:

    But for commands, file names, program identifiers? Why would you want
    them to be case insensitive? I mean, I agree that you don't want
    commands "Copy", "copy" and "COPY" that all do different things. But
    given a command "copy", why would you ever want to type "COPY" ? Given
    a variable "noofwhatsits", what is the benefit of letting "noofwhatsits"
    mean the same?

    I've normalised both of your 'noofwhatsits' to have the same
    capitalisation, ie. none.

    Can you remember what the original was?

    No? That's the problem.

    The thing is, I can remember words and phrases, but I easily forget capitalisation, and underscores, another thing I avoid.

    Yet one or two languages make underscores non-significant; a bit like
    making letter case for A-Z/a-z non-significant. (And exactly like making underscores in numeric literals non-significant.)

    And also, a bit like Algol68 making white space in identifiers
    non-significant: abc, def and abc def are three distinct identifiers;
    abcdef, abc def and a b c d e f are the same one.
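
    (A C sketch of that kind of rule, approximating Nim's: the first
    character is significant, the rest compare with ASCII case and
    underscores ignored.  Details of real Nim may differ, and ident_eq is
    a made-up name.)

        #include <ctype.h>
        #include <stdbool.h>
        #include <stdio.h>

        static bool ident_eq(const char *a, const char *b)
        {
            if (*a != *b)              /* first character: exact match */
                return false;
            if (*a == '\0')
                return true;
            a++; b++;
            for (;;) {
                while (*a == '_') a++; /* underscores are insignificant */
                while (*b == '_') b++;
                if (!*a || !*b)
                    return *a == *b;   /* equal only if both ended */
                if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
                    return false;
                a++; b++;
            }
        }

        int main(void)
        {
            printf("%d\n", ident_eq("noOfWhatsits", "no_of_whatsits")); /* 1 */
            printf("%d\n", ident_eq("noOfWhatsits", "NoOfWhatsits"));   /* 0 */
            return 0;
        }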

    So, my treatment of capitalisation is like Nim's(?) underlines, and
    Algol68's embedded spaces; it is an optional style that can be used to
    enhance readability, or enforce naming guidelines.

    No one suggests that Nim users will spend their time writing umpteen
    variations of the same name by playing with "_"; or that Algol68
    users will do the same with spaces.

    BTW, C is also case-insensitive in a few areas:

      X ABCDEF P   Any mix can be used in hex literals
      x abcdef p

      E/e          Exponents
      U/u L/l      Numeric suffix

    A bit radical of it to let the user choose between upper and lower case,
    and let that make no difference!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Tue Nov 30 22:44:46 2021
    On 30/11/2021 21:36, Bart wrote:
    On 30/11/2021 13:58, David Brown wrote:
    On 29/11/2021 23:40, Bart wrote:

    But for commands, file names, program identifiers?  Why would you want
    them to be case insensitive?  I mean, I agree that you don't want
    commands "Copy", "copy" and "COPY" that all do different things.  But
    given a command "copy", why would you ever want to type "COPY" ?  Given
    a variable "noofwhatsits", what is the benefit of letting "noofwhatsits"
    mean the same?

    I've normalised both of your 'noofwhatsits' to have the same
    capitalisation, ie. none.

    Can you remember what the original was?

    Yes. noOfWhatsits.


    No? That's the problem.

    No problem. The capitalisation was part of the identifier, and done intentionally. Perhaps you think that when a language is
    case-sensitive, people pick their capitalisations randomly?


    The thing is, I can remember words and phrases, but I easily forget capitalisation, and underscores, another thing I avoid.


    Eh, okay. I find that hard to relate to - most people prefer some
    indication of words in multi-word identifiers, and the two most common
    techniques are underscore_between_words (snake_case) and camelCase.

    Yet one or two languages make underscores non-significant; a bit like
    making letter case for A-Z/a-z non-significant. (And exactly like making underscores in numeric literals non-significant.)

    There are some languages with odd rules, yes. TeX and LaTeX do not
    allow digits in identifiers (so "x3" is "x 3", or, depending on the
    context, "x{3}"). MetaPost and Metafont (note the significance of the capitalisation in these names) consider "x3" as though x is an array,
    with "x3" being the element you might normally think of as "x[3]".


    And also, a bit like Algol68 making white space in identifiers
    non-significant: abc, def and abc def are three distinct identifiers;
    abcdef, abc def and a b c d e f are the same one.

    So, my treatment of capitalisation is like Nim's(?) underlines, and
    Algol68's embedded spaces; it is an optional style that can be used to enhance readability, or enforce naming guidelines.

    No one suggests that Nim users will spend their time writing umpteen
    variations of the same name by playing with "_"; or that Algol68
    users will do the same with spaces.

    In the same way, no one using a case-sensitive language spends their
    time making mixed-up case identifiers just to cause confusion.

    And having worked with badly written code in case-insensitive languages
    (such as Pascal), I can tell you it is /seriously/ confusing when the
    same identifier is cased in different ways. I count that as much worse
    than having different identifiers distinguished only by case.
    (Especially if the cases are used for a convention or style, such as
    initial capitals for types.)

    It doesn't matter what restrictions you make on the identifiers you can
    use - there will always be people who make a mess with it. They will
    spell things inconsistently, or call all their variables "temp1, temp2,
    temp3" (that's particularly common in languages that encourage declaring
    all variables at the start of a function, rather than having decent
    scoping and mixing of statements and variable declarations). You can't
    force good style by restricting how people can write good code.


    BTW, C also is also case-insensitive in a few areas:

      X ABCDEF P   Any mix can be used in hex literals
      x abcdef p

      E/e          Exponents
      U/u L/l      Numeric suffix

    A bit radical of it to let the user choose between upper and lower case,
    and let that make no difference!
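
    (Indeed - all of these compare equal in C, as a quick sketch shows:)

        #include <stdio.h>

        int main(void)
        {
            printf("%d\n", 0XABCDEF == 0xabcdef);  /* hex digits and X  */
            printf("%d\n", 1E3 == 1e3);            /* exponent marker   */
            printf("%d\n", 10UL == 10ul);          /* numeric suffixes  */
            printf("%d\n", 0X1P4 == 0x1p4);        /* C99 hex floats: P */
            return 0;
        }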

    Those are not identifiers.

    I'd personally have been happy to stick to lower case only here, at
    least for the "x" and "e" (I'd rather not have numeric suffixes at all).
    But I didn't design C, I just use it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Tue Nov 30 23:52:59 2021
    On 30/11/2021 21:44, David Brown wrote:
    On 30/11/2021 21:36, Bart wrote:
    On 30/11/2021 13:58, David Brown wrote:
    On 29/11/2021 23:40, Bart wrote:

    But for commands, file names, program identifiers?  Why would you want
    them to be case insensitive?  I mean, I agree that you don't want
    commands "Copy", "copy" and "COPY" that all do different things.  But
    given a command "copy", why would you ever want to type "COPY" ?  Given
    a variable "noofwhatsits", what is the benefit of letting "noofwhatsits"
    mean the same?

    I've normalised both of your 'noofwhatsits' to have the same
    capitalisation, ie. none.

    Can you remember what the original was?

    Yes. noOfWhatsits.


    No? That's the problem.

    No problem. The capitalisation was part of the identifier, and done intentionally. Perhaps you think that when a language is
    case-sensitive, people pick their capitalisations randomly?

    The Windows API contains 10,000 functions with specific capitalisations.

    In the same way, no one using a case-sensitive language spends their
    time making mixed-up case identifiers just to cause confusion.

    Didn't I give you an example? When I use my tool to convert C headers to
    my syntax, I need to go through and fix all the clashes.

    And having worked with badly written code in case-insensitive languages
    (such as Pascal), I can tell you it is /seriously/ confusing when the
    same identifier is cased in different ways.

    And you don't find other people's C code confusing at all?

    I've just done an interesting experiment on two of my programs: I took
    the source code and made a 100% lower case version and a 100% upper
    case version.

    One program still worked with either version. The other worked after
    tweaking one line to do with char constants.

    I then tried the same experiment with a C version of the same program;
    both versions failed, one on line 1, the other on line 4.

    So, which version was more resilient to changes of case?

    The experiment shows that you can much more easily refactor the case-insensitive language to use consistent capitalisation in a style
    that you prefer, than case-sensitive.

    I just have my preferences and you have yours.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Wed Dec 1 09:15:17 2021
    On 01/12/2021 00:52, Bart wrote:
    On 30/11/2021 21:44, David Brown wrote:
    On 30/11/2021 21:36, Bart wrote:
    On 30/11/2021 13:58, David Brown wrote:
    On 29/11/2021 23:40, Bart wrote:

    But for commands, file names, program identifiers?  Why would you want
    them to be case insensitive?  I mean, I agree that you don't want
    commands "Copy", "copy" and "COPY" that all do different things.  But
    given a command "copy", why would you ever want to type "COPY" ?  Given
    a variable "noofwhatsits", what is the benefit of letting "noofwhatsits"
    mean the same?

    I've normalised both of your 'noofwhatsits' to have the same
    capitalisation, ie. none.

    Can you remember what the original was?

    Yes.  noOfWhatsits.


    No? That's the problem.

    No problem.  The capitalisation was part of the identifier, and done
    intentionally.  Perhaps you think that when a language is
    case-sensitive, people pick their capitalisations randomly?

    The Windows API contains 10,000 functions with specific capitalisations.

    If you say so (I've avoided it). I assume that they have had good
    reasons for the capitalisations they picked (though with code that has developed over such a long time, it can be hard to keep consistency).


    In the same way, no one using a case-sensitive language spends their
    time making mixed-up case identifiers just to cause confusion.

    Didn't I give you an example?

    No. You gave an example of when capitalisation was used appropriately
    and helpfully to add meaning to the code.

    The fact that /you/ seem to find all capitalisation confusing does not
    mean that the code authors wrote it specifically to cause confusion.

    When I use my tool to convert C headers to
    my syntax, I need to go through and fix all the clashes.

    Do you /really/ expect sympathy for that? Honestly?

    Every programming language has its idiosyncrasies, rules, and semantics.
    There are always differences - some big, some small. If you want to
    translate from one language to another, you have to take those into
    account. Just as you cannot copy blindly from your language to C when
    the arithmetic semantics are different, you cannot copy blindly from C
    to your language if your identifiers are more restricted.

    This happens all the time in language wrappers and interface generators,
    as well as in transcompilers. When the swig folks made a tool for
    generating interfaces to C++ code in Python (amongst the many
    combinations they support), they had to find a way to automate
    generation of functions or methods that are distinguished in C++ by
    overloads, as Python does not support these. And if they generate for a case-insensitive language like Pascal or Ada, they must handle case
    translation too.

    /You/ wrote your language, you designed it, you put in its limitations
    and restrictions, you picked its types and semantics. That's fine,
    that's your choice. But you can't blame other languages because they
    did something different! You can't expect anyone to feel sorry for you
    here - even if they happen to prefer case-insensitive languages themselves.


    And having worked with badly written code in case-insensitive languages
    (such as Pascal), I can tell you it is /seriously/ confusing when the
    same identifier is cased in different ways.

    And you don't find other people's C code confusing at all?


    As I said, people can (and do) write bad code in all languages, all
    styles, and regardless of identifier rules. Yes, I have seen lots of
    horrible, confusing or hard to comprehend C code. No, case sensitivity
    was not an issue for confusion, though badly capitalised identifiers can
    make code ugly. Inconsistent spelling causes an order of magnitude more annoyance than poor capitalisation.

    I've just done an interesting experiment on two of my programs: I took
    the source code and a 100% lower case version and 100% upper case.

    One program still worked with either version. The other worked after
    tweaking one line to do with char constants.

    I then tried the same experiment with a C version of the same program;
    both versions failed, one on line 1, the other on line 4.

    So, which version was more resilient to changes of case?

    If you find such a completely meaningless experiment interesting, then
    go ahead - knock yourself out. Everyone else knows that C is a
    case-sensitive language and would not expect changing everything to
    upper or lower case to work any more than changing all vowels to "e".


    The experiment shows that you can much more easily refactor the case-insensitive language to use consistent capitalisation in a style
    that you prefer, than case-sensitive.

    No, it does nothing of the sort. We already know that with a
    case-insensitive language, you can change the capitalisation at will, so
    the "experiment" does not show that. You made no attempt to refactor
    (perhaps you don't know the meaning of that word?) any programs to a
    consistent capitalisation style, so your "experiment" shows nothing there.


    I just have my preferences and you have yours.


    Yes, and that's fine. But drop the delusion that your unusual
    collection of personal opinions is the absolute truth of how programming languages should be.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Dec 1 10:44:12 2021
    On 01/12/2021 08:15, David Brown wrote:
    On 01/12/2021 00:52, Bart wrote:

    I just have my preferences and you have yours.


    Yes, and that's fine. But drop the delusion that your unusual
    collection of personal opinions is the absolute truth of how programming languages should be.


    Here's a summary of what I've been talking about:

                           C (eg)      Me

      Case-sensitive       Yes         No
      0-based              Yes         No (both 1-based and N-based)
      Braces               Yes         No (keyword block delimiters)
      Library Read/Print   Yes         No (read/print *statements*)
      Char-based text i/o  Yes         No (line-oriented i/o)
      Millions of ";"      Yes         No (line-oriented source)


    It's just struck me that all the languages corresponding to the
    left-hand column are generally more rigid and inflexible**.

    The ones having attributes from the right are more forgiving, more
    tolerant, and therefore more user-friendly. That would be a desirable
    attribute of a scripting language.

    Since I develop both a compiled and scripting language which have the
    same syntax, it's natural they should share the same attributes.

    (** Except that when it comes to C, compilers for it tend to be too lax
    in unsafe ways. You've got to get that semicolon just right, but never
    mind that you've missed out a return statement in that non-void function!)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Bart on Wed Dec 1 13:12:30 2021
    On 01/12/2021 11:44, Bart wrote:
    On 01/12/2021 08:15, David Brown wrote:
    On 01/12/2021 00:52, Bart wrote:

    I just have my preferences and you have yours.


    Yes, and that's fine.  But drop the delusion that your unusual
    collection of personal opinions is the absolute truth of how programming
    languages should be.


    Here's a summary of what I've been talking about:

                           C (eg)      Me

      Case-sensitive       Yes         No
      0-based              Yes         No (both 1-based and N-based)
      Braces               Yes         No (keyword block delimiters)
      Library Read/Print   Yes         No (read/print *statements*)
      Char-based text i/o  Yes         No (line-oriented i/o)
      Millions of ";"      Yes         No (line-oriented source)


    It's just struck me that all the languages corresponding to the
    left-hand column are generally more rigid and inflexible**.

    The ones having attributes from the right are more forgiving, more
    tolerant, and therefore more user-friendly. That would be a desirable attribute of a scripting language.


    Words like "flexible" are of questionable value - they mean different
    things to different people. To me, C (and other languages with similar attributes to those you list here - C is an example, nothing more) is
    more flexible. Being case-sensitive is more flexible than
    case-insensitive because it lets you choose capitalisation that conveys
    more information to the user. Having 0-based arrays is more flexible
    than 1-based arrays because it is easier to work with more complex
    structures. (Allowing a choice of starting values, and other indexing
    types, is much more flexible.) Use of statement terminators or
    separators (C has statement terminators, Pascal has statement
    separators) makes the language more flexible because your source code
    layout can match the structure of the code you want to express in the
    way you want to write it, rather than being forced into lines.

    (I don't think the other points on your list affect "flexibility".)


    Then there is the question of "tolerance" and "forgiving", and whether
    that makes a language "user friendly". Here I can accept that your
    language may be more "tolerant" and "forgiving", but that's based
    entirely on your judgement - nothing on the list here is, IMHO, a matter
    of "tolerance".

    But having long experience with more static and rigid languages such as
    C and C++, and also with more tolerant and dynamic languages such as
    Python, I think it would be wrong to say one is more "user-friendly"
    than the other.  It is better to say that they are more suited for
    different tasks.  Python is much more user-friendly for some cases, C
    for other
    cases. A key point here is the dynamic nature of Python - it lets you
    write code at a higher level (user-friendly), but can't do the kind of compile-time error checking and control that is possible in C (making it user-unfriendly).


    Since I develop both a compiled and scripting language which have the
    same syntax, it's natural they should share the same attributes.


    I disagree. They might share some aspects, but you have different uses
    and different needs for a compiled language and a scripting language (otherwise, why have two languages at all?). Gratuitous similarities
    are no better than gratuitous differences.

    (** Except that when it comes to C, compilers for it tend to be too lax
    in unsafe ways. You've got to get that semicolon just right, but never
    mind that you've missed out a return statement in that non-void function!)


    I agree that a lot of C compilers are far too lenient by default. But
    that's easy to solve - don't use the default settings. Adding "-Wall"
    or "/W" or whatever flag you need, is not rocket science.
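
    (A sketch of the specific example raised earlier: on some compilers
    this may compile silently with default settings, while "gcc -Wall"
    flags the fall-through via -Wreturn-type, and adding -Werror makes it
    a hard error.)

        /* missing_return.c */
        int classify(int n)
        {
            if (n > 0)
                return 1;
            /* falls off the end on this path */
        }

        int main(void)
        {
            return classify(0);   /* using the "value" here is undefined */
        }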

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bart@21:1/5 to David Brown on Wed Dec 1 13:55:54 2021
    On 01/12/2021 12:12, David Brown wrote:
    On 01/12/2021 11:44, Bart wrote:
    On 01/12/2021 08:15, David Brown wrote:
    On 01/12/2021 00:52, Bart wrote:

    I just have my preferences and you have yours.


    Yes, and that's fine.  But drop the delusion that your unusual
    collection of personal opinions is the absolute truth of how
    programming languages should be.


    Here's a summary of what I've been talking about:

                           C (eg)      Me

      Case-sensitive       Yes         No
      0-based              Yes         No (both 1-based and N-based)
      Braces               Yes         No (keyword block delimiters)
      Library Read/Print   Yes         No (read/print *statements*)
      Char-based text i/o  Yes         No (line-oriented i/o)
      Millions of ";"      Yes         No (line-oriented source)

    It's just struck me that all the languages corresponding to the
    left-hand column are generally more rigid and inflexible**.

    The ones having attributes from the right are more forgiving, more
    tolerant, and therefore more user-friendly. That would be a desirable
    attribute of a scripting language.


    Words like "flexible" are of questionable value - they mean different
    things to different people. To me, C (and other languages with similar attributes to those you list here - C is an example, nothing more) is
    more flexible. Being case-sensitive is more flexible than
    case-insensitive because it lets you choose capitalisation that conveys
    more information to the user.

    You can use exactly the same capitalisation, say of "AbcDef", in
    case-insensitive syntax. But it's optional, so more flexible.

    Having 0-based arrays is more flexible
    than 1-based arrays because it is easier to work with more complex structures.

    I'd dispute that, but I won't go into it, since I also do N-based which /includes/ 0-based. More flexible.

    (Allowing a choice of starting values, and other indexing
    types, is much more flexible.)

    Yeah...


    Use of statement terminators or
    separators (C has statement terminators, Pascal has statement
    separators) makes the language more flexible because your source code
    layout can match the structure of the code you want to express in the
    way you want to write it, rather than being forced into lines.

    Actually the brace thing is not that much more flexible one way or
    another. But my terminators do provide a choice, for example 'end', 'end
    if', 'endif' or 'fi' can terminate an if-statement. 'end' can terminate
    any block; the others must match the statement.

    My users liked using 'End' and other camelcase:

       Proc RuimOP =
           For I = 1 To NidTot Do
               DrawItem(NiewId[I],0)
               DeleteItem(NiewId[I])
           End
       End

    Their choice.


    Then there is the question of "tolerance" and "forgiving", and whether
    that makes a language "user friendly". Here I can accept that your
    language may be more "tolerant" and "forgiving", but that's based
    entirely on your judgement - nothing on the list here is, IMHO, a matter
    of "tolerance".

    I'm tolerant of semicolons; I use them as separators, but 99% don't need
    to be typed as they coincide with end-of-line.

    And extra semicolons are harmless; in C they can have dramatic consequences.
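    A sketch of the kind of C accident meant here (my example, not Bart's):

        #include <stdio.h>

        int main(void) {
            int n = 3;
            while (n > 0);   /* stray ";" is the loop body: loops forever */
                n--;         /* indented, but NOT part of the loop        */
            printf("%d\n", n);
        }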

    If you're reading 3 numbers from ONE line of input using scanf, then
    that line needs to have exactly 3 numbers, or it'll go wrong. My
    line-based Read can tolerate fewer or more numbers on that line, so it
    is much better for interactive input.

    If I do 'print a,b,c', then change the types of those expressions,
    nothing needs to change in that statement. I can also copy and paste
    that line to print the local a,b,c expressions elsewhere. In C you have
    to rewrite the format string.
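    The contrast, as a minimal sketch (plain scanf/printf, no error
    recovery assumed):

        #include <stdio.h>

        int main(void) {
            int a, b, c;
            /* scanf treats '\n' as ordinary whitespace, so a short line
               silently pulls values from the NEXT line of input. */
            if (scanf("%d %d %d", &a, &b, &c) != 3)
                return 1;
            /* Changing a, b, c to double means rewriting BOTH format
               strings ("%d" -> "%f" / "%lf"). */
            printf("%d %d %d\n", a, b, c);
        }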



    Since I develop both a compiled and scripting language which have the
    same syntax, it's natural they should share the same attributes.


    I disagree. They might share some aspects, but you have different uses
    and different needs for a compiled language and a scripting language (otherwise, why have two languages at all?).

    Why shouldn't they have the same syntax? Or near enough the same (one
    will need more type annotations and so on).


    This is fibonacci in one language:

       function fib(int n)int=
           if n<3 then
               return 1
           else
               return fib(n-1)+fib(n-2)
           fi
       end

    And in the other:

       function fib(n)=
           if n<3 then
               return 1
           else
               return fib(n-1)+fib(n-2)
           fi
       end

    (This demonstrates one reason why I like to have declarations out of the
    way of the main code: the body of the function can be more easily ported
    to the other language; it keeps the code clean.)

    This is the driver code for a test program:

       for i to 36 do
           println i,fib(i)
       od

    which works in either language (except that the static language will go
    beyond 36!).

    (The static language needs it wrapped in a function; it will still
    work as dynamic, but there it is optional.)

  • From Bart@21:1/5 to David Brown on Mon Dec 6 18:58:30 2021
    On 29/11/2021 15:19, David Brown wrote:
    On 29/11/2021 14:06, Bart wrote:

    This is just pure jealousy. Show me the C code needed to do the
    equivalent of this (without knowing the types of a, b, c other than they
    are numeric):

       print "?"
       readln a, b, c
       println a, b, c

    You are under the delusion that there is one "correct" interpretation
    here. You think that /your/ ideas are the only "obvious" or "proper"
    way to handle things. In reality, there are dozens of questions that
    could be asked here, including:

    Does there have to be a delimiter between the inputs? Does it have to
    be comma, or space, or newline? Are these ignored if there are more
    than one? Are numbers treated differently in the input? Would an input
    of "true" be treated as a string or a boolean? Are there limits to the sizes?


    This also comes up in command-line parameters to shell commands.

    There you can also ask all your questions. The difference is that,
    rather than making no itemised parameters available at all (eg. passing
    a single string), it decides on sensible defaults.

    But they are still not as sensible as what I use for Read. In Windows or
    Linux, a command like:

    prog a b c

    returns those three args as 3 string parameters "a", "b", "c". This:

    prog a,b,c

    returns one arg of "a,b,c". Here:

    prog a, b, c

    it results in 3 parameters "a," "b," "c" (notice the trailing commas).

    prog "a b" c

    gives 2 params "a b" and "c", without the quotes. So better than your
    Python example. Here however:

    prog *.c

    Windows gives one parameter "*.c", Linux gives *240* parameters. When I
    try this:

    prog *.c -option

    Windows gives me "*.c" and "-option", but Linux now gives 241
    parameters; information has been lost.

    Anyway, imagine what a nuisance it would have been if C's main() had
    been defined like this:

    int main(char* cmdline) {}

    Just one string that you had to parse yourself.
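    What that would mean in practice - a rough sketch of the parsing every
    program would then need (whitespace-only splitting, no quote handling):

        #include <stdio.h>
        #include <string.h>

        /* Hypothetical: what main would do if handed one raw string. */
        static void parse_cmdline(char *cmdline) {
            for (char *arg = strtok(cmdline, " \t");
                 arg != NULL;
                 arg = strtok(NULL, " \t"))
                printf("arg: %s\n", arg);
        }

        int main(void) {
            char line[] = "prog a \"a b\" c";  /* quotes NOT handled here */
            parse_cmdline(line);
        }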

    When C decides to do something that is convenient, then that is great.
    If I decide to do that, you have 101 reasons why it is a terrible idea.

    NIH syndrome?

  • From David Brown@21:1/5 to Bart on Tue Dec 7 08:14:53 2021
    On 06/12/2021 19:58, Bart wrote:
    On 29/11/2021 15:19, David Brown wrote:
    On 29/11/2021 14:06, Bart wrote:

    This is just pure jealousy. Show me the C code needed to do the
    equivalent of this (without knowing the types of a, b, c other than
    they are numeric):

        print "?"
        readln a, b, c
        println a, b, c

    You are under the delusion that there is one "correct" interpretation
    here.  You think that /your/ ideas are the only "obvious" or "proper"
    way to handle things.  In reality, there are dozens of questions that
    could be asked here, including:

    Does there have to be a delimiter between the inputs?  Does it have to
    be comma, or space, or newline?  Are these ignored if there are more
    than one?  Are numbers treated differently in the input?  Would an input
    of "true" be treated as a string or a boolean?  Are there limits to the
    sizes?


    This also comes up in command-line parameters to shell commands.

    There you can also ask all your questions. The difference is that rather
    than not make itemised parameters available at all (eg. as a single
    string), it decides on sensible defaults.

    But they are still not as sensible as what I use for Read. In Windows or Linux, a command like:

       prog a b c

    returns those three args as 3 string parameters "a", "b", "c". This:

       prog a,b,c

    returns one arg of "a,b,c". Here:

       prog a, b, c

    it results in 3 parameters "a," "b," "c" (notice the trailing commas).

       prog "a b" c

    gives 2 params "a b" and "c", without the quotes. So better than your
    Python example. Here however:

       prog *.c

    Windows gives one parameter "*.c", Linux gives *240* parameters. When I
    try this:

       prog *.c -option

    Windows gives me "*.c" and "-option", but Linux now gives 241
    parameters; information has been lost.


    Information has not been "lost". In the *nix world, there is a strong
    tradition of avoiding duplication of work. The shell knows how to
    expand wildcards, so it does that job - letting "prog" concentrate on
    its own job. It makes it possible to keep things simpler and more
    consistent (while retaining the flexibility to be overly complex and inconsistent - I'm not saying everything in the *nix world is perfect).
    In the Windows world, every program has to re-implement its own wheels
    from scratch, every time.

    If you want to pass "*.c" as an option in *nix, write :

    prog \*.c -option

    or

    prog "*.c" -option

    It's quite simple.

    Anyway, imagine what a nuisance it would have been if C's main() had
    been defined like this:

       int main(char* cmdline) {}

    Just one string that you had to parse yourself.

    When C decides to do something that is convenient, then that is great.
    If I decide to do that, you have 101 reasons why it is a terrible idea.

    NIH syndrome?

    It is not "C" that decides this. It is the OS conventions that is in
    charge of passing information to the start of the program, in
    cooperation with the runtime startup code. It is no coincidence that
    you get the same arguments in argv in C and in sys.argv in Python.

    So the /shell/ is the bit that does the wildcard expansion. It is
    normal for shells on *nix to expand wildcards, and normal for the more
    minimal "command prompt" in DOS and Windows not to expand wildcards.
    But you can have a shell in *nix that does not expand them, and a shell
    in Windows that does.

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.
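    A concrete illustration of that division of labour on *nix (my sketch):
    the argument vector is built by the caller - typically a shell that has
    already expanded any wildcards - and handed to the kernel's execve:

        #include <unistd.h>

        int main(void) {
            char *argv[] = { "/bin/echo", "hello", "world", (char *)0 };
            char *envp[] = { (char *)0 };
            execve("/bin/echo", argv, envp);  /* replaces this process */
            return 1;                         /* reached only on failure */
        }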

    Could this have been done differently? Sure. Could it have been done
    better? Other ways might have had some advantages, and some
    disadvantages - better in some ways, less good in others. Is this part
    of the great conspiracy where everything in the computer world is made
    because of C, is worse because of C, and designed that way with the sole intention of annoying Bart? No.

  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Dec 7 08:54:48 2021
    On 2021-12-07 08:14, David Brown wrote:

    So the /shell/ is the bit that does the wildcard expansion.

    That was quite a problem back then. The damn thing ran out of memory
    expanding *'s on the i386 25 MHz machines we used.

    Could this have been done differently? Sure.

    In a well-designed system you would have a standard system library to
    process the command line in a unified way. UNIX was a mixed bag trying
    and failing to do both. In the end nobody respected any conventions and
    UNIX utilities have totally unpredictable syntax of arguments.

    [ Though I think UNIX missed an opportunity to make it even worse.
    Consider if it not only expanded file lists but also opened the files
    and passed the file descriptors to the process! ]

    Is this part
    of the great conspiracy where everything in the computer world is made because of C, is worse because of C, and designed that way with the sole intention of annoying Bart? No.

    It is much bigger than Bart, the conspiracy, I mean... (:-))

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Tue Dec 7 09:22:48 2021
    On 07/12/2021 07:14, David Brown wrote:
    On 06/12/2021 19:58, Bart wrote:

    NIH syndrome?

    It is not "C" that decides this. It is the OS conventions that is in
    charge of passing information to the start of the program, in
    cooperation with the runtime startup code. It is no coincidence that
    you get the same arguments in argv in C and in sys.argv in Python.

    So the /shell/ is the bit that does the wildcard expansion. It is
    normal for shells on *nix to expand wildcards, and normal for the more minimal "command prompt" in DOS and Windows not to expand wildcards.
    But you can have a shell in *nix that does not expand them, and a shell
    in Windows that does.

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That might be the case on Unix, where the OS and C are so intertwined
    that you don't know where one ends and the other begins.

    On Windows, that doesn't happen automatically. If you see that behaviour
    then it's due to the language runtime.

    That 'int main(int argc, char** argv)' entry-point doesn't magically happen!

    I need to call __getmainargs() in msvcrt.dll to get those arguments
    expected of C programs. Before I knew about __getmainargs(), I used GetCommandLine() from WinAPI to get the commands as one string, that I
    had to parse myself.
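    For reference, the declaration of __getmainargs as it appears in
    MinGW-style headers (approximate, from memory - check your own
    toolchain's headers):

        /* msvcrt.dll helper that C runtimes on Windows use to build
           argc/argv before main() runs. */
        typedef struct { int newmode; } _startupinfo;

        int __getmainargs(int *argc, char ***argv, char ***env,
                          int doWildCard,     /* nonzero: expand *.c etc. */
                          _startupinfo *startinfo);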

    Could this have been done differently? Sure. Could it have been done better? Other ways might have had some advantages, and some
    disadvantages - better in some ways, less good in others. Is this part
    of the great conspiracy where everything in the computer world is made because of C, is worse because of C, and designed that way with the sole intention of annoying Bart? No.

    I use a more sophisticated version of what happens with command-line
    params (without turning the latter into a whole language like some
    shells), and make it available via Readln on every line of console or
    file input, not just the bit following the command invocation.

    You said that is unworkable. C's argc/argv scheme shows that it can be.

  • From David Brown@21:1/5 to Bart on Tue Dec 7 10:57:28 2021
    On 07/12/2021 10:22, Bart wrote:

    That might be the case on Unix, where the OS and C are so intertwined
    that you don't know where one ends and the other begins.

    No, /you/ don't know where one ends and the other begins - because you
    have worked yourself into such an obsessive hatred of both that you
    refuse to learn anything about either. Please don't judge others by
    your own wilful ignorance - there are countless millions who manage to
    program in C and work with *nix (the two being independent in practice).
    There's nothing wrong with preferring other languages and/or other
    OS's, but your struggles with C and your dislike of *nix are a personal
    matter for you.

    Please, find something new and interesting to post about rather than
    your misconceptions, misunderstandings and FUD about languages and
    systems that you don't like. It would be nice to get back to some
    positive discussions.

  • From David Brown@21:1/5 to Dmitry A. Kazakov on Tue Dec 7 10:44:37 2021
    On 07/12/2021 08:54, Dmitry A. Kazakov wrote:
    On 2021-12-07 08:14, David Brown wrote:

    So the /shell/ is the bit that does the wildcard expansion.

    That was quite a problem back then. The damn thing ran out of memory
    expanding *'s on the i386 25 MHz machines we used.

    With big enough sets of files or command lines, /something/ is going to
    run out of memory! Yes, sometimes command lines on *nix get too long,
    and you have to use something like xargs. On the other side, because
    DOS and Windows don't expand wildcards, the systems were made with much
    shorter limits on the length of command lines which can lead to problems
    with long filenames, lots of files (such as for linking), or lots of
    flags. This is mostly a thing of the past, however, on both *nix and
    Windows.


    Could this have been done differently?  Sure.

    In a well-designed system you would have a standard system library to
    process the command line in a unified way. UNIX was a mixed bag trying
    and failing to do both. In the end nobody respected any conventions and
    UNIX utilities have totally unpredictable syntax of arguments.


    It is far from being "totally unpredictable" - there are conventions
    that are followed by most programs. But these are not enforced in any
    way by *nix.

    [ Though I think UNIX missed an opportunity to make it even worse.
    Consider if it not only expanded file lists but also opened the files
    and passed the file descriptors to the process! ]


    That would not make any sense - command line parameters are not
    necessarily files!

    Is this part
    of the great conspiracy where everything in the computer world is made
    because of C, is worse because of C, and designed that way with the sole
    intention of annoying Bart?  No.

    It is much bigger than Bart, the conspiracy, I mean... (:-))


  • From Bart@21:1/5 to David Brown on Tue Dec 7 10:04:34 2021
    On 07/12/2021 09:44, David Brown wrote:
    On 07/12/2021 08:54, Dmitry A. Kazakov wrote:
    On 2021-12-07 08:14, David Brown wrote:

    So the /shell/ is the bit that does the wildcard expansion.

    That was quite a problem back then. The damn thing ran out of memory
    expanding *'s on the i386 25 MHz machines we used.

    With big enough sets of files or command lines, /something/ is going to
    run out of memory!

    But that needn't be the case! Suppose you had a million files in the
    current directory, then input of "*" or "*.*" will try and create
    1000000 strings; something is likely to break.

    On Windows, it will just see one string. If the application actually
    expected to work on multiple files, then with "*", it could iterate over
    the files one by one, without needing to first create a list of them all.
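    On Windows that one-at-a-time iteration is typically done with the
    FindFirstFile family - a minimal sketch:

        #include <windows.h>
        #include <stdio.h>

        int main(void) {
            WIN32_FIND_DATAA fd;
            HANDLE h = FindFirstFileA("*.c", &fd);  /* pattern, not a list */
            if (h == INVALID_HANDLE_VALUE) return 1;
            do {
                printf("%s\n", fd.cFileName);       /* one match at a time */
            } while (FindNextFileA(h, &fd));
            FindClose(h);
        }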


    [ Though I think UNIX missed an opportunity to make it even worse.
    Consider if it not only expanded file lists but also opened the files
    and passed the file descriptors to the process! ]


    That would not make any sense - command line parameters are not
    necessarily files!

    It doesn't make sense anyway: an input like "*" might mean something
    specific to an application, but the shell will turn it into an arbitrary
    list of strings, or into nothing.

    Even if the app works with files, it can see that "* *" are two items
    (perhaps two sets of files for different purposes), but on Linux, it
    will turn it into one giant list, with duplicate files.

  • From Dmitry A. Kazakov@21:1/5 to David Brown on Tue Dec 7 11:14:11 2021
    On 2021-12-07 10:44, David Brown wrote:
    On 07/12/2021 08:54, Dmitry A. Kazakov wrote:
    On 2021-12-07 08:14, David Brown wrote:

    So the /shell/ is the bit that does the wildcard expansion.

    That was quite a problem back then. The damn thing ran out of memory
    expanding *'s on the i386 25 MHz machines we used.

    With big enough sets of files or command lines, /something/ is going to
    run out of memory!

    Normally you would just walk the list of files without expanding it in
    memory.

    [ The secret lore lost to younger generations: there is no need to
    load a whole document into memory in order to read or edit it. (:-)) ]

    In a well-designed system you would have a standard system library to
    process the command line in a unified way. UNIX was a mixed bag trying
    and failing to do both. In the end nobody respected any conventions and
    UNIX utilities have totally unpredictable syntax of arguments.

    It is far from being "totally unpredictable" - there are conventions
    that are followed by most programs.

    Like in the case of dd?

    [ Though I think UNIX missed an opportunity to make it even worse.
    Consider if it not only expanded file lists but also opened the files
    and passed the file descriptors to the process! ]

    That would not make any sense - command line parameters are not
    necessarily files!

    They are, unless introduced by a key symbol, e.g. /a, -a, --a etc.;
    such was the "convention." I never liked it, BTW.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 7 11:56:45 2021
    On 2021-12-07 11:33, Bart wrote:
    On 07/12/2021 09:57, David Brown wrote:

    You said this:

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That appears to be incorrect.

    Nope, it is perfectly correct. If external command line parsing happens,
    then it is ultimately the OS that pushes the results to the program. It
    is a part of the interface between C's run-time and the OS. Other
    languages do more or less the same, e.g. see Ada RM A.15 The package
    Command_Line.

    It might be happy coincidence that on Unix, at the entry point to a
    program in any language, the parameter stack happens to contain suitable values of argc and argv, just like you get with C's main(); what a bit
    of luck!

    But I haven't seen that in Windows (or any previous OSes I've used).

    Can you point me to a link which says that Windows does exactly the
    same, or please say you were mistaken.

    https://docs.microsoft.com/en-us/cpp/c-runtime-library/argc-argv-wargv?view=msvc-170

    https://docs.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-170

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to David Brown on Tue Dec 7 10:33:45 2021
    On 07/12/2021 09:57, David Brown wrote:
    On 07/12/2021 10:22, Bart wrote:

    That might be the case on Unix, where the OS and C are so intertwined
    that you don't know where one ends and the other begins.

    No, /you/ don't know where one ends and the other begins - because you
    have worked yourself into such an obsessive hatred of both

    Where's the hatred above? I'm still stating what I see.

    that you
    refuse to learn anything about either. Please don't judge others by
    your own wilful ignorance - there are countless millions who manage to program in C and work with *nix (the two being independent in practice).
    There's nothing wrong with preferring other languages and/or other
    OS's, but your struggles with C and your dislike of *nix are a personal matter for you.

    Please, find something new and interesting to post about rather than
    your misconceptions, misunderstandings and FUD about languages and
    systems that you don't like. It would be nice to get back to some
    positive discussions.


    You said this:

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That appears to be incorrect.

    It might be happy coincidence that on Unix, at the entry point to a
    program in any language, the parameter stack happens to contain suitable
    values of argc and argv, just like you get with C's main(); what a bit
    of luck!

    But I haven't seen that in Windows (or any previous OSes I've used).

    Can you point me to a link which says that Windows does exactly the
    same, or please say you were mistaken.

    /You/ seem to have an obsessive hatred of anything I say or do.

  • From Bart@21:1/5 to David Brown on Tue Dec 7 11:40:50 2021
    On 07/12/2021 07:14, David Brown wrote:

    So the /shell/ is the bit that does the wildcard expansion. It is
    normal for shells on *nix to expand wildcards, and normal for the more minimal "command prompt" in DOS and Windows not to expand wildcards.
    But you can have a shell in *nix that does not expand them, and a shell
    in Windows that does.

    This is not quite right either.

    If I take this line that I use to obtain the command params on Windows:

       __getmainargs(&nargs, &args, &env, 0, &startupinfo)

    and change that 0 to 1, then I will get expanded wildcards too!

    (And the first message I saw was: "Too many params"! I have a limit of
    128 parameters, which seems more than adequate for normal shell use, but
    there were 160 matching files. Such expansion is just inappropriate
    at this point.)

    Anyway, this bit is clearly not a shell.

    Interesting the things you find out when you implement languages instead
    of merely using them.

  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 7 13:00:11 2021
    On 2021-12-07 12:22, Bart wrote:
    On 07/12/2021 10:56, Dmitry A. Kazakov wrote:
    On 2021-12-07 11:33, Bart wrote:
    On 07/12/2021 09:57, David Brown wrote:

    You said this:

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That appears to be incorrect.

    Nope, it is perfectly correct. If external command line parsing
    happens, then it is ultimately the OS that pushes the results to the
    program.

    On Windows it pushes nothing. Language start-up code needs to do the
    work behind the scenes so that user-programs can use entry-point
    functions like:

        main(argc, argv)
        WinMain(Hinstance, etc)

    Yes and that was the point. This happens *before* main() is called.

    Yes, it is part of the language. It is NOT the OS as David Brown stated.
    It might be on Unix since Unix and C are so chummy.

    It is specified by the language and fulfilled by the OS. Maybe you
    meant something like the context where the parsing happens:

       Linux   - The caller process
       Windows - The callee process
       xxx     - The system kernel, maybe VxWorks would fall into this
                 category, I am not sure

    ?

    This is kind of a pointless distinction, especially because processes in
    Linux and Windows are very different.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Dec 7 11:22:08 2021
    On 07/12/2021 10:56, Dmitry A. Kazakov wrote:
    On 2021-12-07 11:33, Bart wrote:
    On 07/12/2021 09:57, David Brown wrote:

    You said this:

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That appears to be incorrect.

    Nope, it is perfectly correct. If external command line parsing happens,
    then it is ultimately the OS that pushes the results to the program.

    On Windows it pushes nothing. Language start-up code needs to do the
    work behind the scenes so that user-programs can use entry-point
    functions like:

        main(argc, argv)
        WinMain(Hinstance, etc)

    It is
    a part of the interface between C's run-time and the OS.

    Yes, it is part of the language. It is NOT the OS as David Brown stated.
    It might be on Unix since Unix and C are so chummy.


    Other languages
    do more or less the same, e.g. see Ada RM A.15 The package Command_Line.

    It might be happy coincidence that on Unix, at the entry point to a
    program in any language, the parameter stack happens to contain
    suitable values of argc and argv, just like you get with C's main();
    what a bit of luck!

    But I haven't seen that in Windows (or any previous OSes I've used).

    Can you point me to a link which says that Windows does exactly the
    same, or please say you were mistaken.

    https://docs.microsoft.com/en-us/cpp/c-runtime-library/argc-argv-wargv?view=msvc-170


    https://docs.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-170

    Those two links are specific to C implementations. So of course they
    will set up the main's arguments for you! That's what I have to do too
    in my C implementation:


       #include <stdio.h>

       int main (int n, char** a) {
           for (int i=1; i<=n; ++i) {
               printf("%d: %s\n",i,*a);
               ++a;
           }
       }

    This generates the asm code below. The user's 'main' function is renamed '.main'.

    A new 'main' function is generated which calls __getmainargs. (There are
    also __argc and __argv exported by msvcrt.dll, but I don't use those.)

    Once obtained, it calls .main() with those new parameters. The program
    thinks the OS put those on the stack; apparently so does everyone else!

    Below this code is the import list used by a version of this C program
    compiled with tcc. Tcc also has a startup routine (bigger than mine),
    which loads the arguments that become main's argc/argv. But notice it
    imports __getmainargs too.

    This routine is not the OS. The msvcrt.dll library is not the OS.


    !x64 output for showargs.c
              align   16
    !------------------------------------
    `main::
              sub     Dstack, 160
              lea     D0, [Dstack+8]
              push    D0
              sub     Dstack, 32
              lea     D0, [Dstack+196]
              mov     [Dstack], D0
              lea     D0, [Dstack+184]
              mov     [Dstack+8], D0
              lea     D0, [Dstack+176]
              mov     [Dstack+16], D0
              mov     A0, 0
              mov     [Dstack+24], A0
              mov     D10, [Dstack]
              mov     D11, [Dstack+8]
              mov     D12, [Dstack+16]
              mov     D13, [Dstack+24]
              call    __getmainargs*
              add     Dstack, 16
              mov     A0, [Dstack+180]
              mov     [Dstack], A0
              mov     D0, [Dstack+168]
              mov     [Dstack+8], D0
              mov     D10, [Dstack]
              mov     D11, [Dstack+8]
              call    .main
              mov     A10, A0
              call    exit*

    .main::
              push    Dframe
              mov     Dframe, Dstack
              sub     Dstack, 16
              mov     [Dframe+16], D10
              mov     [Dframe+24], D11
    ! -------------------------------------------------
              mov     word32 [Dframe-8], 1
              jmp     L4
    L5:
              sub     Dstack, 32
              mov     D0, [Dframe+24]
              push    word64 [D0]
              mov     D10, KK1
              mov     A11, [Dframe-8]
              pop     D12
              call    `printf*
              add     Dstack, 32
              add     word64 [Dframe+24], 8
    L2:
              inc     word32 [Dframe-8]
    L4:
              mov     A0, [Dframe-8]
              cmp     A0, [Dframe+16]
              jle     L5
    L3:
    L1:
    ! -------------------------------------------------
              sub     Dstack, 32
              mov     D10, 0
              call    exit*

    !String Table
    segment idata
              align   8
    KK1:      db      "%d: %s",10,0


    Tcc imports:

    Name: msvcrt.dll
    Import Addr RVA: 2038
    Import: 20d3 0 printf
    Import: 20dc 0 __set_app_type
    Import: 20ed 0 _controlfp
    Import: 20fa 0 __argc
    Import: 2103 0 __argv
    Import: 210c 0 _environ
    Import: 2117 0 __getmainargs
    Import: 2127 0 exit

  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Dec 7 13:57:01 2021
    On 07/12/2021 12:00, Dmitry A. Kazakov wrote:
    On 2021-12-07 12:22, Bart wrote:
    On 07/12/2021 10:56, Dmitry A. Kazakov wrote:
    On 2021-12-07 11:33, Bart wrote:
    On 07/12/2021 09:57, David Brown wrote:

    You said this:

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That appears to be incorrect.

    Nope, it is perfectly correct. If external command line parsing
    happens, then it is ultimately the OS that pushes the results to the
    program.

    On Windows it pushes nothing. Language start-up code needs to do the
    work behinds the scenes so that user-programs can use entry-point
    functions like:

         main(argc, argv)
         WinMain(Hinstance, etc)

    Yes and that was the point. This happens *before* main() is called.

    But *after* execution commences at the program's official entry point,
    by the language's startup code.

    What it comes down to is that, if you are implementing a language, these
    argc/argv values don't magically appear on the stack, not on Windows.
    The language must arrange for that to happen.


    Yes, it is part of the language. It is NOT the OS as David Brown
    stated. It might be on Unix since Unix and C are so chummy.

    It is specified by the language and fulfilled by the OS.

    The only WinAPI function I know of that gives that info is
    GetCommandLine, which delivers a single string you have to process.

    If you know of a better WinAPI function on Windows, or of some exported
    data from a Windows DLL that provides the same info, or perhaps of some
    data block within the loaded PE image that contains it, then I will use
    that instead.

    Maybe, you
    meant something like the context where the parsing happens:

       Linux   - The caller process
       Windows - The callee process
       xxx     - The system kernel, maybe VxWorks would fall into this
                 category, I am not sure

    ?

    This is kind of pointless distinction, especially because processes in
    Linux and Windows are very different.

    Well, I was responding to this:

    DB:
    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    The distinction was important since it is this very process that is
    commonly done on Unix and/or C, which is equivalent to what I do with
    Read on every line. Apparently it's OK when C (and/or Unix) does it on
    the command line, but not OK when a language does it on any input line.

    Correction: when /my/ language does it.

  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 7 16:25:38 2021
    On 2021-12-07 14:57, Bart wrote:
    On 07/12/2021 12:00, Dmitry A. Kazakov wrote:
    On 2021-12-07 12:22, Bart wrote:
    On 07/12/2021 10:56, Dmitry A. Kazakov wrote:
    On 2021-12-07 11:33, Bart wrote:
    On 07/12/2021 09:57, David Brown wrote:

    You said this:

    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    That appears to be incorrect.

    Nope, it is perfectly correct. If external command line parsing
    happens, then it is ultimately the OS that pushes the results to the
    program.

    On Windows it pushes nothing. Language start-up code needs to do the
    work behinds the scenes so that user-programs can use entry-point
    functions like:

         main(argc, argv)
         WinMain(Hinstance, etc)

    Yes and that was the point. This happens *before* main() is called.

    But *after* execution commences at the program's official entry point,
    by the language's startup code.

    The official entry point of a C console program is main().

    What it comes down to is that, if you are implementing a language, these
    argc/argv values don't magically appear on the stack, not on Windows.
    The language must arrange for that to happen.

    The linker: there is a switch to instruct the MS linker which CRT to
    link into the executable. E.g. one can link a CRT that skips command
    line parsing altogether.

    The only WinAPI function I know of that gives that info is
    GetCommandLine, which delivers a single string you have to process.

    If you know of a better WinAPI function on Windows, or of some exported
    data from a Windows DLL that provides the same info, or perhaps of some
    data block within the loaded PE image that contains it, then I will use
    that instead.

    CommandLineToArgv[A|W]

    Why do you think it should be physically stored in the process address
    space in the first place? That might have been the case for a very
    primitive OS, which UNIX was when it was developed. These days it could
    be anywhere. You know there is GetCommandLineA and GetCommandLineW;
    which one is a fake? Why do you even care?

    Maybe you meant something like the context where the parsing happens:

        Linux   - The caller process
        Windows - The callee process
        xxx     - The system kernel, maybe VxWorks would fall into this
                  category, I am not sure

    ?

    This is kind of pointless distinction, especially because processes in
    Linux and Windows are very different.

    Well, I was responding to this:

    DB:
    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    The distinction was important since it is this very process that is
    commonly done on Unix and/or C, which is equivalent to what I do with
    Read on every line. Apparently it's OK when C (and/or Unix) does it on
    the command line, but not OK when a language does it on any input line.

    I have no idea what this is supposed to mean.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Dec 7 16:54:46 2021
    On 07/12/2021 15:25, Dmitry A. Kazakov wrote:
    On 2021-12-07 14:57, Bart wrote:

    Yes and that was the point. This happens *before* main() is called.

    But *after* execution commences at the program's official entry point,
    by the language's startup code.

    The official entry point of a C console program is main().

    No. You don't quite understand how it works, that's fine.

    But if your C program's entry point is 'main', and the EXE's
    entry-point name is also 'main', then this code must be executed within
    main(), by specially injected code.

    Actually, with gcc, it changes the EXE's entry point to something else,
    probably some injected code if it is not some function in the runtime.
    It then eventually calls the user-code main(), with argc/argv as
    parameters.

    In any case, fetching the command params is done after the application
    has started running.


    What it comes down to is that, if you are implementing a language, these
    argc/argv values don't magically appear on the stack, not on Windows.
    The language must arrange for that to happen.

    The linker, there is a switch to instruct the MS linker which CRT to
    link to the executable. E.g. one can link one that skips command line
    parsing altogether.

    I don't use a linker...

    But if what you say is correct, then this is still code that is within
    the application.

    The only WinAPI function I know of that gives that info is
    GetCommandLine, which delivers a single string you have to process.

    If you know of a better WinAPI function on Windows, or of some
    exported data from a Windows DLL that provides the same info, or
    perhaps of some data block within the loaded PE image that contains
    it, then I will use that instead.

    CommandLineToArgv[A|W]

    This would be the next step /after/ calling GetCommandLine.

    I might use this, but instead I will probably apply my own parsing since
    the C-style processing is not quite up to scratch. For example it would
    be nice to do:

    readln @cmdline, a, b, c

    when a, b, c are numbers, without all the usual palaver of checking argc
    and applying atoi and the rest, or having to employ some library to do
    that simple task.
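    For reference, the two-step sequence under discussion, sketched out
    (note that only the wide variant, CommandLineToArgvW in shell32,
    actually exists):

        #include <windows.h>
        #include <shellapi.h>   /* CommandLineToArgvW; link with shell32 */
        #include <stdio.h>

        int main(void) {
            int argc;
            LPWSTR *argv = CommandLineToArgvW(GetCommandLineW(), &argc);
            if (argv == NULL) return 1;
            for (int i = 0; i < argc; i++)
                wprintf(L"%d: %ls\n", i, argv[i]);
            LocalFree(argv);    /* the returned block must be freed */
        }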


    Why do you think it should be physically stored in the process address
    space in the first place? That might have been the case for a very
    primitive OS, which UNIX was when it was developed. These days it could
    be anywhere. You know there is GetCommandLineA and GetCommandLineW;
    which one is a fake? Why do you even care?

    Maybe you meant something like the context where the parsing happens:

        Linux   - The caller process
        Windows - The callee process
        xxx     - The system kernel, maybe VxWorks would fall into this
                  category, I am not sure

    ?

    This is kind of pointless distinction, especially because processes
    in Linux and Windows are very different.

    Well, I was responding to this:

    DB:
    It is the /OS/ and the link-loader that splits command lines into
    parts and passes them to the start of the program, regardless of the
    language of the program.

    The distinction was important since it is this very process that is
    commonly done on Unix and/or C, which is equivalent to what I do with
    Read on every line. Apparently it's OK when C (and/or Unix) does it on
    the command line, but not OK when a language does it on any input line.

    I have no idea what this is supposed to mean.

    This subthread is about the similarity between:

    - Command line parsing (chopping one input line into separate args)
    - My language's Readln which reads separate items from one input line

    DB said the latter can't possibly work because of too many unknowns. Yet
    it hasn't stopped shells using command line parameters.

  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 7 20:31:38 2021
    On 2021-12-07 17:54, Bart wrote:

    In any case, fetching the command params is done after the application
    has started running.

    No, the process /= application. There are a lot of things happening
    between creation of a process and the application running in the context
    of that process (or processes).

    What it comes down is that, if you are implementing a language, these
    argc/argv values don't magically appear on the stack, not on Windows.
    The language must arrange for that to happen.

    The linker, there is a switch to instruct the MS linker which CRT to
    link to the executable. E.g. one can link one that skips command line
    parsing altogether.

    I don't use a linker...

    But if what you say is correct, then this is still code that is within
    the application.

    No, it is within the C run-time, and furthermore nothing prevents an
    implementation of the run-time from calling the system kernel and/or
    other processes and services.

    The only WinAPI function I know of that gives that info is
    GetCommandLine, which delivers a single string you have to process.

    If you know of a better WinAPI function on Windows, or of some
    exported data from a Windows DLL that provides the same info, or
    perhaps of some data block within the loaded PE image that contains
    it, then I will use that instead.

    CommandLineToArgv[A|W]

    This would be the next step /after/ calling GetCommandLine.

    Calling GetCommandLine is in no way obligatory, and it tells nothing
    about the implementation. When a Windows process is created, the caller
    can specify a command line parameter either as an ASCII or as a UTF-16
    encoded string. What happens with that parameter - e.g. whether it is
    marshalled to the process address space, converted, or whatever else -
    is up to Windows. You are making groundless assumptions about the
    implementation of the Windows API. You should not, as it is subject to
    change any time MS finds that appropriate.

    This subthread is about the similarity between:

      - Command line parsing (chopping one input line into separate args)
      - My language's Readln which reads separate items from one input line

    DB said the latter can't possibly work because of too many unknowns. Yet
    it hasn't stopped shells using command line parameters.

    He is right.

    1. If the OS does not impose a specific way of treating parameters,
    there is no safe way to process arguments.

    2. If the OS, as Linux does, requires parameters pre-parsed outside the
    process, there are again limits to what could be done. E.g. in Linux
    there is no reliable way to get the original command line; it simply
    might not exist.

    In both cases there is absolutely no guarantee of any correspondence
    between the "perceived" command line and what the process gets.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Bart@21:1/5 to Dmitry A. Kazakov on Tue Dec 7 20:49:47 2021
    On 07/12/2021 19:31, Dmitry A. Kazakov wrote:
    On 2021-12-07 17:54, Bart wrote:

    In any case, fetching the command params is done after the application
    has started running.

    No, the process /= application. There are a lot of things happening
    between creation of a process and the application running in the context
    of that process (or processes).

    OK, it has to load the PE and do a bunch of fixups, but eventually it
    will pass control to the entry point.

    Then, where are the command line parameters to be found?

    I can tell you they are not on the stack, which is where they must be if
    C's main(argc, argv) is to work properly.

    They have to be put there. Windows will not do that. The application's language's startup code must do it.

    That's the bit /I/ write.


    DB said the latter can't possibly work because of too many unknowns.
    Yet it hasn't stopped shells using command line parameters.

    He is right.

    1. If the OS does not impose a specific way of treating parameters,
    there is no safe way to process arguments.

    2. If the OS, as Linux does, requires parameters pre-parsed outside the process, there are again limits to what could be done. E.g. in Linux
    there is no reliable way to get the original command line it simply
    might not exist.

    In both cases there is absolutely no guarantee of any correspondence
    between the "perceived" command line and what the process gets.

    The point is that C, somehow, ended up with a scheme where that one line
    of commands WAS processed into convenient chunks for the application
    to work with.

    Despite there supposedly being 'too many variables' for it to work: too
    many possible ways that different programmers might want that command
    line parsed.

    So, why can't a language also specify a set of defaults for proper
    line-reading routines:

    readln a, b, c

    But I can see that I'm banging my head against a brick wall:

     * No one here is ever going to admit that Bart's Readln statements
       might actually be a good idea, despite C command-line processing
       doing pretty much the same thing.

     * And apparently no one is going to admit that that command-line
       processing is not actually done automatically by Windows; it is up
       to the startup code of a language implementation to get it sorted.

  • From Dmitry A. Kazakov@21:1/5 to Bart on Tue Dec 7 22:36:27 2021
    On 2021-12-07 21:49, Bart wrote:
    On 07/12/2021 19:31, Dmitry A. Kazakov wrote:
    On 2021-12-07 17:54, Bart wrote:

    In any case, fetching the command params is done after the
    application has started running.

    No, the process /= application. There are a lot of things happening
    between creation of a process and the application running in the
    context of that process (or processes).

    OK, it has to load the PE and do a bunch of fixups, but eventually it
    will pass control to the entry point.

    Then, where are the command line parameters to be found?

    I can tell you they are not on the stack, which is where they must be
    if C's main(argc, argv) is to work properly.

    If the calling conventions are to use the stack, then both argc and
    argv (a pointer) are on the stack; if not, they are in registers. Why
    should I care?

    You are trying to make a point about some imaginary implementation. Even
    if your musings were true, that would not prove or disprove anything.
    The APIs are as they are. The OS can send the command line to the other
    end of the universe and back using quantum entanglement. So?

    They have to be put there. Windows will not do that. The
    application's language's startup code must do it.

    That's the bit /I/ write.

    There was never any objection to that. Prior to calling main() a lot of
    things happen, and?

    The point is that C, somehow, ended up with a scheme where that one
    line of commands WAS processed into convenient chunks for the
    application to work with.

    Nope. You can call a program without having any commands at all:

    1. Windows:

    https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessa

    2. Linux:

    https://www.man7.org/linux/man-pages/man3/posix_spawn.3.html

    Again, these are interfaces to bring C's argc and argv into a desired
    state, nothing more. There is no need to parse anything; just pass what
    you want and be done with that.
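    For instance, on the POSIX side the caller hands over whatever argv it
    likes, parsed or not (a sketch):

        #include <spawn.h>
        #include <sys/wait.h>

        extern char **environ;

        int main(void) {
            /* argv is whatever the caller chooses; no shell, no parsing */
            char *argv[] = { "echo", "one unsplit, arbitrary string", 0 };
            pid_t pid;
            if (posix_spawn(&pid, "/bin/echo", 0, 0, argv, environ) != 0)
                return 1;
            return waitpid(pid, 0, 0) < 0;
        }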

    But I can see that I'm banging my head against a brick wall:

    A quite unhealthy custom...

    * No one here is ever going to admit that Bart's Readln statements
    might actually be a good idea, despite C command-line processing
    doing pretty much the same thing.

    It is not. The real problems with this stuff are the lack of typing and
    its low-level nature. Consider passing another process as a parameter,
    accompanied by access rights, an end of a stream, an event etc. These
    produce incredibly ugly code both under Windows and Linux.

    * And apparently no one is going to admit that that command-line
    processing is not actually done automatically by Windows; it is up to
    the startup code of a language implementation to get it sorted

    It is done automatically by the CRT.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From David Brown@21:1/5 to Bart on Wed Dec 8 09:37:11 2021
    On 07/12/2021 21:49, Bart wrote:
    So, why can't a language also specify a set of defaults for proper line-reading routines:

       readln a, b, c

    But I can see that I'm banging my head against a brick wall:

     * No one here is ever going to admit that Bart's Readln statements
       might actually be a good idea, despite C command-line processing
       doing pretty much the same thing.


    You are not the only one who feels like he is banging his head against a
    wall. You misinterpret everything through your paranoia.

    Your "readln" could be a perfectly good solution - for /your/ language,
    and /your/ needs. Something similar could be useful in wider contexts.

    Where you are wrong, however, is in your belief that it is somehow
    better than any other solution, or that it covers other people's needs,
    or that it is somehow "fundamental" and something that should be part of
    any language.

    It doesn't cover typing, formatting, separators, syntax matching,
    errors, or any of a dozen other possible requirements. If you don't
    need any of that - you've just got a little script or a dedicated
    program run in a specific way - then that's fine. People who need
    something more, can't use it.

    And no, just because you don't like C's input facilities does not mean
    your methods are naturally superior.

    And just because someone says your readln is too limited and simplistic,
    does not mean they think C's alternatives are good or complete.

    A low-level language needs basic, raw input facilities that you can use
    to build the high-level input concepts that you need for your use. It
    does not need high-level input facilities - those should be in libraries
    so the user can choose what they need at the time (either from standard libraries of common solutions, or roll their own specialist one).

    A simple, limited high-level language can get away with saying "this is
    what you get - take it or leave it". Much of the programming world left
    such philosophies behind decades ago, but if you want to hang onto it
    with your own language, that's up to you - that's an advantage of having
    your own language.

  • From Bart@21:1/5 to David Brown on Wed Dec 8 11:22:35 2021
    On 08/12/2021 08:37, David Brown wrote:
    On 07/12/2021 21:49, Bart wrote:
    So, why can't a language also specify a set of defaults for proper
    line-reading routines:

       readln a, b, c

    But I can see that I'm banging my head against a brick wall:

     * No one here is ever going to admit that Bart's Readln statements
       might actually be a good idea, despite C command-line processing
       doing pretty much the same thing.


    You are not the only one who feels like he is banging his head against
    a wall. You misinterpret everything through your paranoia.

    Your "readln" could be a perfectly good solution - for /your/ language,
    and /your/ needs. Something similar could be useful in wider contexts.

    Where you are wrong, however, is in your belief that it is somehow
    better than any other solution, or that it covers other peoples' needs,
    or that it is somehow "fundamental" and something that should be part of
    any language.

    It doesn't cover typing, formatting, separators, syntax matching,
    errors, or any of a dozen other possible requirements. If you don't
    need any of that - you've just got a little script or a dedicated
    program run in a specific way - then that's fine. People who need
    something more, can't use it.

    And no, just because you don't like C's input facilities does not mean
    your methods are naturally superior.

    And just because someone says your readln is too limited and simplistic,
    does not mean they think C's alternatives are good or complete.

    A low-level language needs basic, raw input facilities that you can use
    to build the high-level input concepts that you need for your use. It
    does not need high-level input facilities - those should be in libraries
    so the user can choose what they need at the time (either from standard libraries of common solutions, or roll their own specialist one).

    A simple, limited high-level language can get away with saying "this is
    what you get - take it or leave it". Much of the programming world left
    such philosophies behind decades ago, but if you want to hang onto it
    with your own language, that's up to you - that's an advantage of having
    your own language.

    Having easy-to-use Read/Print statements doesn't mean more advanced or
    more customised ways of doing i/o are off the table. (For a start, I can
    call scanf/printf etc via the FFI of my language, if absolutely
    necessary, but it rarely is.)

    One important difference with mine is that they are line-oriented. C
    input especially is character-oriented, so it will see \n as white
    space, and gives rise to all sorts of synchronisation issues when the
    input (interactive from keyboard, or text files) /is/ strongly
    line-oriented.

    This is an idea I had yesterday which I've now implemented:

       proc start =
           int a,b,c

           readln @cmdline, a, b, c

           println "Args:", a, b, c
           println "Total:", a + b + c
       end

    If I invoke this program like this:

    prog 10 20 30

    it will read those as numbers and print their sum. (If they're not
    numbers or just missing, it reads zeros.)

    But look, I can do this too:

    prog 10,20,30
    prog 10, 20, 30

    it still works! (Because it's using the same code as my Readln uses
    elsewhere.)

    Here's the C equivalent:

       #include <stdio.h>      // for printf
       #include <stdlib.h>     // for atoi

       int main(int n, char** argv) {
           int a = atoi(argv[1]);   // NB: no check of n; deliberate, see below
           int b = atoi(argv[2]);
           int c = atoi(argv[3]);

           printf("Args: %d %d %d\n", a, b, c);
           printf("Total: %d\n", a + b + c);
       }

    If works with "10 20 30", and with "10, 20, 30". But with "10,20,30" or
    "10" or no input, it crashes.

    BTW for input of "10 20 30", n has the value 4, obviously!


    One more trick: I decide to make a, b, c floats. In my code, I just
    change "int" to "real", and it just works.

    In the C, I need to change "int" to "double", all "atoi" to "strtod",
    and all "%d" to "%f".

    Is that it? Not quite: strtod needs a NULL second argument. Or maybe
    'atof' could have been used?
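    Spelled out, the changes just listed give (a sketch, with the same
    deliberate lack of argc checking as before):

        #include <stdio.h>
        #include <stdlib.h>

        int main(int n, char** argv) {
            double a = strtod(argv[1], NULL);   // atoi -> strtod(_, NULL)
            double b = strtod(argv[2], NULL);
            double c = strtod(argv[3], NULL);

            printf("Args: %f %f %f\n", a, b, c);  // %d -> %f
            printf("Total: %f\n", a + b + c);
        }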

    Yeah, you can clearly see how C is superior here; you need that precise control!

  • From James Harris@21:1/5 to Dmitry A. Kazakov on Mon Dec 13 17:42:50 2021
    On 07/12/2021 10:14, Dmitry A. Kazakov wrote:
    On 2021-12-07 10:44, David Brown wrote:
    On 07/12/2021 08:54, Dmitry A. Kazakov wrote:

    ...

    [ Though I think UNIX missed an opportunity to make it even worse.
    Consider if it not only expanded file lists but also opened the files
    and passed the file descriptors to the process! ]

    That would not make any sense - command line parameters are not
    necessarily files!

    They are, unless introduced by a symbol of a key, e.g. /a, -a, --a etc,
    so was the "convention." I never liked it, BTW.


    What convention would you prefer? I have tried to come up with something
    better but without success.

    BTW, words on a command line don't have to be file names.


    --
    James Harris

  • From Dmitry A. Kazakov@21:1/5 to James Harris on Mon Dec 13 20:45:51 2021
    On 2021-12-13 18:42, James Harris wrote:
    On 07/12/2021 10:14, Dmitry A. Kazakov wrote:
    On 2021-12-07 10:44, David Brown wrote:
    On 07/12/2021 08:54, Dmitry A. Kazakov wrote:

    ...

    [ Though I think UNIX missed an opportunity to make it even worse.
    Consider if it not only expanded file lists but also opened the files
    and passed the file descriptors to the process! ]

    That would not make any sense - command line parameters are not
    necessarily files!

    They are, unless introduced by a symbol of a key, e.g. /a, -a, --a
    etc, so was the "convention." I never liked it, BTW.


    What convention would you prefer? I have tried to come up with something better but without success.

    Same here. I prefer something that resembles a sentence, but it is
    difficult to remember too.

    BTW, words on a command line don't have to be file names.

    Yes, which is why expanding filename wildcards was a bad idea from the
    start.

    I think that there is no solution. A command line language is a
    problem-domain language. All problem-domain languages are bad no matter
    how you design them. It is a law of nature. So any command line language
    is necessarily doomed.

    An alternative has existed since the early days, when the OS came from
    the same vendor. Many things were configurable via a unified UI. That
    reduced the need for command line languages. File managers helped to get
    rid of command line file operations. IDEs helped with compiler switches.
    Around the mid-90s all UIs switched to OO. You clicked the mouse on the
    object and got the list of "virtual" functions applicable to the object.
    [This does not work well with many objects; that pesky multiple dispatch
    is in the way.]

    Actually both Windows and Linux go down this path, deploying registry,
    XML, SQLite DBs, UIs, managers etc. to keep and modify parameters
    instead of using commands. The success is somewhat questionable.

    P.S. The younger generation seems to be unaware of command line
    interfaces. It was fun to watch the Linus Tech Tips (LTT) Linux
    challenge series on YouTube. Two guys in their 30s, capable of
    installing NAS servers and tinkering with hardware, could not configure
    and use a Linux box, fighting through macabre Linux GUIs when a command
    line would have done the work in 5 minutes.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de
