• Underscoring numbers in Forth

    From dxforth@21:1/5 to All on Sat Jul 23 16:24:07 2022
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    Usual suspects pre-answered.

    Q. Why the underscore character?
    A. It's not one of the characters Forth Inc uses to denote a double number.
    It's increasingly used in programming languages for this purpose. Even
    XPL0 has it.

    Q. ANS didn't see the need for it.
    A. Are you married?

    Q. Should >NUMBER process the underscore?
    A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

    Q. Then you'll need a routine to strip the underscores and a temporary buffer
    to hold the result. What do you suggest?
    A. The HOLD buffer.

    Q. Won't it interfere with numeric output?
    A. Input/output are usually mutually exclusive.

    Q. Won't the HOLD buffer need to be larger to hold the punctuation?
    A. Assuming worst case and one underscore per 4 characters, 20% larger.

    Q. Is all this just c.l.f. speculation - or have you implemented it?
    A. Implemented

    Q. Has it broken anything?
    A. Not AFAIK

    Q. What did it cost?
    A. 34 bytes on 8086, 39 bytes on 8080

    Q. Can't it be done using recognizers?
    A. If so, probably at more cost.

    Q. Will you keep it?
    A. Good question. For 16-bit integers its value may be marginal. How often
    do you enter values in binary?
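
    A minimal sketch of the strip-then-convert idea from the Q&A above, not the 34-byte implementation the post refers to. The names STRIP_ and NUMBER_ are made up for illustration, and PAD stands in for the HOLD buffer, since how to address the HOLD buffer directly is system-specific.

```forth
\ Sketch only: STRIP_ and NUMBER_ are hypothetical names; PAD is a stand-in
\ for the HOLD buffer mentioned in the post.
: strip_ ( c-addr1 u1 -- c-addr2 u2 )
  \ copy the string to PAD, dropping every underscore
  pad swap                        ( src dst u1 )
  0 ?do                           ( src dst )
    over c@ dup [char] _ <> if    ( src dst c )
      over c! char+               ( src dst' )
    else
      drop                        ( src dst )
    then
    swap char+ swap               ( src' dst )
  loop
  nip pad tuck - ;                ( pad u2 )

: number_ ( c-addr u -- ud flag )
  \ flag is true when the whole stripped string was consumed by >NUMBER
  strip_ 0 0 2swap >number nip 0= ;
```

    With BASE set to decimal, `s" 1_000_000" number_ drop d.` prints 1000000. >NUMBER itself is left untouched, which is the point of doing the stripping in a separate step.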

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to dxforth on Sat Jul 23 16:30:58 2022
    On 23/07/2022 16:24, dxforth wrote:

    Q. Won't the HOLD buffer need to be larger to hold the punctuation?
    A. Assuming worst case and one underscore per 4 characters, 20% larger.

    Q. Hang on - doesn't the buffer hold the _converted_ string?
    A. Correct. The HOLD buffer doesn't need to be larger.

  • From Marcel Hendrix@21:1/5 to dxforth on Sat Jul 23 01:02:26 2022
    On Saturday, July 23, 2022 at 8:24:09 AM UTC+2, dxforth wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    Usual suspects pre-answered.

    Q. Why the underscore character?
    A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even
    XPL0 has it.

    Q. Should >NUMBER process the underscore?
    A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

    [..]
    Q. Has it broken anything?
    A. Not AFAIK
    [..]

    What exactly is your idea?

    "... certain punctuation e.g. underscore ... "

    I guess you are talking about integer single precision, i.e. you want
    _1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
    in the current BASE? This_is_dead_beef ?

    When >NUMBER doesn't handle it, how does it get recognized as an
    integer by the rest of the system? Why not have the application filter
    it when it wants to support this?

    -marcel

  • From albert@21:1/5 to mhx@iae.nl on Sat Jul 23 12:16:46 2022
    In article <7ac8f1a1-c173-4dff-930f-2e29aa5990ccn@googlegroups.com>,
    Marcel Hendrix <mhx@iae.nl> wrote:
    On Saturday, July 23, 2022 at 8:24:09 AM UTC+2, dxforth wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    Usual suspects pre-answered.

    Q. Why the underscore character?
    A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even
    XPL0 has it.

    Q. Should >NUMBER process the underscore?
    A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.
    [..]
    Q. Has it broken anything?
    A. Not AFAIK
    [..]

    What exactly is your idea?

    "... certain punctuation e.g. underscore ... "

    I guess you are talking about integer single precision, i.e. you want
    _1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
    in the current BASE? This_is_dead_beef ?

    When >NUMBER doesn't handle it, how does it get recognized as an
    integer by the rest of the system? Why not have the application filter
    it when it wants to support this?

    >NUMBER is carefully designed to be interruptible.
    It could handle extra characters, e.g. a traditional use of
    finding the place of the decimal point (for fixed-point numbers).

    0. "1111.1111" >NUMBER OVER C@ &. = IF OVER DPL ! /STRING THEN >NUMBER
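
    The one-liner above uses ciforth notation. A rough standard-Forth rendering of the same interrupt-and-resume idea might look like the sketch below; FRAC-DIGITS and FIXED> are names invented here, and the variable records a digit count rather than whatever ciforth keeps in DPL.

```forth
\ Sketch only: FRAC-DIGITS and FIXED> are hypothetical names.
variable frac-digits   \ characters after the decimal point, or -1 if none

: fixed> ( c-addr u -- ud )
  -1 frac-digits !
  0 0 2swap >number               \ convert up to the first non-digit
  dup if
    over c@ [char] . = if
      1 /string                   \ step over the decimal point
      dup frac-digits !           \ remaining length = digit count if only digits follow
      >number                     \ resume, accumulating into the same ud
    then
  then
  2drop ;
```

    With BASE decimal, `s" 1111.1111" fixed> d. frac-digits @ .` prints 11111111 and then 4.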

    Handling _ without changing >NUMBER, but yet using it, is left as an
    exercise for the reader.

    I admit that >NUMBER is a reasonable factor, but I don't care a bit
    about the suggestion to use it in a Forth kernel (politically correct Forth).
    So it is not used in ciforth, and could be relegated to a loadable extension.


    -marcel

    &. is a notation that replace '.' in
    A decimal point in the middle of a word is non-standard.

    Groetjes Albert
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Anton Ertl@21:1/5 to dxforth on Sat Jul 23 11:50:41 2022
    dxforth <dxforth@gmail.com> writes:
    32/64-bit machines have increased the risk of entering numbers incorrectly.

    And reading the entered numbers. Who can tell quickly what order of
    magnitude 1000000000000 has?

    It's also about outputted numbers. Yes, I can write an output routine
    that outputs 8_888_888_888_888 for readability, but if I cannot cut
    that number and paste it back in (which I occasionally have to do), I
    shy away from that.
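
    Such an output routine could be sketched with pictured numeric output as below. The names (UD._) and UD._ are invented for this example, groups are always three digits whatever BASE is, and in small bases the extra underscores may not fit the minimum hold buffer a system provides.

```forth
\ Sketch only: (UD._) and UD._ are hypothetical names.
: (ud._) ( ud -- c-addr u )
  \ format an unsigned double with an underscore between 3-digit groups
  <#
  begin
    3 0 do  #  2dup d0= if leave then  loop   \ emit up to three digits
    2dup d0= 0=                               \ anything left to convert?
  while
    [char] _ hold                             \ separator before the next group
  repeat
  #> ;

: ud._ ( ud -- )  (ud._) type space ;
```

    With BASE decimal, `8888888888888. ud._` prints 8_888_888_888_888.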

    Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers?

    It is already allowed to do that.

    Usual suspects pre-answered.

    Q. Why the underscore character?
    A. It's not one of the characters Forth Inc uses to denote a double number.
    It's increasingly used in programming languages for this purpose. Even
    XPL0 has it.

    Very sensible. Who are you and what have you done to dxforth:-)

    Q. Should >NUMBER process the underscore?
    A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

    I don't see strong reasons either way.

    Q. Then you'll need a routine to strip the underscores and a temporary buffer
    to hold the result. What do you suggest?
    A. The HOLD buffer.

    No such buffer is needed. That's the beauty of >NUMBER, which has
    been designed for a very similar use case:

    : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
        \ like >number, but ignores _
        begin
            >number
            dup 0> while
                over c@ '_' = while
                    1 /string
        repeat then ;
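
    A quick way to try the definition above against the forms Marcel listed. The definition is repeated so the snippet loads on its own; TRY is a throwaway helper invented here, and the '_' literal needs a Forth-2012 system.

```forth
: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  \ copied from the post: like >number, but ignores _
  begin
    >number
    dup 0> while
    over c@ '_' = while
    1 /string
  repeat then ;

\ TRY is a hypothetical helper: print the converted value, then the number
\ of characters that were left unconverted (0 means the whole string was used).
: try ( c-addr u -- )
  0 0 2swap >number_ nip >r d. r> . ;

s" 1_000" try            \ value 1000, 0 characters left
s" _1__0_0_0____" try    \ value 1000, 0 characters left
s" 12x_4" try            \ value 12, 3 characters left (stops at x)
```

    So with this definition every placement of underscores is accepted, including leading, trailing and repeated ones; restricting that would take extra checks.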

    Would the buffer option be smaller?

    : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
        \ not tested or debugged
        holdbuf >r
        begin
            over c@ dup digit? if
                drop r> c!+ r>
            else
                '_' <> if
                    holdbuf r> over - 2swap 2>r >number 2drop 2r> exit then
        again ;

    Do you manage any better?

    Q. Won't it interfere with numeric output?
    A. Input/output are usually mutually exclusive.

    Says who?

    Q. Can't it be done using recognizers?
    A. If so, probably at more cost.

    What makes you think so?

    Q. Will you keep it?
    A. Good question. For 16-bit integers its value may be marginal. How often
    do you enter values in binary?

    I have been thinking about adding this feature for a while. I expect
    that I will do so at some point in the future.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From dxforth@21:1/5 to Marcel Hendrix on Sun Jul 24 00:10:27 2022
    On 23/07/2022 18:02, Marcel Hendrix wrote:

    What exactly is your idea?

    "... certain punctuation e.g. underscore ... "

    I guess you are talking about integer single precision, i.e. you want
    _1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
    in the current BASE? This_is_dead_beef ?

    Any character string representing a number sent to the Forth interpreter.
    The idea is to strip the underscores just before Forth tries to convert
    the string to a number. The catch is that the string mustn't be found in
    the dictionary, which is classically searched first. This effectively means
    you can't use an underscore in a word name without risking that an
    underscored number is found as a word.

  • From Zbig@21:1/5 to All on Sat Jul 23 07:28:33 2022
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    No such risk in case of underscore; to enter underscore character one has
    to press Shift-Minus — it can be done only on purpose.
    I believe one character that could be ignored the way you propose is space. When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input before final Enter-press.

  • From minforth@arcor.de@21:1/5 to Zbig on Sat Jul 23 14:45:41 2022
    Zbig schrieb am Samstag, 23. Juli 2022 um 16:28:34 UTC+2:
    32/64-bit machines have increased the risk of entering numbers incorrectly.
    Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?
    No such risk in case of underscore; to enter underscore character one has
    to press Shift-Minus — it can be done only on purpose.
    I believe one character that could be ignored the way you propose is space. When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input before final Enter-press.

    You are timidly entering the gritty realm of locales ... ;-)

  • From Zbig@21:1/5 to All on Sat Jul 23 15:37:41 2022
    32/64-bit machines have increased the risk of entering numbers incorrectly.
    Should the Forth interpreter be allowed to ignore certain punctuation e.g.
    underscore in numbers? What would be the issues?
    No such risk in case of underscore; to enter underscore character one has to press Shift-Minus — it can be done only on purpose.
    I believe one character that could be ignored the way you propose is space.
    When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input
    before final Enter-press.
    You are timidly entering the gritty realm of locales ... ;-)

    Not quite. I'm of course aware, that some countries use comma and dot for said "thousand separators", but both comma and dot characters are usually interpreted
    as "double" mark in Forth. So only the space can be used as "separator".

  • From dxforth@21:1/5 to Anton Ertl on Sun Jul 24 13:22:03 2022
    On 23/07/2022 21:50, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    ...
    Q. Then you'll need a routine to strip the underscores and a temporary buffer
    to hold the result. What do you suggest?
    A. The HOLD buffer.

    No such buffer is needed. That's the beauty of >NUMBER, which has
    been designed for a very similar use case:

    : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
        \ like >number, but ignores _
        begin
            >number
            dup 0> while
                over c@ '_' = while
                    1 /string
        repeat then ;

    The idea was to avoid separate number converters.

    Q. Won't it interfere with numeric output?
    A. Input/output are usually mutually exclusive.

    Says who?

    Humans - who use the same mouth to eat and speak.


    Q. Can't it be done using recognizers?
    A. If so, probably at more cost.

    What makes you think so?

    The 30 odd bytes I spent would be hard to beat.

    My implementation is sound enough. It's the potential for underscored
    numbers to collide with dictionary entries that's the problem. 200x
    character literals have the same issue but there the risk is manageable
    since it involves strings of 3 characters only one of which is variable.
    That's what comes of trying to import foreign ideas into Forth.

  • From dxforth@21:1/5 to dxforth on Sun Jul 24 14:08:36 2022
    On 24/07/2022 13:22, dxforth wrote:

    It's the potential for underscored
    numbers to collide with dictionary entries that's the problem.

    Collisions might be reduced sufficiently by requiring underscored
    numbers begin with an underscore. Not fool-proof but then neither
    were 200x character literals.

  • From Ron AARON@21:1/5 to dxforth on Sun Jul 24 07:26:43 2022
    On 23/07/2022 9:24, dxforth wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    I implemented underscores-in-numbers a while back in 8th, at no
    perceivable cost. Makes large numbers much easier to understand.

  • From dxforth@21:1/5 to Ron AARON on Sun Jul 24 14:42:32 2022
    On 24/07/2022 14:26, Ron AARON wrote:


    On 23/07/2022 9:24, dxforth wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    I implemented underscores-in-numbers a while back in 8th, at no
    perceivable cost. Makes large numbers much easier to understand.

    What about dictionary collisions - or does 8th handle numbers differently?
    Any class of number or just integers?

  • From Ron AARON@21:1/5 to dxforth on Sun Jul 24 08:15:28 2022
    On 24/07/2022 7:42, dxforth wrote:
    On 24/07/2022 14:26, Ron AARON wrote:


    On 23/07/2022 9:24, dxforth wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    I implemented underscores-in-numbers a while back in 8th, at no
    perceivable cost. Makes large numbers much easier to understand.

    What about dictionary collisions - or does 8th handle numbers differently? Any class of number or just integers?

    The dictionary is searched first, so : 123_456 ; will be found if
    "123_456" is entered. Numbers are attempted to be parsed after words, so
    it's possible to override e.g. "8" if you wanted to.

    Any kind of number allows the underscore, including "big integers" and
    "big floats". The underscore is simply ignored inside number parsing.

  • From Stephen Pelc@21:1/5 to dxforth on Sun Jul 24 08:29:29 2022
    On 24 Jul 2022 at 06:42:32 CEST, "dxforth" <dxforth@gmail.com> wrote:

    I implemented underscores-in-numbers a while back in 8th, at no
    perceivable cost. Makes large numbers much easier to understand.

    What about dictionary collisions - or does 8th handle numbers differently? Any class of number or just integers?

    Once you have decided that numbers should have an ignorable character,
    you might as well replace all occurrences of that literal by a variable.
    Once you have a variable, you can choose the ignorable character at
    run-time, e.g.
    ':' ign-char !

    You can use a similar mechanism for the DP and FP separators. Since
    a variable is larger than a byte, you can treat the variables as n-char
    arrays in which any match satisfies. VFX has used this mechanism for
    decades to allow users to have locale-sensitive DP and FP numbers.
    Since we made this change there have been no whines from the
    standards lawyers and no technical support issues.
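
    The variable approach can be sketched on top of >NUMBER as follows. This is not VFX's code: only the name IGN-CHAR comes from the example above, >NUMBER-IGN is invented here, and the sketch handles a single ignorable character rather than the n-char array described.

```forth
\ Sketch only: >NUMBER-IGN is a hypothetical name; IGN-CHAR holds one
\ character here, not an array of candidates.
variable ign-char
char _ ign-char !                 \ default: ignore underscores

: >number-ign ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  \ like >NUMBER, but steps over the character stored in IGN-CHAR
  begin
    >number
    dup 0> while
    over c@ ign-char @ = while
    1 /string
  repeat then ;
```

    After `char : ign-char !`, the string 12:345 converts to 12345 while 12_345 stops at the underscore, so the same source text can mean different things depending on the current setting.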

    Stephen
    --
    Stephen Pelc, stephen@vfxforth.com
    MicroProcessor Engineering, Ltd. - More Real, Less Time
    133 Hill Lane, Southampton SO15 5AF, England
    tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
    http://www.mpeforth.com - free VFX Forth downloads

  • From Anton Ertl@21:1/5 to dxforth on Sun Jul 24 09:35:47 2022
    dxforth <dxforth@gmail.com> writes:
    On 23/07/2022 21:50, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    ...
    Q. Then you'll need a routine to strip the underscores and a temporary buffer
    to hold the result. What do you suggest?
    A. The HOLD buffer.

    No such buffer is needed. That's the beauty of >NUMBER, which has
    been designed for a very similar use case:

    : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
        \ like >number, but ignores _
        begin
            >number
            dup 0> while
                over c@ '_' = while
                    1 /string
        repeat then ;

    The idea was to avoid separate number converters.

    I have no idea what you mean with that.

    Q. Won't it interfere with numeric output?
    A. Input/output are usually mutually exclusive.

    Says who?

    Humans - who use the same mouth to eat and speak.

    And the relevance to conversion from strings to numbers and numbers to
    strings is?

    Q. Can't it be done using recognizers?
    A. If so, probably at more cost.

    What makes you think so?

    The 30 odd bytes I spent would be hard to beat.

    Moving the goalposts? I did not ask about beating.

    What makes you think that the cost would be higher rather than just the
    same if one applies the same change to a pluggable number recognizer
    rather than a hardwired one?

    It's the potential for underscored
    numbers to collide with dictionary entries that's the problem.

    That's no problem, just like the potential for other numbers to
    collide with dictionary entries is no problem:

    Dictionary entries are searched first, so if you have a word _ or __
    or _1 or 1_ etc., it will be found before the number recognizer tries
    to convert it into a number. The conventional way to avoid a number
    being shadowed by a dictionary entry is to start the number with one
    of the digits 0-9 (and avoiding dictionary entries that start with
    these digits, albeit there are some exceptions that prove this rule).
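
    The lookup order is easy to check on any standard system, with or without underscore support. A tiny sketch; the word name and the value 42 are arbitrary:

```forth
\ Illustration only: the name 1_000 and the value 42 are arbitrary.
: 1_000 ( -- n ) 42 ;   \ a word whose name looks like an underscored number

1_000 .   \ prints 42: the dictionary search finds the word before any
          \ number conversion is attempted
```

    The same happens for a definition named 8 or 123_456: once it exists, the text is never offered to the number converter.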

    200x character literals have the same issue

    It's the same non-issue for the same reason. And, guess what, no
    problems have been reported to us, neither for the Forth-2012
    character literals ('a', implemented in Gforth since 0.7 (2008)), nor
    for Gforth's older syntax ('a, implemented in Gforth since the first
    public release (1996)).

    - anton

  • From Anton Ertl@21:1/5 to Stephen Pelc on Sun Jul 24 10:05:23 2022
    Stephen Pelc <stephen@vfxforth.com> writes:
    Once you have decided that numbers should have an ignorable character
    you might as well replace all occurrences of that literal by a variable. Once you have a variable, you can now choose the ignorable character at
    run-time, e.g.
    ':' ign-char !

    That may be a way for vendors to placate their customers if they all
    want some different ignore-character, but if you want a common
    language for exchanging libraries, studying programs etc, it's a bad
    idea. And given that no other viable ignore-character apart from _
    has been proposed (I don't consider space, comma, and dot to be
    viable ignore-characters in Forth) despite the frequent urge to
    bike-shed such small changes to death, why propose this misfeature in
    your first posting on this topic?

    Your example is especially nasty because ':' is a double indicator in SwiftForth. So someone following your suggestion would produce
    programs that behave quite differently in SwiftForth.

    You can use a similar mechanism for the DP and FP separators.

    Also bad ideas for language commonality. If you want to accept
    decimal comma, accept it in addition to the decimal point. No need
    for variables.

    Since we made this change there have been [...] no technical support issues.

    If library authors made use of this misfeature, and an application
    author would trip over that, would you get a support call? I guess,
    though, that library authors are smart enough to stay clear of it.
    But if a "feature" is best avoided, why provide it at all?

    - anton

  • From albert@21:1/5 to zbigniew2011@gmail.com on Sun Jul 24 14:03:28 2022
    In article <034d9af5-154a-431d-a469-f7054b9c0bb1n@googlegroups.com>,
    Zbig <zbigniew2011@gmail.com> wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    No such risk in case of underscore; to enter underscore character one has
    to press Shift-Minus — it can be done only on purpose.
    I believe one character that could be ignored the way you propose is space. When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input before final Enter-press.

    Using spaces in numbers? In Forth this is a bad idea.
    Underscores, yes.

    Groetjes Albert

  • From Zbig@21:1/5 to All on Sun Jul 24 05:33:31 2022
    Using spaces in numbers? In Forth this is a bad idea.
    Underscores, yes.

    Who, apart from a Forth programmer, will use an underscore when entering
    any number?

  • From Anton Ertl@21:1/5 to Zbig on Sun Jul 24 12:53:48 2022
    Zbig <zbigniew2011@gmail.com> writes:
    Who, apart from a Forth programmer, will use an underscore when entering
    any number?

    According to
    <https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

    |maritime "21_450"

    and (more relevant):

    |Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
    |Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
    |version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
    |Swift use the underscore (_) character for this purpose

    - anton

  • From Zbig@21:1/5 to All on Sun Jul 24 06:19:15 2022
    Who, apart from a Forth programmer, will use an underscore when entering
    any number?
    According to <https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

    |maritime "21_450"

    Correction: apart from a Forth programmer and a sailor.

    and (more relevant):

    |Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
    |Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
    |version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
    |Swift use the underscore (_) character for this purpose

    OK, so I'm asking the same question of the creators of Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1), Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from version 7.4[36]), Ruby, Go (from
    version 1.13), Rust, Julia, and Swift: who, apart from sailors and apart
    from programmers who were told "use underscore", actually uses an
    underscore when entering numbers?

    The question is serious; I have never seen anyone use an underscore
    to enter a number. Well, maybe it is indeed commonly used somewhere
    for that purpose (I had no idea that some sailors use it).

  • From Ron AARON@21:1/5 to Anton Ertl on Sun Jul 24 16:26:54 2022
    On 24/07/2022 15:53, Anton Ertl wrote:
    Zbig <zbigniew2011@gmail.com> writes:
    Who, apart from a Forth programmer, will use an underscore when entering
    any number?

    According to
    <https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

    |maritime "21_450"

    and (more relevant):

    |Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
    |Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
    |version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
    |Swift use the underscore (_) character for this purpose

    - anton

    Indeed; it was because someone asked for it based on Python's example,
    that I did eventually add it into 8th.

  • From dxforth@21:1/5 to Anton Ertl on Mon Jul 25 00:02:25 2022
    On 24/07/2022 19:35, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    On 23/07/2022 21:50, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    ...
    Q. Then you'll need a routine to strip the underscores and a temporary buffer
    to hold the result. What do you suggest?
    A. The HOLD buffer.

    No such buffer is needed. That's the beauty of >NUMBER, which has
    been designed for a very similar use case:

    : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
        \ like >number, but ignores _
        begin
            >number
            dup 0> while
                over c@ '_' = while
                    1 /string
        repeat then ;

    The idea was to avoid separate number converters.

    I have no idea what you mean with that.

    You just created one. Will you create another for floats?


    Q. Won't it interfere with numeric output?
    A. Input/output are usually mutually exclusive.

    Says who?

    Humans - who use the same mouth to eat and speak.

    And the relevance to conversion from strings to numbers and numbers to strings is?

    I see no reason for them to collide.


    Q. Can't it be done using recognizers?
    A. If so, probably at more cost.

    What makes you think so?

    The 30 odd bytes I spent would be hard to beat.

    Moving the goalposts? I did not ask about beating.

    What makes you think that the cost would be higher rather than just the
    same if one applies the same change to a pluggable number recognizer
    rather than a hardwired one?

    Feel free to show the code for the plug-in.

    It's the potential for underscored
    numbers to collide with dictionary entries that's the problem.

    That's no problem, just like the potential for other numbers to
    collide with dictionary entries is no problem:

    Dictionary entries are searched first, so if you have a word _ or __
    or _1 or 1_ etc., it will be found before the number recognizer tries
    to convert it into a number. The conventional way to avoid a number
    being shadowed by a dictionary entry is to start the number with one
    of the digits 0-9 (and avoiding dictionary entries that start with
    these digits, albeit there are some exceptions that prove this rule).

    Fair enough


    200x character literals have the same issue

    It's the same non-issue for the same reason. And, guess what, no
    problems have been reported to us, neither for the Forth-2012
    character literals ('a', implemented in Gforth since 0.7 (2008)), nor
    for Gforth's older syntax ('a, implemented in Gforth since the first
    public release (1996)).

    - anton

  • From Zbig@21:1/5 to All on Sun Jul 24 07:17:59 2022
    |Swift use the underscore (_) character for this purpose

    BTW: I think if „space” is too difficult to use it as „thousand separator”,
    ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space

  • From Stephen Pelc@21:1/5 to All on Sun Jul 24 14:28:17 2022
    On 24 Jul 2022 at 12:05:23 CEST, "Anton Ertl" <Anton Ertl> wrote:

    Your response is a typical "not invented here" response.

    The DP and FP character definitions solve a *real* issue in that the
    Forth standard approach cannot be used for real-world data entry.
    The DP and FP char solution allows the double and FP data entry
    routines to be used for data entry in various locales.

    Your example is especially nasty because ':' is a double indicator in SwiftForth. So someone following your suggestion would produce
    programs that behave quite differently in SwiftForth.

    I have a dispute resolution protocol in a contract (yes, really) that includes the line:
    "Dispute resolution processes include the consumption of alcoholic
    beverages, food and laughter."
    Leon at Forth Inc and I are perfectly capable of finding a resolution.

    Also bad ideas for language commonality. If you want to accept
    decimal comma, accept it in addition to the decimal point. No need
    for variables.

    I think that you do not understand locales.

    Stephen


  • From Anton Ertl@21:1/5 to dxforth on Sun Jul 24 14:29:47 2022
    dxforth <dxforth@gmail.com> writes:
    On 24/07/2022 19:35, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    The idea was to avoid separate number converters.

    I have no idea what you mean with that.

    You just created one. Will you create another for floats?

    I have no such plans. But now I know what you mean.

    And the relevance to conversion from strings to numbers and numbers to
    strings is?

    I see no reason for them to collide.

    I do: Putting debugging output in string->number conversion words.

    Q. Can't it be done using recognizers?
    A. If so, probably at more cost.

    What makes you think so?

    The 30 odd bytes I spent would be hard to beat.

    Moving the goalposts? I did not ask about beating.

    What makes you think that the cost would be higher rather than just the
    same if one applies the same change to a pluggable number recognizer
    rather than a hardwired one?

    Feel free to show the code for the plug-in.

    This won't help at all, because it is not changed:

    : rec-num ( addr u -- n/d table | notfound ) \ gforth-experimental
    \G converts a number to a single/double integer
    snumber? dup
    IF
    0> IF ['] recognized-dnum ELSE ['] recognized-num THEN EXIT
    THEN
    drop ['] notfound ;

    The change is in s>unumber?, which is called (with one intermediate)
    by snumber?. Both snumber? and s>unumber? already exist in
    gforth-0.7, i.e., before recognizers. And the change consists of
    replacing a call to >NUMBER with a call to >NUMBER_.
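
    For readers without the Gforth sources at hand, here is a minimal sketch of what a word in the spirit of >NUMBER_ could look like in standard Forth. It is not Gforth's actual implementation: the name >number-us and the policy of silently skipping every underscore (leading and trailing ones included) are assumptions made purely for illustration.

```forth
\ >number-us is a hypothetical name; this is not the Gforth source.
\ Like >NUMBER, but steps over '_' wherever conversion stops on one.
: >number-us ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  begin
    >number                           \ convert as many digits as possible
    dup 0= if exit then               \ whole string consumed
    over c@ [char] _ <> if exit then  \ stopped on something else: leave it to the caller
    1 /string                         \ skip the underscore and carry on
  again ;

0 0 s" 1_000_000" >number-us 2drop d.   \ displays 1000000
```

    Because the only change is which conversion word gets called, the same substitution applies whether the caller is a hardwired number interpreter or a recognizer.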

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From Anton Ertl@21:1/5 to Stephen Pelc on Sun Jul 24 15:27:24 2022
    Stephen Pelc <stephen@vfxforth.com> writes:
    On 24 Jul 2022 at 12:05:23 CEST, "Anton Ertl" <Anton Ertl> wrote:

    Your response is a typical "not invented here" response.

    Starting out with name-calling is a typical defense of someone who
    does not have convincing arguments to support his position.

    The DP and FP character definitions solve a *real* issue in that the
    Forth standard approach cannot be used for real-world data entry.
    The DP and FP char solution allows the double and FP data entry
    routines to be used for data entry in various locales.

    One question is if the source code should be subject to the current
    locale. This would mean that when using the program in another
    locale, it would have to be changed.

    All programming languages I know of define their source code
    independent of the locale. The exception was Algol 60, which did not
    define the computer representation of their source code at all.

    User input is generally something different from source code, although
    in interactive languages there can be some overlap. This has not
    caused interactive languages to change their source code
    interpretation depending on locale.

    Your example is especially nasty because ':' is a double indicator in
    SwiftForth. So someone following your suggestion would produce
    programs that behave quite differently in SwiftForth.

    I have a dispute resolution protocol in a contract (yes, really) that includes the line:
    "Dispute resolution processes include the consumption of alcoholic
    beverages, food and laughter."
    Leon at Forth Inc and I are perfectly capable of finding a resolution.

    Your suggestion might be picked up by some programmer who then
    produces a program, and some other user may pick the source code up
    and use it in SwiftForth, and waste quite a bit of time trying to find
    out what's wrong. I don't think your dispute resolution protocol is
    going to help him.

    Also bad ideas for language commonality. If you want to accept
    decimal comma, accept it in addition to the decimal point. No need
    for variables.

    I think that you do not understand locales.

    Elucidate me!

    I just looked at the 358 locale files in /usr/share/i18n/locales on my
    Debian 11 installation, and found three decimal_point and
    mon_decimal_point values:

    .
    ,
    <U066B> (in fa_IR (Persian (Iran)) and ps_AF (Pashto (Afghanistan)))

    The latter is an extended character that takes two bytes in UTF-8, so
    your variable approach cannot deal with it (if I understand it
    correctly).

    So treating both '.' and ',' as double-defining characters (what
    SwiftForth does) covers all the locales that your variable approach
    can cover.

    For thousands_sep the variants are:

    ""
    ","
    "."
    <U066C> (Arabic Thousands Separator)
    <U2019> (Right Single Quotation Mark)
    <U202F> (Narrow No-Break Space)

    I have now added a locale PROG that uses _ as thousands separator, so
    I can do things like

    LC_NUMERIC=PROG.utf8 perf stat true

    and it outputs lines like

    754_957 cycles # 3.667 GHz
    562_046 instructions # 0.74 insn per cycle
    112_617 branches # 546.987 M/sec
    4_530 branch-misses # 4.02% of all branches

    and I can then paste these numbers into Forth and compute with them.
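
    On a Forth whose text interpreter does not accept underscores, figures pasted like this can still be used through a small parsing word. The sketch below is only an illustration: the name _# is made up here, it handles unsigned decimal digits only, it does no error checking, and it is meant for interpretation rather than for use inside a definition.

```forth
\ _# is a hypothetical helper, not part of any Forth system mentioned here.
: _# ( "digits" -- u )
  0 parse-name                 \ accumulator, then the next blank-delimited token
  over + swap ?do              \ loop over the token's characters
    i c@ dup [char] _ = if
      drop                     \ ignore the separator
    else
      [char] 0 - swap 10 * +   \ append one decimal digit (not validated)
    then
  loop ;

\ instructions per cycle, scaled by 100, from the perf figures above
_# 562_046 100 _# 754_957 */ .   \ displays 74
```

    The scaled result agrees with the 0.74 insn per cycle that perf reports.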

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From dxforth@21:1/5 to Anton Ertl on Mon Jul 25 02:28:21 2022
    On 25/07/2022 00:29, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    On 24/07/2022 19:35, Anton Ertl wrote:
    dxforth <dxforth@gmail.com> writes:
    The idea was to avoid separate number converters.

    I have no idea what you mean with that.

    You just created one. Will you create another for floats?

    I have no such plans. But now I know what you mean.

    And the relevance to conversion from strings to numbers and numbers to
    strings is?

    I see no reason for them to collide.

    I do: Putting debugging output in string->number conversion words.

    That's akin to typing:

    BL WORD name FIND

    and expecting it to work. Forth at fault - or the operator for not understanding his tools?

    What makes you think that the cost would be higher rather than just the
    same if one applies the same change to a pluggable number recognizer
    rather than a hardwired one?

    Feel free to show the code for the plug-in.

    This won't help at all, because it is not changed:

    : rec-num ( addr u -- n/d table | notfound ) \ gforth-experimental
    \G converts a number to a single/double integer
    snumber? dup
    IF
    0> IF ['] recognized-dnum ELSE ['] recognized-num THEN EXIT
    THEN
    drop ['] notfound ;

    The change is in s>unumber?, which is called (with one intermediate)
    by snumber?. Both snumber? and s>unumber? already exist in
    gforth-0.7, i.e., before recognizers. And the change consists of
    replacing a call to >NUMBER with a call to >NUMBER_.

    Wanting to handle all numbers, my hardwired solution was:

    ; strip ( c-addr u c-addr2 -- c-addr3 u3 )

            hdr   x,'STRIP',,1
    strip:  pop   di
            pop   cx
            pop   bx
            add   bx,cx       ; start at end
            sub   dx,dx
    strip1: jcxz  strip3
            dec   bx          ; builds down
            mov   al,[bx]
            cmp   al,'_'
            jz    strip2
            dec   di
            mov   [di],al
            inc   dx
    strip2: dec   cx
            jmp   strip1
    strip3: push  di
            push  dx
            nextt
    ; 28 bytes

    \ forth number interpreter
    : number ( c-addr -- n|d|r xt )
    count

    pad \ *new* HOLD buffer end+1
    strip \ *new* move string to HOLD buffer sans underscore
    \ 4 bytes

    If someone finds the strategy I employed is broken, I'll take my licks.

  • From dxforth@21:1/5 to dxforth on Mon Jul 25 14:07:48 2022
    On 25/07/2022 02:28, dxforth wrote:

    Wanting to handle all numbers, my hardwired solution was:

    ; strip ( c-addr u c-addr2 -- c-addr3 u3 )

    hdr x,'STRIP',,1
    strip: pop di
    pop cx
    pop bx
    add bx,cx ; start at end
    sub dx,dx
    strip1: jcxz strip3
    dec bx ; builds down
    mov al,[bx]
    cmp al,'_'
    jz strip2
    dec di
    mov [di],al
    inc dx
    strip2: dec cx
    jmp strip1
    strip3: push di
    push dx
    nextt

    I've replaced the assembler routine above as there was no check for overflow. Stack effects have changed as HOLD buffer is automatically referenced.

    : strip ( c-addr u -- c-addr2 u2 )
    <# 2dup 1- over + do
    i c@ [char] _ over - if hold else drop then
    -1 +loop #> ;

    Won't work on zero-length strings but irrelevant here.

  • From albert@21:1/5 to zbigniew2011@gmail.com on Mon Jul 25 10:33:06 2022
    In article <b087485d-393c-4e22-acf0-a98d20301fben@googlegroups.com>,
    Zbig <zbigniew2011@gmail.com> wrote:
    |Swift use the underscore (_) character for this purpose

    BTW: I think if „space” is too difficult to use it as „thousand separator”,
    ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space

    Terribly bad idea, because it cannot be visually distinguished from a space.
    Tab is not a glyph, but a control code from mechanical typewriters.

    Groetjes Albert
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Zbig@21:1/5 to All on Mon Jul 25 01:53:15 2022
    BTW: I think if „space” is too difficult to use it as „thousand separator”,
    ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space
    Terrible bad idea, because it can be visually discerned from a space.
    Tab is not a glyph, but a control of mechanical type writers.

    Mechanical typewriters haven't been used for a very long time,
    so VT can be „misused” for more practical things than controlling hardware that no longer exists and is no longer available.

  • From Ron AARON@21:1/5 to Zbig on Mon Jul 25 12:22:28 2022
    On 25/07/2022 11:53, Zbig wrote:
    BTW: I think if „space” is too difficult to use it as „thousand
    separator”,
    ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space
    Terrible bad idea, because it can be visually discerned from a space.
    Tab is not a glyph, but a control of mechanical type writers.

    Mechanical type writers aren't used (since very long time) anymore,
    so VT can be „misused” for more practical things than controlling non-existant — and not available anymore — hardware.

    True enough; but the point that you can't see it unless a special font
    is used, is a valid one.

  • From Zbig@21:1/5 to All on Mon Jul 25 02:28:17 2022
    I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space
    Terrible bad idea, because it can be visually discerned from a space.
    Tab is not a glyph, but a control of mechanical type writers.

    Mechanical type writers aren't used (since very long time) anymore,
    so VT can be „misused” for more practical things than controlling non-existant — and not available anymore — hardware.
    True enough; but the point that you can't see it unless a special font
    is used, is a valid one.

    Maybe my assumption is different, but I actually don't see any need to make
    it visible. I treat it as a kind of „hard space” („non-breakable”), as sometimes used
    in text editors. I see its supposed invisibility as an advantage.
    Of course, if for some particular reason that character should be visible, underscore
    may be good enough.

  • From Rick C@21:1/5 to Zbig on Mon Jul 25 09:05:29 2022
    On Monday, July 25, 2022 at 5:28:18 AM UTC-4, Zbig wrote:
    I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space
    Terrible bad idea, because it can be visually discerned from a space.
    Tab is not a glyph, but a control of mechanical type writers.

    Mechanical type writers aren't used (since very long time) anymore,
    so VT can be „misused” for more practical things than controlling non-existant — and not available anymore — hardware.
    True enough; but the point that you can't see it unless a special font
    is used, is a valid one.
    Maybe my assumption is different, but actually I don't see any need to make it visible. I treat it as kind of „hard space” („non-breakable”) used sometimes
    in text editors. I see its supposed invisibility rather as advantage.
    Of course if for some particular reasons that character should be visible, underscore
    may be good enough.

    I don't follow this. The entire point of a thousands separator is to facilitate humans reading large numbers or small fractions. Wouldn't this separator be ignored by any computer reading the number?

    --

    Rick C.

    -- Get 1,000 miles of free Supercharging
    -- Tesla referral code - https://ts.la/richard11209

  • From Rick C@21:1/5 to Zbig on Mon Jul 25 09:02:39 2022
    On Monday, July 25, 2022 at 4:53:17 AM UTC-4, Zbig wrote:
    BTW: I think if „space” is too difficult to use it as „thousand separator”,
    ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
    — it's practically unused anywhere
    — it could be entered with, say, Shift-Space
    — it could be displayed as, guess what, just a single space
    Terrible bad idea, because it can be visually discerned from a space.
    Tab is not a glyph, but a control of mechanical type writers.
    Mechanical type writers aren't used (since very long time) anymore,
    so VT can be „misused” for more practical things than controlling non-existant — and not available anymore — hardware.

    That is true, but a thousands separator is not one of them because it's not a printable character. If it can't be printed, no human can easily discern it in the character stream without a neural connection to a magnetometer.

    --

    Rick C.

    - Get 1,000 miles of free Supercharging
    - Tesla referral code - https://ts.la/richard11209

  • From Zbig@21:1/5 to All on Mon Jul 25 09:10:22 2022
    Like this:
    284 985 000 234,23

  • From Zbig@21:1/5 to All on Mon Jul 25 09:07:56 2022
    The entire point of a thousands separator is to facilitate humans
    reading large numbers or small fractions.

    Speaking for myself: I use space for this, and it works for me.

  • From Anton Ertl@21:1/5 to Zbig on Mon Jul 25 17:02:11 2022
    Zbig <zbigniew2011@gmail.com> writes:
    284 985 000 234,23

    Great. So how should a Forth text interpreter know that this is one
    number, not four? And how should a human reading this as Forth code
    know that?

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From Rick C@21:1/5 to Zbig on Mon Jul 25 10:02:02 2022
    On Monday, July 25, 2022 at 12:10:23 PM UTC-4, Zbig wrote:
    Like this:
    284 985 000 234,23

    Ok, but that would not be seen by Forth as a single number. Do you get that's what the thread is about? Finding a way of notating thousands separators that is both machine readable and human recognizable? Or maybe I've missed the point?

    --

    Rick C.

    -+ Get 1,000 miles of free Supercharging
    -+ Tesla referral code - https://ts.la/richard11209

  • From Zbig@21:1/5 to All on Mon Jul 25 14:29:03 2022
    Great. So how should a Forth text interpreter know that this is one
    number, not four? And you should a human reading this as Forth code
    know that?

    That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
    On the screen it looks like „ordinary” spaces — exactly, like in case of „ordinary space” and „non-breakable space” (in case of text editor).

  • From Zbig@21:1/5 to All on Mon Jul 25 15:03:39 2022
    Actually employing VT could have another advantage: consider all
    these „hyphenated words”. They wouldn't have to be hyphenated
    any longer. Instead of „pseudo space” VT could „link” two strings
    that comprise such word — making it look more natural.

  • From dxforth@21:1/5 to Zbig on Tue Jul 26 13:16:19 2022
    On 26/07/2022 08:03, Zbig wrote:
    Actually employing VT could have another advantage: consider all
    these „hyphenated words”. They wouldn't have to be hyphenated
    any longer. Instead of „pseudo space” VT could „link” two strings that comprise such word — making it look more natural.

    A word-processor too ...

    Is there anything Forth can't do? :)

  • From dxforth@21:1/5 to dxforth on Tue Jul 26 13:41:26 2022
    On 25/07/2022 14:07, dxforth wrote:

    : strip ( c-addr u -- c-addr2 u2 )
    <# 2dup 1- over + do
    i c@ [char] _ over - if hold else drop then
    -1 +loop #> ;

    Won't work on zero-length strings but irrelevant here.

    Cheaper and without the quirk:

    : strip ( c-addr u -- c-addr2 u2 )
    <# begin dup while
    1- 2dup + c@ [char] _ over - if hold else drop then
    repeat #> ;

    In DX-Forth cheaper still is:

    : strip ( c-addr u -- c-addr2 u2 )
    <# begin dup while
    1- 2dup + c@ [char] _ of else hold then
    repeat #> ;
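
    As a usage sketch for the standard-Forth version above (repeated here so the example is self-contained), the stripped string can be handed straight to >NUMBER. The name stripped>u is an assumption for illustration. The result of STRIP lives in the transient HOLD buffer, so it should be consumed before any numeric output is done.

```forth
\ standard-Forth STRIP, as posted above: copy the string into the HOLD
\ buffer from the end backwards, leaving out every underscore
: strip ( c-addr u -- c-addr2 u2 )
  <# begin dup while
    1- 2dup + c@ [char] _ over - if hold else drop then
  repeat #> ;

\ stripped>u is a hypothetical name: unsigned conversion in the current BASE,
\ with no check that the whole string was converted
: stripped>u ( c-addr u -- u )
  strip  0 0 2swap >number  2drop drop ;

s" 1_000_000" stripped>u .   \ displays 1000000
```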

  • From Rick C@21:1/5 to Zbig on Tue Jul 26 00:05:46 2022
    On Monday, July 25, 2022 at 6:03:41 PM UTC-4, Zbig wrote:
    Actually employing VT could have another advantage: consider all
    these „hyphenated words”. They wouldn't have to be hyphenated
    any longer. Instead of „pseudo space” VT could „link” two strings that comprise such word — making it look more natural.

    So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?

    Why can't you see the issues this would cause???

    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004

    --

    Rick C.

    +- Get 1,000 miles of free Supercharging
    +- Tesla referral code - https://ts.la/richard11209

  • From Anton Ertl@21:1/5 to Zbig on Tue Jul 26 06:57:32 2022
    Zbig <zbigniew2011@gmail.com> writes:
    Great. So how should a Forth text interpreter know that this is one
    number, not four? And how should a human reading this as Forth code
    know that?

    That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
    On the screen it looks like „ordinary” spaces - exactly, like in case of
    „ordinary space” and „non-breakable space” (in case of text editor).

    So how should a human reading this as Forth code know that

    284 985 000 234,23

    is one number, not four?

    Apart from that, reality check:

    Here's what is displayed by xterm for a VT:

    s\" 123\v456" cr type
    123
    456 ok

    And here's what xterm gives me when I input a Shift-Space:

    key cr .
    32 ok

    That's not the ASCII code for vt, it's the ASCII code for Space.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From Anton Ertl@21:1/5 to Zbig on Tue Jul 26 07:05:31 2022
    Zbig <zbigniew2011@gmail.com> writes:
    Actually employing VT could have another advantage: consider all
    these „hyphenated words”. They wouldn't have to be hyphenated
    any longer. Instead of „pseudo space” VT could „link” two strings
    that comprise such word - making it look more natural.

    Again, how should a human see the difference between

    unused-words

    and

    unused words

    if you replace the "-" by something that looks like a space?

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

  • From Zbig@21:1/5 to All on Tue Jul 26 02:51:30 2022
    So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?

    No.

    Why can't you see the issues this would cause???

    What issues — in particular?

    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004

    It depends on whether the groups of digits are separated by a space or „connected” by VT.

  • From Zbig@21:1/5 to All on Tue Jul 26 02:55:08 2022
    Again, how should a human see the difference between

    unused-words

    and

    unused words

    if you replace the "-" by something that looks like a space?

    Sometimes it may create a problem indeed, but taking a peek
    into glossary usually should help.

  • From albert@21:1/5 to zbigniew2011@gmail.com on Tue Jul 26 12:37:43 2022
    In article <d0a77c03-31f3-4ed7-a94a-f908fa7c4c7fn@googlegroups.com>,
    Zbig <zbigniew2011@gmail.com> wrote:
    Again, how should a human see the difference between

    unused-words

    and

    unused words

    if you replace the "-" by something that looks like a space?

    Sometimes it may create a problem indeed, but taking a peek
    into glossary usually should help.

    Seriously?
    It makes as much sense as for Republicans to ban condoms,
    because they want fewer abortions.

    Groetjes
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Rick C@21:1/5 to Zbig on Tue Jul 26 09:24:51 2022
    On Tuesday, July 26, 2022 at 5:51:32 AM UTC-4, Zbig wrote:
    So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?
    No.
    Why can't you see the issues this would cause???
    What issues — in particular?
    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004
    It depends, whether the groups od digits are separated by space — or „connected” by VT.

    That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

    Why can't you grasp this fail?

    --

    Rick C.

    ++ Get 1,000 miles of free Supercharging
    ++ Tesla referral code - https://ts.la/richard11209

  • From Zbig@21:1/5 to All on Tue Jul 26 09:31:34 2022
    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004
    It depends, whether the groups od digits are separated by space — or „connected” by VT.
    That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

    Why can't you grasp this fail?

    1. You wrote about text interpreter -- did you mean 'human' or Forth?
    Forth won't have any problem, it'll find VT there.

    2. If you mean human: if you want the others to understand you, you
    have to be precise in your statements. So it's enough to separate two
    numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

    I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise,
    that's all.

  • From Rick C@21:1/5 to Zbig on Tue Jul 26 13:06:33 2022
    On Tuesday, July 26, 2022 at 12:31:35 PM UTC-4, Zbig wrote:
    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004
    It depends, whether the groups od digits are separated by space — or „connected” by VT.
    That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

    Why can't you grasp this fail?
    1. You wrote about text interpreter -- did you mean 'human' of Forth?
    Forth won't have any problem, it'll find VT there.

    Yes, but you then had to ask what I typed, showing the shortcoming, that a human can't tell. That was my point... unless you are not a human after all.


    2. If you mean human: if you want the others to understand you, you
    have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345


    I honestly don't understand why are you put so much effort into creating problem out of nothing. You want to be properly understood? Be precise, that's all.

    Yes, you don't understand. That's the point.

    --

    Rick C.

    --- Get 1,000 miles of free Supercharging
    --- Tesla referral code - https://ts.la/richard11209

  • From minforth@arcor.de@21:1/5 to gnuarm.del...@gmail.com on Tue Jul 26 13:51:24 2022
    gnuarm.del...@gmail.com schrieb am Dienstag, 26. Juli 2022 um 22:06:34 UTC+2:
    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345

    At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

    Very helpful. ;o)

  • From minforth@arcor.de@21:1/5 to gnuarm.del...@gmail.com on Tue Jul 26 15:09:23 2022
    gnuarm.del...@gmail.com schrieb am Mittwoch, 27. Juli 2022 um 00:02:55 UTC+2:
    On Tuesday, July 26, 2022 at 4:51:25 PM UTC-4, minf...@arcor.de wrote:
    gnuarm.del...@gmail.com schrieb am Dienstag, 26. Juli 2022 um 22:06:34 UTC+2:
    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345
    At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

    Very helpful. ;o)
    So if you had a few spaces (not vertical tabs) in your coordinate, 38° 17′ 10″ N 76° 24′ 42″ W, I believe Forth would read the number 38, then treat ° as a word, no? I suppose ' would be a problem, since that is already in use. ", however,
    is not in use, so that would be good. I suppose if you were looking for coordinates in text, you could redefine ' for a bit, then restore it to mean "tick". Or do I not understand how numbers are read?

    The issue is that real world number inputs often require a real parser.
    Single hidden or visible separators won't do the job.

    Modern Forths offer recognizers or something similar to do it.

  • From Zbig@21:1/5 to All on Tue Jul 26 15:14:32 2022
    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004
    It depends, whether the groups od digits are separated by space — or „connected” by VT.
    That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

    Why can't you grasp this fail?
    1. You wrote about text interpreter -- did you mean 'human' of Forth? Forth won't have any problem, it'll find VT there.
    Yes, but you then had to ask what I typed, showing the short coming, that a human can't tell. That was my point... unless you are not a human after all.

    If you write something like this: 001_002_003 004 -- I'll also have to ask you a question, what actually you typed.
    It doesn't depend on the selected separator character.

    2. If you mean human: if you want the others to understand you, you
    have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).
    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345

    Maybe now it's the time for me to ask a question — you have already made
    a fair use out of your question quota: does your Forth interpreter — and/or your computer screen — „compress” spaces like Google News interface?
    Or it doesn't?

    I honestly don't understand why are you put so much effort into creating problem out of nothing. You want to be properly understood? Be precise, that's all.
    Yes, you don't understand. That's the point.

    Never understood the people that insist on looking for the problems where
    there aren't any. I'm not a psychologist, you know, so I don't have to.

  • From Rick C@21:1/5 to minf...@arcor.de on Tue Jul 26 15:02:54 2022
    On Tuesday, July 26, 2022 at 4:51:25 PM UTC-4, minf...@arcor.de wrote:
    gnuarm.del...@gmail.com schrieb am Dienstag, 26. Juli 2022 um 22:06:34 UTC+2:
    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345
    At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

    Very helpful. ;o)

    So if you had a few spaces (not vertical tabs) in your coordinate, 38° 17′ 10″ N 76° 24′ 42″ W, I believe Forth would read the number 38, then treat ° as a word, no? I suppose ' would be a problem, since that is already in use. ", however,
    is not in use, so that would be good. I suppose if you were looking for coordinates in text, you could redefine ' for a bit, then restore it to mean "tick". Or do I not understand how numbers are read?

    --

    Rick C.

  • From Zbig@21:1/5 to All on Tue Jul 26 15:27:03 2022
    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345

    Maybe this will explain some more to you:

    DPUSH: PUSH DX
    APUSH: PUSH AX
    NEXT: LODSW
    MOV BX,AX
    NEXT1: MOV DX,BX
    INC DX
    JMP [BX]

    Pretty deformatted, right?
    So I guess you'll insist on using underscore by macroassemblers,
    instead of spaces and tabs — while I don't care.

  • From Rick C@21:1/5 to Zbig on Tue Jul 26 20:13:25 2022
    On Tuesday, July 26, 2022 at 6:14:33 PM UTC-4, Zbig wrote:
    There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

    001 002 003 004
    It depends on whether the groups of digits are separated by a space or „connected” by VT.
    That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

    Why can't you grasp this fail?
    1. You wrote about text interpreter -- did you mean 'human' or Forth? Forth won't have any problem, it'll find VT there.
    Yes, but you then had to ask what I typed, showing the shortcoming, that a human can't tell. That was my point... unless you are not a human after all.
    If you write something like this: 001_002_003 004 -- I'll also have to ask you
    a question: what did you actually type?
    It doesn't depend on the selected separator character.

    I don't follow. If the convention in use is to separate thousands with the underscore, it is clear what the numbers are, 1002003 and 4. I don't follow your thinking here.
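
The reading described above can be sketched in a few lines of Python. This is only an illustration of the convention, not anyone's actual Forth implementation: `read_numbers` is a hypothetical name, and stripping the separator before conversion is an assumption modelled on the approach from the opening post (strip the underscores, then convert).

```python
def read_numbers(line, base=10, sep="_"):
    """Split a source line on whitespace and convert each token to an integer,
    ignoring the digit separator `sep` inside a token."""
    values = []
    for token in line.split():
        digits = token.replace(sep, "")   # strip separators before conversion
        values.append(int(digits, base))
    return values


print(read_numbers("001_002_003 004"))   # [1002003, 4]
print(read_numbers("001 002 003 004"))   # [1, 2, 3, 4]
```

With a visible separator, both the parser and a human reader see two tokens in the first line and four in the second.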


    2. If you mean human: if you want the others to understand you, you
    have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).
    Ok, how many spaces did I type to separate these digits?

    0123 4567 8901 2345
    Maybe now it's the time for me to ask a question — you have already made
    a fair use out of your question quota: does your Forth interpreter — and/or
    your computer screen — „compress” spaces like Google News interface? Or it doesn't?

    More than once I've copied programs from Google Groups. I also use a text editor that will replace spaces with tab characters when the alignment is right. That's why I mentioned previously that special editors would be needed. I've seen
    few editors that will allow you to enter a vertical tab character.


    I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.
    Yes, you don't understand. That's the point.
    Never understood the people that insist on looking for the problems where there aren't any. I'm not a psychologist, you know, so I don't have to.

    Your "solution" is a problem in solution clothing.

    --

    Rick C.

  • From dxforth@21:1/5 to Zbig on Wed Jul 27 14:22:46 2022
    On 27/07/2022 02:31, Zbig wrote:

    I honestly don't understand why are you put so much effort into creating problem out of nothing.

    My thoughts too. Underscore in numbers as a *programmer* convenience is
    on the increase and causes no compatibility issue in Forth (AFAIK).
    The only control characters I ever want to see in source code is line-ends
    and TABs. I'd rather not have to deal with TABs but I'll put up with them.

  • From Jan Coombs@21:1/5 to Zbig on Mon Aug 1 11:42:18 2022
    XPost: Jan Coombs <jan4comp.lang.forth@murray-microft.co.uk>

    On Mon, 25 Jul 2022 14:29:03 -0700 (PDT)
    Zbig <zbigniew2011@gmail.com> wrote:

    Great. So how should a Forth text interpreter know that this is one number, not four? And how should a human reading this as Forth code
    know that?

    That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
    On the screen it looks like „ordinary” spaces — exactly, like in case of
    „ordinary space” and „non-breakable space” (in case of text editor).

    Entering shift-space into gforth and python3:

    $ gforth
    Gforth 0.7.9_20201217
    Authors: Anton Ertl, Bernd Paysan, Jens Wilke et al., for more type `authors'
    Copyright © 2019 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
    Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
    Type `help' for basic help
    key . 32 ok
    ekey . 32 ok


    $ python3
    Python 3.7.3 (default, Jan 22 2021, 20:04:44)
    [GCC 8.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    i=input(); i, ord(i)
    (' ', 32)



    So it seems that more work is involved in demonstrating this proposal.
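
A quick way to check which separator characters actually ended up in a line of source is to list their code points, much as the two sessions above do for a single key. The sketch below is an assumption-laden illustration: `separators` and the small `NAMES` table are hypothetical helpers, not part of any Forth system.

```python
# Short names for the characters discussed in this thread (illustrative only).
NAMES = {0x09: "TAB", 0x0B: "VT", 0x20: "SPACE", 0x5F: "UNDERSCORE", 0xA0: "NBSP"}


def separators(text):
    """List the code point and a short name of every non-digit character."""
    return [(ord(c), NAMES.get(ord(c), "?")) for c in text if not c.isdigit()]


print(separators("001 002\x0b003_004"))
# [(32, 'SPACE'), (11, 'VT'), (95, 'UNDERSCORE')]
```

On screen the space and the VT may look alike; only the code points tell them apart.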

    Jan Coombs

  • From Jan Coombs@21:1/5 to Zbig on Mon Aug 1 11:51:56 2022
    On Mon, 25 Jul 2022 09:10:22 -0700 (PDT)
    Zbig <zbigniew2011@gmail.com> wrote:

    Like this:
    284 985 000 234,23

    or '284 985 000 234.23' depending on locale?

    '284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.

    Jan Coombs

  • From Zbig@21:1/5 to All on Mon Aug 1 04:48:29 2022
    Like this:
    284 985 000 234,23
    or '284 985 000 234.23' depending on locale?

    '284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.

    I was trying to explain, that there are EXACTLY THE SAME „problems
    to resolve” whether you connect the 3-digits groups with underscore,
    or with VT — but in latter case it just... looks better.

  • From S Jack@21:1/5 to All on Mon Aug 1 10:03:26 2022
    go
    ( note: DCX is alias for DECIMAL )
    --
    -- Formatted numeric output
    --
    +ofmt \ disable format of numeric output
    i. dcx 256 hex . ==> 100
    i. dcx 256. hex d. ==> 100
    i. dcx -1 hex . ==> -1
    i. dcx -1 hex u. ==> FFFFFFFF
    i. dcx -1. hex ud. ==> FFFFFFFFFFFFFFFF
    -ofmt \ enable format of numeric output
    i. dcx 256 hex . ==> 0x0100
    i. dcx 256. hex d. ==> 0x0100
    i. dcx -1 hex . ==> -0x0001
    i. dcx -1 hex u. ==> 0xFFFF_FFFF
    i. dcx -1. hex ud. ==> 0xFFFF_FFFF_FFFF_FFFF
    --
    -- Comma, underscore, and/or period separators in numeric input
    --
    +ofmt
    dcx
    i. 123456789 . ==> 123456789
    i. 1,234,567,89 . ==> 123456789
    i. 1_234,567_89 . ==> 123456789
    i. 1,234_567,89 . ==> 123456789
    -ofmt
    i. 123456789 . ==> 123,456,789
    i. 1,234,567,89 . ==> 123,456,789
    i. 1_234,567_89 . ==> 123,456,789
    i. 1,234_567,89 . ==> 123,456,789
    +ofmt
    i. 1234567.89 d. ==> 123456789
    i. 1,234,567.89 d. ==> 123456789
    i. 1_234_567.89 d. ==> 123456789
    i. 1,234_567.89 d. ==> 123456789
    i. 1.234.567.89 d. ==> 123456789
    -ofmt
    i. 1234567.89 d. ==> 123,456,789
    i. 1,234,567.89 d. ==> 123,456,789
    i. 1_234_567.89 d. ==> 123,456,789
    i. 1,234_567.89 d. ==> 123,456,789
    i. 1.234.567.89 d. ==> 123,456,789
    +ofmt
    i. dcx 123456789 hex . ==> 75BCD15
    i. dcx 1,234,567,89 hex . ==> 75BCD15
    i. dcx 1_234,567_89 hex . ==> 75BCD15
    i. dcx 1,234_567,89 hex . ==> 75BCD15
    -ofmt
    i. dcx 123456789 hex . ==> 0x075B_CD15
    i. dcx 1,234,567,89 hex . ==> 0x075B_CD15
    i. dcx 1_234,567_89 hex . ==> 0x075B_CD15
    i. dcx 1,234_567,89 hex . ==> 0x075B_CD15
    +ofmt
    i. dcx 1234567.89 hex d. ==> 75BCD15
    i. dcx 1,234,567.89 hex d. ==> 75BCD15
    i. dcx 1_234_567.89 hex d. ==> 75BCD15
    i. dcx 1,234_567.89 hex d. ==> 75BCD15
    i. dcx 1.234.567.89 hex d. ==> 75BCD15
    -ofmt
    i. dcx 1234567.89 hex d. ==> 0x075B_CD15
    i. dcx 1,234,567.89 hex d. ==> 0x075B_CD15
    i. dcx 1_234_567.89 hex d. ==> 0x075B_CD15
    i. dcx 1,234_567.89 hex d. ==> 0x075B_CD15
    i. dcx 1.234.567.89 hex d. ==> 0x075B_CD15
    --
    -- Field and format
    --
    25 fld ! \ field width 25 characters
    +ofmt
    i. dcx 123456789 x. ==> 123456789
    i. dcx 123456789 hex x. ==> 75BCD15
    i. dcx 123456789. hex dx. ==> 75BCD15.
    i. dcx 12345.6789 hex dx. ==> 75B.CD15
    i. dcx -1 hex ux. ==> FFFFFFFF
    i. dcx -1. hex udx. ==> FFFFFFFFFFFFFFFF.
    -ofmt
    i. dcx 123456789 x. ==> 123,456,789
    i. dcx 123456789 hex x. ==> 0x075B_CD15
    i. dcx 123456789. hex dx. ==> 0x075B_CD15.
    i. dcx 12345.6789 hex dx. ==> 0x075B.CD15
    i. dcx -1 hex ux. ==> 0xFFFF_FFFF
    i. dcx -1. hex udx. ==> 0xFFFF_FFFF_FFFF_FFFF.

    -fin-
    --
    me
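
For readers without that system at hand, the formatted output and the separator-tolerant input in the listing above can be approximated in Python. This is a rough sketch under stated assumptions: `grouped` and `strip_separators` are hypothetical names, only bases 10 and 16 are covered, and the handling of the period as a double-number marker is left out.

```python
def strip_separators(token, seps=",_"):
    """Drop comma and underscore separators from a numeric token."""
    return "".join(c for c in token if c not in seps)


def grouped(n, base=10):
    """Format n roughly like the '-ofmt' lines: decimal with commas, hex
    zero-padded to whole groups of four digits with a 0x prefix."""
    sign = "-" if n < 0 else ""
    n = abs(n)
    if base == 10:
        return sign + format(n, ",")
    if base != 16:
        raise ValueError("only bases 10 and 16 are sketched here")
    digits = format(n, "X")
    digits = digits.zfill(-(-len(digits) // 4) * 4)   # pad to a multiple of 4
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    return sign + "0x" + "_".join(groups)


n = int(strip_separators("1_234,567_89"))
print(n)                 # 123456789
print(grouped(n))        # 123,456,789
print(grouped(n, 16))    # 0x075B_CD15
print(grouped(-1, 16))   # -0x0001
```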

  • From Rick C@21:1/5 to Zbig on Mon Aug 1 11:50:40 2022
    On Monday, August 1, 2022 at 7:48:30 AM UTC-4, Zbig wrote:
    Like this:
    284 985 000 234,23
    or '284 985 000 234.23' depending on locale?

    '284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.
    I was trying to explain, that there are EXACTLY THE SAME „problems
    to resolve” whether you connect the 3-digits groups with underscore,
    or with VT — but in latter case it just... looks better.

    There are two HUGE differences in the two proposals. When you use an underscore, every editor in the world will work with that, while some editors may not work with the VT character. The other is that when looking at code, humans can't tell the
    difference between multiple numbers and a single number. Since VT gives the appearance of a space, there's no way for a human to tell what is in the code.

    This would make Forth the ultimate write-only language.

    --

    Rick C.

  • From P Falth@21:1/5 to dxforth on Mon Aug 22 06:59:52 2022
    On Saturday, 23 July 2022 at 08:24:09 UTC+2, dxforth wrote:
    32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

    I got interested in this suggestion and implemented it.
    I thought the underscore was a bit ugly, so I implemented a word to set the grouping char.

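    : SET-GROUPING-CHAR ( xchar -- )
      0 grping !                \ clear the stored grouping char
      dup 32 > and              \ keep xchar only if it is above space, else 0
      grping xc!+ drop ;        \ store the encoded xchar, discard the end address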

    I also set the group size based on BASE:
    decimal and octal group 3 digits,
    hex 4 and binary 8.
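
The idea of a settable grouping character with a group size that follows the number base can be sketched in Python as follows. This is not Peter's code: the `Grouping` class and its method names are hypothetical, and the only details taken from the post are the group sizes per base and the rule that a character at or below code 32 switches grouping off.

```python
GROUP_SIZE = {2: 8, 8: 3, 10: 3, 16: 4}   # digits per group, keyed by base
DIGITS = "0123456789ABCDEF"


class Grouping:
    def __init__(self, char="\u00b4"):      # acute accent by default
        self.set_char(char)

    def set_char(self, char):
        # Mirrors `dup 32 > and`: anything at or below space disables grouping.
        self.char = char if char and ord(char) > 32 else ""

    def parse(self, token, base=10):
        """Convert a token to an integer, ignoring the grouping character."""
        digits = token.replace(self.char, "") if self.char else token
        return int(digits, base)

    def show(self, n, base=10):
        """Render a non-negative integer with digits grouped from the right."""
        if n < 0:
            raise ValueError("sketch handles non-negative numbers only")
        digits = ""
        while True:
            n, r = divmod(n, base)
            digits = DIGITS[r] + digits
            if n == 0:
                break
        if not self.char:
            return digits
        size = GROUP_SIZE[base]
        head = len(digits) % size or size    # length of the leftmost group
        groups = [digits[:head]]
        groups += [digits[i:i + size] for i in range(head, len(digits), size)]
        return self.char.join(groups)


g = Grouping()
print(g.show(g.parse("123\u00b4456\u00b4789")))   # 123´456´789
g.set_char("_")
print(g.show(0xFFFFFFFF, 16))                     # FFFF_FFFF
```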

    After that I started testing different chars. Today I use ´ ( $B4 acute accent)
    I think that ties the numbers together while _ puts them apart

    123´456´789 ok.
    . 123´456´789 ok
    '_' set-grouping-char ok
    123_456_789 ok.
    . 123_456_789 ok

    I also tried out the space as suggested by Zbig but not using VT.
    At codepoint $A0 there is a non breaking space char

    $a0 set-grouping-char ok
    123456789 ok.
    . 123 456 789 ok

    It gets more difficult to input without remapping a key.
    ´ is nice as it is (on my Swedish keyboard) next to the + key on the top row, with no shift or alt key needed to input it.

    But using the non breaking space I can now make words with spaces in them!

    : Hej Peter ." Ciao Peter" ; ok
    Hej Peter Ciao Peter ok

    This of course looks even more confusing than spaces in numbers!
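
Why a non-breaking space can end up inside a word name can be illustrated with a small tokenizer. The sketch assumes a parser that treats only characters with code 32 or below as delimiters, which is a common but implementation-dependent choice; `forth_tokens` is a hypothetical name.

```python
def forth_tokens(line):
    """Split on characters with code <= 32 only (space and control characters).
    This delimiter rule is an assumption about a simple Forth parser."""
    tokens, current = [], ""
    for c in line:
        if ord(c) <= 32:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += c
    if current:
        tokens.append(current)
    return tokens


line = ": Hej\u00a0Peter .\" Ciao Peter\" ;"
print(forth_tokens(line))   # 'Hej\xa0Peter' stays one token
print(line.split())         # Python's Unicode-aware split breaks it in two
```

Under the same assumption a VT (code 11) would act as a delimiter, so a parser of this kind would need a special case before VT could join digit groups.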

    For me this improves readability enormously! Thanks for the suggestion.

    BR
    Peter

  • From minforth@arcor.de@21:1/5 to P Falth on Mon Aug 22 07:44:00 2022
    P Falth wrote on Monday, 22 August 2022 at 15:59:54 UTC+2:
    For me this improves readability enormously! Thanks for the suggestion.


    Fine! I am just wondering if ´, i.e. $B4, is the same in most codepages/locales.

  • From Zbig@21:1/5 to All on Mon Aug 22 10:34:08 2022
    : Hej Peter ." Ciao Peter" ; ok
    Hej Peter Ciao Peter ok

    This of course looks even more confusing then spaces in numbers!

    It may look confusing in your simplistic example. When you paste it
    like this, it is indeed difficult to tell what is a Forth word and what is the effect of its execution. Still, it doesn't have to look confusing in a
    real program / Forth screen.

  • From P Falth@21:1/5 to minf...@arcor.de on Mon Aug 22 13:50:53 2022
    On Monday, 22 August 2022 at 16:44:02 UTC+2, minf...@arcor.de wrote:
    Fine! I am just wondering if ´ ie $B4 is the same in most codepages/locales.

    My systems require input to be utf8 encoded Unicode and will output utf8 streams.
    It has worked for over 20 years like that on both Windows and Linux.
    ´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
    Is there any reason to not use Unicode and utf8 today on Windows and Linux?

  • From dxforth@21:1/5 to P Falth on Tue Aug 23 13:21:30 2022
    On 23/08/2022 06:50, P Falth wrote:
    ...
    My systems require input to be utf8 encoded Unicode and will output utf8 streams.
    It has worked for over 20 years like that on both Windows and Linux.
    ´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
    Is there any reason to not use Unicode and utf8 today on Windows and Linux?

    String literals and comment fields excepted, there's not a lot of reason to
    use UTF-8 in programming code.

    Underscore in numbers is about convention. Several programming languages have adopted it as a programmer convenience. It might bemuse other languages to know Forth had no problem giving comma et al new meanings but drew the line at underscore.

  • From none) (albert@21:1/5 to peter.m.falth@gmail.com on Tue Aug 23 12:23:18 2022
    In article <8619d421-e8b5-4075-8dec-3813a60f6f8cn@googlegroups.com>,
    P Falth <peter.m.falth@gmail.com> wrote:
    My systems require input to be utf8 encoded Unicode and will output utf8 streams.
    It has worked for over 20 years like that on both Windows and Linux.
    ´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
    Is there any reason to not use Unicode and utf8 today on Windows and Linux?

    There is a good reason to junk { BL WORD } in favor of TOKEN / NAME or whatever.

    NAME ( -- addr n ) get a blank surrounded token from the input stream
    with appropriate side effects on the input stream.

    The only requirement that looking up -- SEARCH-LIST -- has to meet is to find this string, whatever its content.
    Then the encoding of the characters shouldn't be a concern of the Forth system.
    In fact I have used escape sequences as Forth words. Their action is what the function keys that generate those codes are supposed to do.

    The talk about character encodings can be separated from dictionary and
    word lookups.
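
The separation argued for here can be pictured as a dictionary keyed by the raw bytes of a name, with no interpretation of the encoding. This is a toy model and not how any particular Forth stores its headers; `define` and `execute` are hypothetical helpers, and real systems often fold ASCII case during lookup, which the sketch ignores.

```python
dictionary = {}


def define(name, action):
    """Register an action under a name given as raw bytes."""
    dictionary[bytes(name)] = action


def execute(name):
    """Look the name up byte for byte and run its action."""
    return dictionary[bytes(name)]()


# A name containing a UTF-8 encoded non-breaking space.
define("Hej\u00a0Peter".encode("utf-8"), lambda: "Ciao Peter")
# An escape sequence (as sent by a cursor key on many terminals) as a name.
define(b"\x1b[A", lambda: "cursor up")

print(execute("Hej\u00a0Peter".encode("utf-8")))   # Ciao Peter
print(execute(b"\x1b[A"))                          # cursor up
```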

    Groetjes Albert
    --
    "in our communism country Viet Nam, people are forced to be
    alive and in the western country like US, people are free to
    die from Covid 19 lol" duc ha
    albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

  • From Anton Ertl@21:1/5 to albert@cherry. on Tue Aug 23 10:40:24 2022
    albert@cherry.(none) (albert) writes:
    There is a good reason to junk { BL WORD } in favor of TOKEN / NAME or whatever.

    NAME ( -- addr n ) get a blank surrounded token from the input stream
    with appropriate side effects on the input stream.

    PARSE-NAME has been standardized: <https://forth-standard.org/standard/core/PARSE-NAME>

    Then encoding of the characters shouldn't be a concern of the Forth system.

    UTF-8 worked nicely in the systems I tried that were not designed for
    it, with two exceptions: Editing on the command line did not work
    properly; and pointing out the error on a line did not work properly.
    Parsing worked fine.

    The virtue of UTF-8 is that it works well for most code that is
    written for handling ASCII, and that's what we see.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net

  • From Bernd Linsel@21:1/5 to dxforth on Tue Aug 23 17:46:52 2022
    On 23.08.2022 05:21, dxforth wrote:
    On 23/08/2022 06:50, P Falth wrote:
    ... My systems require input to be utf8 encoded Unicode and will
    output utf8 streams. It has worked for over 20 years like that on
    both Windows and Linux. ´ at $B4 is present in Windows 1252 and
    Linux Latin 1 codepages. Is there any reason to not use Unicode and
    utf8 today on Windows and Linux?

    String literals and comment fields excepted, there's not a lot of
    reason to use UTF-8 in programming code.

    Underscore in numbers is about convention. Several programming
    languages have adopted it as a programmer convenience. It might
    bemuse other languages to know Forth had no problem giving comma et
    al new meanings but drew the line at underscore.


    I really do like writing literals in sources in UTF-8, since my system
    fully supports it and has not the faintest will to use antiquated or
    strange things like CP1252, ISO-8859-x or UTF-16. But one quickly gets
    used to writing sources in ASCII with hex escapes again when
    collaborating with Windows people who are not willing or able to save
    edited files as UTF-8, so that all your special characters (for me,
    especially measurement units containing characters like U+00B0
    (degree sign), U+00B5 (micro sign, for the micro prefix) and U+202F
    (narrow no-break space between value and measurement unit)) are lost
    every time one of these moron^H^H^H^H^Hfolks changes something.

  • From minforth@arcor.de@21:1/5 to Bernd Linsel on Tue Aug 23 09:50:12 2022
    Bernd Linsel wrote on Tuesday, 23 August 2022 at 17:46:57 UTC+2:
    [..]

    Not wanting to contradict, but lots of Forth programs run on small systems where UTF-8 is not present, even when the programs are developed on feature-rich desktops.

  • From Marcel Hendrix@21:1/5 to none albert on Tue Aug 23 10:11:53 2022
    On Tuesday, August 23, 2022 at 12:23:22 PM UTC+2, none albert wrote:
    [..]
    123麓456麓789 ok.
    . 123麓456麓789 ok
    [..]
    麓at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
    Is there any reason to not use Unicode and utf8 today on Windows and Linux?

    According to the quoted stuff, quite a few :--)

    -marcel

  • From Anton Ertl@21:1/5 to minf...@arcor.de on Wed Aug 24 07:38:19 2022
    "minf...@arcor.de" <minforth@arcor.de> writes:
    Not wanting to contradict, but lots of Forth programs run on small systems where UTF-8 is not present, even when the programs are developed on feature-rich desktops.

    UTF-8-encoded strings are just sequences of bytes for nearly all the
    code that deals with it. That's why it works so well for code that
    has been written for ASCII; that's also just bytes. So a small system
    has no problem dealing with UTF-8, and a statement like "UTF-8 is not
    present" makes little sense.
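
The "just bytes" point can be checked with a short Python sketch. It assumes nothing beyond the standard UTF-8 and Latin-1 codecs; the sample line is made up for illustration.

```python
text = "123\u00b4456\u00b4789 . cr"
raw = text.encode("utf-8")

# Byte-wise split on the ASCII space, as code written for ASCII would do.
tokens = raw.split(b" ")
print(tokens)                      # [b'123\xc2\xb4456\xc2\xb4789', b'.', b'cr']

# Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it can never be
# mistaken for an ASCII delimiter or an ASCII digit.
accent = "\u00b4".encode("utf-8")
print(accent, all(b >= 0x80 for b in accent))   # b'\xc2\xb4' True

# The same character is the single byte $B4 in Latin-1.
print("\u00b4".encode("latin-1"))  # b'\xb4'

# Where it does matter: character count and byte count differ.
print(len(text), len(raw))         # 16 18
```

The last line hints at why positional features such as command-line editing or pointing at an error column need care even when parsing itself works unchanged.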

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2022: https://euro.theforth.net
