Forum: >>> Magnum BBS <<<

Underscoring numbers in Forth

From dxforth@21:1/5 to All on Sat Jul 23 16:24:07 2022

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

Usual suspects pre-answered.

Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number.
It's increasingly used in programming languages for this purpose. Even
XPL0 has it.

A. ANS didn't see the need for it.
Q. Are you married?

Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.

Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.

Q. Won't the HOLD buffer need to be larger to hold the punctuation?
A. Assuming worst case and one underscore per 4 characters, 20% larger.

Q. Is all this just c.l.f. speculation - or have you implemented it?
A. Implemented

Q. Has it broken anything?
A. Not AFAIK

Q. What did it cost?
A. 34 bytes on 8086, 39 bytes on 8080

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

Q. Will you keep it?
A. Good question. For 16-bit integers its value may be marginal. How often
do you enter values in binary?

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From dxforth@21:1/5 to dxforth on Sat Jul 23 16:30:58 2022

On 23/07/2022 16:24, dxforth wrote:

Q. Won't the HOLD buffer need to be larger to hold the punctuation?
A. Assuming worst case and one underscore per 4 characters, 20% larger.

Q. Hang on - doesn't the buffer hold the _converted_ string?
A. Correct. The HOLD buffer doesn't need to be larger.


From Marcel Hendrix@21:1/5 to dxforth on Sat Jul 23 01:02:26 2022

On Saturday, July 23, 2022 at 8:24:09 AM UTC+2, dxforth wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

Usual suspects pre-answered.

Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even
XPL0 has it.

Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

[..]

Q. Has it broken anything?
A. Not AFAIK

[..]

What exactly is your idea?

"... certain punctuation e.g. underscore ... "

I guess you are talking about integer single precision, i.e. you want
_1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
in the current BASE? This_is_dead_beef ?

When >NUMBER doesn't handle it, how does it get recognized as an
integer by the rest of the system? Why not have the application filter
it when it wants to support this?

-marcel


From albert@21:1/5 to mhx@iae.nl on Sat Jul 23 12:16:46 2022

In article <7ac8f1a1-c173-4dff-930f-2e29aa5990ccn@googlegroups.com>,
Marcel Hendrix <mhx@iae.nl> wrote:

On Saturday, July 23, 2022 at 8:24:09 AM UTC+2, dxforth wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

Usual suspects pre-answered.

Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even
XPL0 has it.

Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

[..]

Q. Has it broken anything?
A. Not AFAIK

[..]

What exactly is your idea?

"... certain punctuation e.g. underscore ... "

I guess you are talking about integer single precision, i.e. you want
_1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
in the current BASE? This_is_dead_beef ?

When >NUMBER doesn't handle it, how does it get recognized as an
integer by the rest of the system? Why not have the application filter
it when it wants to support this?

>NUMBER is carefully designed to be interruptible.

It could handle extra characters, e.g. a traditional use of
finding the place of the decimal point (for fixed point numbers).

0. "1111.1111" >NUMBER OVER C@ &. = IF OVER DPL ! /STRING THEN >NUMBER

Handling _ without changing >NUMBER, but still using it, is left as an
exercise for the reader.

I admit that >NUMBER is a reasonable factor, but I don't care a bit
about the suggestion to use it in a Forth kernel (politically correct Forth).
So it is not used in ciforth, and could be relegated to a loadable extension.

-marcel

&. is a notation that replace '.' in
A decimal point in the middle of a word is non-standard.

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


From Anton Ertl@21:1/5 to dxforth on Sat Jul 23 11:50:41 2022

dxforth <dxforth@gmail.com> writes:

32/64-bit machines have increased the risk of entering numbers incorrectly.

And reading the entered numbers. Who can tell quickly what order of
magnitude 1000000000000 has?

It's also about outputted numbers. Yes, I can write an output routine
that outputs 8_888_888_888_888 for readability, but if I cannot cut
that number and paste it back in (which I occasionally have to do), I
shy away from that.
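
An output routine of the kind described here can be sketched with standard pictured numeric output. This is only an illustration, not Gforth code: the names #group, ud>str_ and ud._ are made up for the example, and grouping by three digits is an assumption.

```forth
\ Sketch only: #group, ud>str_ and ud._ are hypothetical names.
: #group ( ud1 -- ud2 )
  \ Convert one digit, then up to two more while the remainder is non-zero.
  #
  2 0 do
    2dup or 0= if leave then
    #
  loop ;

: ud>str_ ( ud -- c-addr u )
  \ Digits are produced right to left; hold '_' between groups of three.
  <#
  begin
    #group
    2dup or            \ anything left to convert?
  while
    '_' hold
  repeat
  #> ;

: ud._ ( ud -- ) ud>str_ type space ;

8888888888888. ud._    \ prints 8_888_888_888_888
```

The separators occupy extra room in the pictured numeric output buffer, so in small bases such as binary a very long number could exceed the minimum size a system guarantees for that buffer.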

Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers?

It is already allowed to do that.

Usual suspects pre-answered.

Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number.
It's increasingly used in programming languages for this purpose. Even
XPL0 has it.

Very sensible. Who are you and what have you done to dxforth:-)

Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

I don't see strong reasons either way.

Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.

No such buffer is needed. That's the beauty of >NUMBER, which has
been designed for a very similar use case:

: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    \ like >number, but ignores _
    begin
        >number
        dup 0> while
            over c@ '_' = while
                1 /string
    repeat then ;
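
A short usage sketch of the word above may help. The definition is repeated so the block loads on its own; the wrapper s>ud_ is a hypothetical helper added for illustration, and the '_' character literal assumes a Forth-2012 system.

```forth
\ Definition repeated from the post so the block is self-contained.
: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  begin
    >number
    dup 0> while            \ characters left over?
    over c@ '_' = while     \ and the next one is an underscore?
    1 /string               \ skip it and resume conversion
  repeat then ;

\ Hypothetical helper: flag is true only if the whole string was consumed.
: s>ud_ ( c-addr u -- ud flag )
  0 0 2swap >number_ nip 0= ;

s" 1_000_000" s>ud_ . d.   \ prints the flag, then the converted value
```

Written this way the word also accepts leading, trailing and repeated underscores, which is the behaviour Marcel asks about earlier in the thread.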

Would the buffer option be smaller?

: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
\ not tested or debugged
holdbuf >r
begin
over c@ dup digit? if
drop r> c!+ r>
else
'_' <> if
holdbuf r> over - 2swap 2>r >number 2drop 2r> exit then
again ;

Do you manage any better?

Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.

Says who?

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

What makes you think so?

Q. Will you keep it?
A. Good question. For 16-bit integers its value may be marginal. How often
do you enter values in binary?

I have been thinking about adding this feature for a while. I expect
that I will do so at some point in the future.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html


From dxforth@21:1/5 to Marcel Hendrix on Sun Jul 24 00:10:27 2022

On 23/07/2022 18:02, Marcel Hendrix wrote:

What exactly is your idea?

"... certain punctuation e.g. underscore ... "

I guess you are talking about integer single precision, i.e. you want
_1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
in the current BASE? This_is_dead_beef ?

Any character string representing a number sent to the Forth interpreter.
The idea is to strip the underscores just before Forth tries to convert
the string to a number. The catch is that the string mustn't be found in the
dictionary, which is classically searched first. This effectively means you
can't use underscore in a word name, or you risk your number being found as a word.
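
The stripping step can be sketched as below. This is not dxforth's implementation: it copies into a separately allotted scratch buffer rather than the HOLD buffer, and the names nbuf, /nbuf and strip_ as well as the 64-character limit are assumptions made for the example.

```forth
64 constant /nbuf                 \ assumed maximum token length
create nbuf /nbuf chars allot     \ hypothetical scratch buffer

: strip_ ( c-addr1 u1 -- c-addr2 u2 )
  \ Copy the string into nbuf, dropping every underscore.
  /nbuf min                       \ truncate over-long input (sketch only)
  nbuf swap                       ( src dst u )
  0 ?do                           ( src dst )
    over i chars + c@             ( src dst c )
    dup '_' = if
      drop                        \ skip the underscore
    else
      over c! char+               \ store the character, advance dst
    then
  loop
  nip nbuf tuck - ;               ( nbuf u2 )

s" 12_345" strip_ 0 0 2swap >number 2drop d.   \ prints 12345
```

In an interpreter the call to strip_ would sit between the failed dictionary search and the existing number conversion, so the converter itself stays unchanged.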


From Zbig@21:1/5 to All on Sat Jul 23 07:28:33 2022

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

No such risk in case of underscore; to enter underscore character one has
to press Shift-Minus — it can be done only on purpose.
I believe one character that could be ignored the way you propose is space. When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input before final Enter-press.


From minforth@arcor.de@21:1/5 to Zbig on Sat Jul 23 14:45:41 2022

Zbig schrieb am Samstag, 23. Juli 2022 um 16:28:34 UTC+2:

32/64-bit machines have increased the risk of entering numbers incorrectly.
Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

No such risk in case of underscore; to enter underscore character one has
to press Shift-Minus — it can be done only on purpose.
I believe one character that could be ignored the way you propose is space. When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input before final Enter-press.

You are timidly entering the gritty realm of locales ... ;-)


From Zbig@21:1/5 to All on Sat Jul 23 15:37:41 2022

32/64-bit machines have increased the risk of entering numbers incorrectly.
Should the Forth interpreter be allowed to ignore certain punctuation e.g.
underscore in numbers? What would be the issues?

No such risk in case of underscore; to enter underscore character one has to press Shift-Minus — it can be done only on purpose.
I believe one character that could be ignored the way you propose is space.
When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input
before final Enter-press.

You are timidly entering the gritty realm of locales ... ;-)

Not quite. I'm of course aware, that some countries use comma and dot for said "thousand separators", but both comma and dot characters are usually interpreted
as "double" mark in Forth. So only the space can be used as "separator".


From dxforth@21:1/5 to Anton Ertl on Sun Jul 24 13:22:03 2022

On 23/07/2022 21:50, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:
...

Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.

No such buffer is needed. That's the beauty of >NUMBER, which has
been designed for a very similar use case:

: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    \ like >number, but ignores _
    begin
        >number
        dup 0> while
            over c@ '_' = while
                1 /string
    repeat then ;

The idea was to avoid separate number converters.

Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.

Says who?

Humans - who use the same mouth to eat and speak.

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

What makes you think so?

The 30 odd bytes I spent would be hard to beat.

My implementation is sound enough. It's the potential for underscored
numbers to collide with dictionary entries that's the problem. 200x
character literals have the same issue but there the risk is manageable
since it involves strings of 3 characters only one of which is variable.
That's what comes from trying to import foreign ideas into Forth.


From dxforth@21:1/5 to dxforth on Sun Jul 24 14:08:36 2022

On 24/07/2022 13:22, dxforth wrote:

It's the potential for underscored
numbers to collide with dictionary entries that's the problem.

Collisions might be reduced sufficiently by requiring underscored
numbers begin with an underscore. Not fool-proof but then neither
were 200x character literals.


From Ron AARON@21:1/5 to dxforth on Sun Jul 24 07:26:43 2022

On 23/07/2022 9:24, dxforth wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

I implemented underscores-in-numbers a while back in 8th, at no
perceivable cost. Makes large numbers much easier to understand.


From dxforth@21:1/5 to Ron AARON on Sun Jul 24 14:42:32 2022

On 24/07/2022 14:26, Ron AARON wrote:

On 23/07/2022 9:24, dxforth wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

I implemented underscores-in-numbers a while back in 8th, at no
perceivable cost. Makes large numbers much easier to understand.

What about dictionary collisions - or does 8th handle numbers differently?
Any class of number or just integers?


From Ron AARON@21:1/5 to dxforth on Sun Jul 24 08:15:28 2022

On 24/07/2022 7:42, dxforth wrote:

On 24/07/2022 14:26, Ron AARON wrote:

On 23/07/2022 9:24, dxforth wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

I implemented underscores-in-numbers a while back in 8th, at no
perceivable cost. Makes large numbers much easier to understand.

What about dictionary collisions - or does 8th handle numbers differently? Any class of number or just integers?

The dictionary is searched first, so : 123_456 ; will be found if
"123_456" is entered. Number parsing is attempted only after the word
lookup fails, so it's possible to override e.g. "8" if you wanted to.

Any kind of number allows the underscore, including "big integers" and
"big floats". The underscore is simply ignored inside number parsing.


From Stephen Pelc@21:1/5 to dxforth on Sun Jul 24 08:29:29 2022

On 24 Jul 2022 at 06:42:32 CEST, "dxforth" <dxforth@gmail.com> wrote:

I implemented underscores-in-numbers a while back in 8th, at no
perceivable cost. Makes large numbers much easier to understand.

What about dictionary collisions - or does 8th handle numbers differently? Any class of number or just integers?

Once you have decided that numbers should have an ignorable character
you might as well replace all occurrences of that literal by a variable.
Once you have a variable, you can now choose the ignorable character at
run-time, e.g.
    ':' ign-char !

You can use a similar mechanism for the DP and FP separators. Since
a variable is larger than a byte, you can treat the variables as n-char
arrays in which any match satisfies. VFX has used this mechanism for
decades to allow users to have locale-sensitive DP and FP numbers.
Since we made this change there have been no whines from the
standards lawyers and no technical support issues.
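
As a rough sketch of the variable-based approach (not VFX Forth's actual code), the skipping loop shown earlier in the thread can compare against a variable instead of a literal. Only the name ign-char comes from the post; >number-ign and the default of '_' are assumptions.

```forth
variable ign-char
'_' ign-char !                    \ assumed default

: >number-ign ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  \ Like >number, but skips whatever character ign-char currently holds.
  begin
    >number
    dup 0> while
    over c@ ign-char @ = while
    1 /string
  repeat then ;

':' ign-char !                    \ choose the ignorable character at run-time
0 0 s" 12:34" >number-ign 2drop d.   \ prints 1234
```

Source text written for one setting converts differently under another, which is the portability concern raised in the replies.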

Stephen
--
Stephen Pelc, stephen@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads


From Anton Ertl@21:1/5 to dxforth on Sun Jul 24 09:35:47 2022

dxforth <dxforth@gmail.com> writes:

On 23/07/2022 21:50, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:
...

Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.

No such buffer is needed. That's the beauty of >NUMBER, which has
been designed for a very similar use case:

: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    \ like >number, but ignores _
    begin
        >number
        dup 0> while
            over c@ '_' = while
                1 /string
    repeat then ;

The idea was to avoid separate number converters.

I have no idea what you mean with that.

Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.

Says who?

Humans - who use the same mouth to eat and speak.

And the relevance to conversion from strings to numbers and numbers to
strings is?

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

What makes you think so?

The 30 odd bytes I spent would be hard to beat.

Moving the goalposts? I did not ask about beating.

What makes you think that the cost would be higher rather than just the
same if one applies the same change to a pluggable number recognizer
rather than a hardwired one?

It's the potential for underscored
numbers to collide with dictionary entries that's the problem.

That's no problem, just like the potential for other numbers to
collide with dictionary entries is no problem:

Dictionary entries are searched first, so if you have a word _ or __
or _1 or 1_ etc., it will be found before the number recognizer tries
to convert it into a number. The conventional way to avoid a number
being shadowed by a dictionary entry is to start the number with one
of the digits 0-9 (and avoiding dictionary entries that start with
these digits, albeit there are some exceptions that prove this rule).

200x character literals have the same issue

It's the same non-issue for the same reason. And, guess what, no
problems have been reported to us, neither for the Forth-2012
character literals ('a', implemented in Gforth since 0.7 (2008)), nor
for Gforth's older syntax ('a, implemented in Gforth since the first
public release (1996)).

- anton

From Anton Ertl@21:1/5 to Stephen Pelc on Sun Jul 24 10:05:23 2022

Stephen Pelc <stephen@vfxforth.com> writes:

Once you have decided that numbers should have an ignorable character
you might as well replace all occurrences of that literal by a variable.
Once you have a variable, you can now choose the ignorable character at
run-time, e.g.
    ':' ign-char !

That may be a way for vendors to placate their customers if they all
want some different ignore-character, but if you want a common
language for exchanging libraries, studying programs etc, it's a bad
idea. And given that no viable ignore-character other than _
has been proposed (I don't consider space, comma, and dot to be
viable ignore-characters in Forth) despite the frequent urge to
bike-shed such small changes to death, why propose this misfeature in
your first posting on this topic?

Your example is especially nasty because ':' is a double indicator in SwiftForth. So someone following your suggestion would produce
programs that behave quite differently in SwiftForth.

You can use a similar mechanism for the DP and FP separators.

Also bad ideas for language commonality. If you want to accept
decimal comma, accept it in addition to the decimal point. No need
for variables.

Since we made this change there have been [...] no technical support issues.

If library authors made use of this misfeature, and an application
author would trip over that, would you get a support call? I guess,
though, that library authors are smart enough to stay clear of it.
But if a "feature" is best avoided, why provide it at all?

- anton

From albert@21:1/5 to zbigniew2011@gmail.com on Sun Jul 24 14:03:28 2022

In article <034d9af5-154a-431d-a469-f7054b9c0bb1n@googlegroups.com>,
Zbig <zbigniew2011@gmail.com> wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

No such risk in case of underscore; to enter underscore character one has
to press Shift-Minus — it can be done only on purpose.
I believe one character that could be ignored the way you propose is space. When entering long numbers it may be comfortable to, for example, separate thousands by adding single space among them. It's easier to check the input before final Enter-press.

Using spaces in numbers? In Forth this is a bad idea.
Underscores, yes.

Groetjes Albert

From Zbig@21:1/5 to All on Sun Jul 24 05:33:31 2022

Using spaces in numbers? In Forth this is a bad idea.
Underscores, yes.

Who, apart from a Forth programmer, will use an underscore when entering
any number?


From Anton Ertl@21:1/5 to Zbig on Sun Jul 24 12:53:48 2022

Zbig <zbigniew2011@gmail.com> writes:

Who, apart from a Forth programmer, will use an underscore when entering
any number?

According to
<https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

|maritime "21_450"

and (more relevant):

|Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
|Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
|version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
|Swift use the underscore (_) character for this purpose

- anton

From Zbig@21:1/5 to All on Sun Jul 24 06:19:15 2022

Who, apart from a Forth programmer, will use an underscore when entering
any number?

According to <https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

|maritime "21_450"

Correction: apart from a Forth programmer and a sailor.

and (more relevant):

|Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
|Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
|version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
|Swift use the underscore (_) character for this purpose

OK, so I'm asking the same question of the creators of Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1), Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from version 7.4[36]), Ruby, Go (from
version 1.13), Rust, Julia, and Swift: who, apart from sailors and
programmers who were told "use underscore", actually uses an
underscore when entering numbers?

The question is serious; I have never seen anyone use an underscore
when entering a number. Well, maybe it is indeed commonly used somewhere
for that purpose (I had no idea that some sailors use it).


From Ron AARON@21:1/5 to Anton Ertl on Sun Jul 24 16:26:54 2022

On 24/07/2022 15:53, Anton Ertl wrote:

Zbig <zbigniew2011@gmail.com> writes:

Who, apart from a Forth programmer, will use an underscore when entering
any number?

According to
<https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

|maritime "21_450"

and (more relevant):

|Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
|Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
|version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
|Swift use the underscore (_) character for this purpose

- anton

Indeed; it was because someone asked for it based on Python's example,
that I did eventually add it into 8th.


From dxforth@21:1/5 to Anton Ertl on Mon Jul 25 00:02:25 2022

On 24/07/2022 19:35, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:

On 23/07/2022 21:50, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:
...

Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.

No such buffer is needed. That's the beauty of >NUMBER, which has
been designed for a very similar use case:

: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    \ like >number, but ignores _
    begin
        >number
        dup 0> while
            over c@ '_' = while
                1 /string
    repeat then ;

The idea was to avoid separate number converters.

I have no idea what you mean with that.

You just created one. Will you create another for floats?

Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.

Says who?

Humans - who use the same mouth to eat and speak.

And the relevance to conversion from strings to numbers and numbers to strings is?

I see no reason for them to collide.

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

What makes you think so?

The 30 odd bytes I spent would be hard to beat.

Moving the goalposts? I did not ask about beating.

What makes you think that the cost would be higher rather than just the
same if one applies the same change to a pluggable number recognizer
rather than a hardwired one?

Feel free to show the code for the plug-in.

It's the potential for underscored
numbers to collide with dictionary entries that's the problem.

That's no problem, just like the potential for other numbers to
collide with dictionary entries is no problem:

Dictionary entries are searched first, so if you have a word _ or __
or _1 or 1_ etc., it will be found before the number recognizer tries
to convert it into a number. The conventional way to avoid a number
being shadowed by a dictionary entry is to start the number with one
of the digits 0-9 (and avoiding dictionary entries that start with
these digits, albeit there are some exceptions that prove this rule).

Fair enough

200x character literals have the same issue

It's the same non-issue for the same reason. And, guess what, no
problems have been reported to us, neither for the Forth-2012
character literals ('a', implemented in Gforth since 0.7 (2008)), nor
for Gforth's older syntax ('a, implemented in Gforth since the first
public release (1996)).

- anton


From Zbig@21:1/5 to All on Sun Jul 24 07:17:59 2022

|Swift use the underscore (_) character for this purpose

BTW: I think if „space” is too difficult to use it as „thousand separator”,
ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space


From Stephen Pelc@21:1/5 to All on Sun Jul 24 14:28:17 2022

On 24 Jul 2022 at 12:05:23 CEST, "Anton Ertl" <Anton Ertl> wrote:

Your response is a typical "not invented here" response.

The DP and FP character definitions solve a *real* issue in that the
Forth standard approach cannot be used for real-world data entry.
The DP and FP char solution allows the double and FP data entry
routines to be used for data entry in various locales.

Your example is especially nasty because ':' is a double indicator in SwiftForth. So someone following your suggestion would produce
programs that behave quite differently in SwiftForth.

I have a dispute resolution protocol in a contract (yes, really) that includes the line:
"Dispute resolution processes include the consumption of alcoholic
beverages, food and laughter."
Leon at Forth Inc and I are perfectly capable of finding a resolution.

Also bad ideas for language commonality. If you want to accept
decimal comma, accept it in addition to the decimal point. No need
for variables.

I think that you do not understand locales.

Stephen


From Anton Ertl@21:1/5 to dxforth on Sun Jul 24 14:29:47 2022

dxforth <dxforth@gmail.com> writes:

On 24/07/2022 19:35, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:

The idea was to avoid separate number converters.

I have no idea what you mean with that.

You just created one. Will you create another for floats?

I have no such plans. But now I know what you mean.

And the relevance to conversion from strings to numbers and numbers to
strings is?

I see no reason for them to collide.

I do: Putting debugging output in string->number conversion words.

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

What makes you think so?

The 30 odd bytes I spent would be hard to beat.

Moving the goalposts? I did not ask about beating.

What makes you think that the cost would be higher rather than just the
same if one applies the same change to a pluggable number recognizer
rather than a hardwired one?

Feel free to show the code for the plug-in.

This won't help at all, because it is not changed:

: rec-num ( addr u -- n/d table | notfound ) \ gforth-experimental
    \G converts a number to a single/double integer
    snumber? dup
    IF
        0> IF ['] recognized-dnum ELSE ['] recognized-num THEN EXIT
    THEN
    drop ['] notfound ;

The change is in s>unumber?, which is called (with one intermediate)
by snumber?. Both snumber? and s>unumber? already exist in
gforth-0.7, i.e., before recognizers. And the change consists of
replacing a call to >NUMBER with a call to >NUMBER_.

- anton

From Anton Ertl@21:1/5 to Stephen Pelc on Sun Jul 24 15:27:24 2022

Stephen Pelc <stephen@vfxforth.com> writes:

On 24 Jul 2022 at 12:05:23 CEST, "Anton Ertl" <Anton Ertl> wrote:

Your response is a typical "not invented here" response.

Starting out with name-calling is a typical defense of someone who
does not have convincing arguments to support his position.

The DP and FP character definitions solve a *real* issue in that the
Forth standard approach cannot be used for real-world data entry.
The DP and FP char solution allows the double and FP data entry
routines to be used for data entry in various locales.

One question is if the source code should be subject to the current
locale. This would mean that when using the program in another
locale, it would have to be changed.

All programming languages I know of define their source code
independent of the locale. The exception was Algol 60, which did not
define the computer representation of their source code at all.

User input is generally something different from source code, although
in interactive languages there can be some overlap. This has not
caused interactive languages to change their source code
interpretation depending on locale.

Your example is especially nasty because ':' is a double indicator in
SwiftForth. So someone following your suggestion would produce
programs that behave quite differently in SwiftForth.

I have a dispute resolution protocol in a contract (yes, really) that includes the line:
"Dispute resolution processes include the consumption of alcoholic
beverages, food and laughter."
Leon at Forth Inc and I are perfectly capable of finding a resolution.

Your suggestion might be picked up by some programmer who then
produces a program, and some other user may pick the source code up
and use it in SwiftForth, and waste quite a bit of time trying to find
out what's wrong. I don't think your dispute resolution protocol is
going to help him.

Also bad ideas for language commonality. If you want to accept
decimal comma, accept it in addition to the decimal point. No need
for variables.

I think that you do not understand locales.

Elucidate me!

I just looked at the 358 locale files in /usr/share/i18n/locales on my
Debian 11 installation, and found three decimal_point and
mon_decimal_point values:

.
,
<U066B> (in fa_IR (Persian (Iran)) and ps_AF (Pashto (Afghanistan)))

The latter is an extended character that takes two bytes in UTF-8, so
your variable approach cannot deal with it (if I understand it
correctly).

So treating both '.' and ',' as double-defining characters (what
SwiftForth does) covers all the locales that your variable approach
can cover.

For thousands_sep the variants are:

""
","
"."
<U066C> (Arabic Thousands Separator)
<U2019> (Right Single Quotation Mark)
<U202F> (Narrow No-Break Space)

I have now added a locale PROG that uses _ as thousands separator, so
I can do things like

LC_NUMERIC=PROG.utf8 perf stat true

and it outputs lines like

754_957 cycles # 3.667 GHz
562_046 instructions # 0.74 insn per cycle
112_617 branches # 546.987 M/sec
4_530 branch-misses # 4.02% of all branches

and I can then paste these numbers into Forth and compute with them.

- anton

From dxforth@21:1/5 to Anton Ertl on Mon Jul 25 02:28:21 2022

On 25/07/2022 00:29, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:

On 24/07/2022 19:35, Anton Ertl wrote:

dxforth <dxforth@gmail.com> writes:

The idea was to avoid separate number converters.

I have no idea what you mean with that.

You just created one. Will you create another for floats?

I have no such plans. But now I know what you mean.

And the relevance to conversion from strings to numbers and numbers to
strings is?

I see no reason for them to collide.

I do: Putting debugging output in string->number conversion words.

That's akin to typing:

BL WORD name FIND

and expecting it to work. Forth at fault - or the operator for not understanding his tools?

What makes you think that the cost would be higher rather than just the
same if one applies the same change to a pluggable number recognizer
rather than a hardwired one?

Feel free to show the code for the plug-in.

This won't help at all, because it is not changed:

: rec-num ( addr u -- n/d table | notfound ) \ gforth-experimental
    \G converts a number to a single/double integer
    snumber? dup
    IF
        0> IF ['] recognized-dnum ELSE ['] recognized-num THEN EXIT
    THEN
    drop ['] notfound ;

The change is in s>unumber?, which is called (with one intermediate)
by snumber?. Both snumber? and s>unumber? already exist in
gforth-0.7, i.e., before recognizers. And the change consists of
replacing a call to >NUMBER with a call to >NUMBER_.

Wanting to handle all numbers, my hardwired solution was:

; strip ( c-addr u c-addr2 -- c-addr3 u3 )

        hdr     x,'STRIP',,1
strip:  pop     di
        pop     cx
        pop     bx
        add     bx,cx           ; start at end
        sub     dx,dx
strip1: jcxz    strip3
        dec     bx              ; builds down
        mov     al,[bx]
        cmp     al,'_'
        jz      strip2
        dec     di
        mov     [di],al
        inc     dx
strip2: dec     cx
        jmp     strip1
strip3: push    di
        push    dx
        nextt
; 28 bytes

\ forth number interpreter
: number ( c-addr -- n|d|r xt )
  count

  pad    \ *new* HOLD buffer end+1
  strip  \ *new* move string to HOLD buffer sans underscore
\ 4 bytes

If someone finds the strategy I employed is broken, I'll take my licks.

From dxforth@21:1/5 to dxforth on Mon Jul 25 14:07:48 2022

On 25/07/2022 02:28, dxforth wrote:

Wanting to handle all numbers, my hardwired solution was:

; strip ( c-addr u c-addr2 -- c-addr3 u3 )

        hdr     x,'STRIP',,1
strip:  pop     di
        pop     cx
        pop     bx
        add     bx,cx           ; start at end
        sub     dx,dx
strip1: jcxz    strip3
        dec     bx              ; builds down
        mov     al,[bx]
        cmp     al,'_'
        jz      strip2
        dec     di
        mov     [di],al
        inc     dx
strip2: dec     cx
        jmp     strip1
strip3: push    di
        push    dx
        nextt

I've replaced the assembler routine above as there was no check for overflow. Stack effects have changed as HOLD buffer is automatically referenced.

: strip ( c-addr u -- c-addr2 u2 )
  <# 2dup 1- over + do
    i c@ [char] _ over - if hold else drop then
  -1 +loop #> ;

Won't work on zero-length strings but irrelevant here.

From albert@21:1/5 to zbigniew2011@gmail.com on Mon Jul 25 10:33:06 2022

In article <b087485d-393c-4e22-acf0-a98d20301fben@googlegroups.com>,
Zbig <zbigniew2011@gmail.com> wrote:

|Swift use the underscore (_) character for this purpose

BTW: I think if „space” is too difficult to use it as „thousand separator”,
ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space

Terrible bad idea, because it cannot be visually discerned from a space.
Tab is not a glyph, but a control of mechanical typewriters.

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

From Zbig@21:1/5 to All on Mon Jul 25 01:53:15 2022

BTW: I think if „space” is too difficult to use it as „thousand separator”,
ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space

Terrible bad idea, because it cannot be visually discerned from a space.
Tab is not a glyph, but a control of mechanical typewriters.

Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling nonexistent hardware that is no longer available.

From Ron AARON@21:1/5 to Zbig on Mon Jul 25 12:22:28 2022

On 25/07/2022 11:53, Zbig wrote:

BTW: I think if „space” is too difficult to use it as „thousand separator”,
ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space

Terrible bad idea, because it cannot be visually discerned from a space.
Tab is not a glyph, but a control of mechanical typewriters.

Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling nonexistent hardware that is no longer available.

True enough; but the point that you can't see it unless a special font
is used, is a valid one.

From Zbig@21:1/5 to All on Mon Jul 25 02:28:17 2022

I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space

Terrible bad idea, because it cannot be visually discerned from a space.
Tab is not a glyph, but a control of mechanical typewriters.

Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling nonexistent hardware that is no longer available.

True enough; but the point that you can't see it unless a special font
is used, is a valid one.

Maybe my assumption is different, but actually I don't see any need to make
it visible. I treat it as a kind of „hard space” („non-breakable”), as sometimes used
in text editors. I see its supposed invisibility rather as an advantage.
Of course, if for some particular reason that character should be visible, underscore
may be good enough.

From Rick C@21:1/5 to Zbig on Mon Jul 25 09:05:29 2022

On Monday, July 25, 2022 at 5:28:18 AM UTC-4, Zbig wrote:

I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space

Terrible bad idea, because it cannot be visually discerned from a space.
Tab is not a glyph, but a control of mechanical typewriters.

Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling nonexistent hardware that is no longer available.

True enough; but the point that you can't see it unless a special font
is used, is a valid one.

Maybe my assumption is different, but actually I don't see any need to make
it visible. I treat it as a kind of „hard space” („non-breakable”), as sometimes used
in text editors. I see its supposed invisibility rather as an advantage.
Of course, if for some particular reason that character should be visible, underscore
may be good enough.

I don't follow this. The entire point of a thousands separator is to facilitate humans reading large numbers or small fractions. Wouldn't this separator be ignored by any computer reading the number?

--

Rick C.

-- Get 1,000 miles of free Supercharging
-- Tesla referral code - https://ts.la/richard11209

From Rick C@21:1/5 to Zbig on Mon Jul 25 09:02:39 2022

On Monday, July 25, 2022 at 4:53:17 AM UTC-4, Zbig wrote:

BTW: I think if „space” is too difficult to use it as „thousand separator”,
ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
— it's practically unused anywhere
— it could be entered with, say, Shift-Space
— it could be displayed as, guess what, just a single space

Terrible bad idea, because it cannot be visually discerned from a space.
Tab is not a glyph, but a control of mechanical typewriters.

Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling nonexistent hardware that is no longer available.

That is true, but a thousands separator is not one of them because it's not a printable character. If it can't be printed, no human can easily discern it in the character stream without a neural connection to a magnetometer.

--

Rick C.

From Zbig@21:1/5 to All on Mon Jul 25 09:10:22 2022

Like this:
284 985 000 234,23

From Zbig@21:1/5 to All on Mon Jul 25 09:07:56 2022

The entire point of a thousands separator is to facilitate humans
reading large numbers or small fractions.

Speaking for myself: I use space for this, and it works for me.

From Anton Ertl@21:1/5 to Zbig on Mon Jul 25 17:02:11 2022

Zbig <zbigniew2011@gmail.com> writes:

284 985 000 234,23

Great. So how should a Forth text interpreter know that this is one
number, not four? And how should a human reading this as Forth code
know that?
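
To see the interpreter's side of this, here is a small sketch (an illustration, not part of the original post) that counts how many blank-delimited tokens the text interpreter would find, using the Forth-2012 word PARSE-NAME. The name TOKENS is made up for the example.

```forth
decimal

\ Count the blank-delimited tokens in the rest of the current input source.
\ TOKENS is a hypothetical helper name, not a standard word.
: tokens ( "text" -- n )
  0 begin parse-name nip while 1+ repeat ;

\ EVALUATE makes the string the input source, so TOKENS sees only this text.
s" tokens 284 985 000 234,23" evaluate .   \ prints 4
```

With ordinary spaces between the digit groups, the parser delivers four separate strings to the number converter, so nothing downstream can reassemble them into one number.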

- anton

From Rick C@21:1/5 to Zbig on Mon Jul 25 10:02:02 2022

On Monday, July 25, 2022 at 12:10:23 PM UTC-4, Zbig wrote:

Like this:
284 985 000 234,23

Ok, but that would not be seen by Forth as a single number. Do you get that's what the thread is about? Finding a way of notating thousand separators that is both machine readable and human recognizable? Or maybe I've missed the point?

--

Rick C.

From Zbig@21:1/5 to All on Mon Jul 25 14:29:03 2022

Great. So how should a Forth text interpreter know that this is one
number, not four? And how should a human reading this as Forth code
know that?

That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
On the screen it looks like „ordinary” spaces — exactly, like in case of „ordinary space” and „non-breakable space” (in case of text editor).

From Zbig@21:1/5 to All on Mon Jul 25 15:03:39 2022

Actually employing VT could have another advantage: consider all
these „hyphenated words”. They wouldn't have to be hyphenated
any longer. Instead of „pseudo space” VT could „link” two strings
that comprise such word — making it look more natural.

From dxforth@21:1/5 to Zbig on Tue Jul 26 13:16:19 2022

On 26/07/2022 08:03, Zbig wrote:

Actually employing VT could have another advantage: consider all
these „hyphenated words”. They wouldn't have to be hyphenated
any longer. Instead of „pseudo space” VT could „link” two strings that comprise such word — making it look more natural.

A word-processor too ...

Is there anything Forth can't do? :)

From dxforth@21:1/5 to dxforth on Tue Jul 26 13:41:26 2022

On 25/07/2022 14:07, dxforth wrote:

: strip ( c-addr u -- c-addr2 u2 )
  <# 2dup 1- over + do
    i c@ [char] _ over - if hold else drop then
  -1 +loop #> ;

Won't work on zero-length strings but irrelevant here.

Cheaper and without the quirk:

: strip ( c-addr u -- c-addr2 u2 )
  <# begin dup while
    1- 2dup + c@ [char] _ over - if hold else drop then
  repeat #> ;

In DX-Forth cheaper still is:

: strip ( c-addr u -- c-addr2 u2 )
  <# begin dup while
    1- 2dup + c@ [char] _ of else hold then
  repeat #> ;
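
For readers on other systems, the standard-Forth definition above (the one using IF ... ELSE ... THEN) can be tried roughly as follows. This is a sketch, not DX-Forth's code: the underscore test is rewritten with = for readability, and U_>NUMBER is a made-up name that simply feeds the stripped string to the standard >NUMBER. It relies on the same trick as the post: #> discards the two leftover cells (the address and the zero count) and returns the string built by HOLD.

```forth
decimal

\ Copy the string into the HOLD buffer, last character first, skipping "_".
: strip ( c-addr u -- c-addr2 u2 )
  <# begin dup while
    1- 2dup + c@ dup [char] _ = if drop else hold then
  repeat #> ;

\ Hypothetical helper: unsigned double from a string that may contain "_".
\ flag is true if every remaining character was a valid digit.
: u_>number ( c-addr u -- ud flag )
  strip 0 0 2swap >number nip 0= ;

s" 1_000_000" u_>number . d.   \ prints -1 1000000
```

The stripped string lives in the HOLD buffer, so it has to be converted before any numeric output word is used.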

From Rick C@21:1/5 to Zbig on Tue Jul 26 00:05:46 2022

On Monday, July 25, 2022 at 6:03:41 PM UTC-4, Zbig wrote:

Actually employing VT could have another advantage: consider all
these „hyphenated words”. They wouldn't have to be hyphenated
any longer. Instead of „pseudo space” VT could „link” two strings that comprise such word — making it look more natural.

So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?

Why can't you see the issues this would cause???

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

--

Rick C.

From Anton Ertl@21:1/5 to Zbig on Tue Jul 26 06:57:32 2022

Zbig <zbigniew2011@gmail.com> writes:

Great. So how should a Forth text interpreter know that this is one
number, not four? And how should a human reading this as Forth code
know that?

That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
On the screen it looks like „ordinary” spaces — exactly, like in case of
„ordinary space” and „non-breakable space” (in case of text editor).

So how should a human reading this as Forth code know that

284 985 000 234,23

is one number, not four?

Apart from that, reality check:

Here's what is displayed by xterm for a VT:

s\" 123\v456" cr type
123
456 ok

And here's what xterm gives me when I input a Shift-Space:

key cr .
32 ok

That's not the ASCII code for vt, it's the ASCII code for Space.

- anton

From Anton Ertl@21:1/5 to Zbig on Tue Jul 26 07:05:31 2022

Zbig <zbigniew2011@gmail.com> writes:

Actually employing VT could have another advantage: consider all
these „hyphenated words”. They wouldn't have to be hyphenated
any longer. Instead of „pseudo space” VT could „link” two strings
that comprise such word — making it look more natural.

Again, how should a human see the difference between

unused-words

and

unused words

if you replace the "-" by something that looks like a space?

- anton

From Zbig@21:1/5 to All on Tue Jul 26 02:51:30 2022

So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?

No.

Why can't you see the issues this would cause???

What issues — in particular?

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.

From Zbig@21:1/5 to All on Tue Jul 26 02:55:08 2022

Again, how should a human see the difference between

unused-words

and

unused words

if you replace the "-" by something that looks like a space?

Sometimes it may create a problem indeed, but taking a peek
into glossary usually should help.

From albert@21:1/5 to zbigniew2011@gmail.com on Tue Jul 26 12:37:43 2022

In article <d0a77c03-31f3-4ed7-a94a-f908fa7c4c7fn@googlegroups.com>,
Zbig <zbigniew2011@gmail.com> wrote:

Again, how should a human see the difference between

unused-words

and

unused words

if you replace the "-" by something that looks like a space?

Sometimes it may create a problem indeed, but taking a peek
into glossary usually should help.

Seriously?
It makes as much sense as for Republicans to ban condoms
because they want fewer abortions.

Groetjes

From Rick C@21:1/5 to Zbig on Tue Jul 26 09:24:51 2022

On Tuesday, July 26, 2022 at 5:51:32 AM UTC-4, Zbig wrote:

So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?

No.

Why can't you see the issues this would cause???

What issues — in particular?

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.

That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

Why can't you grasp this fail?

--

Rick C.

From Zbig@21:1/5 to All on Tue Jul 26 09:31:34 2022

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.

That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

Why can't you grasp this fail?

1. You wrote about text interpreter -- did you mean 'human' or Forth?
Forth won't have any problem, it'll find VT there.

2. If you mean human: if you want the others to understand you, you
have to be precise in your statements. So it's enough to separate two
numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise,
that's all.

From Rick C@21:1/5 to Zbig on Tue Jul 26 13:06:33 2022

On Tuesday, July 26, 2022 at 12:31:35 PM UTC-4, Zbig wrote:

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.

That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

Why can't you grasp this fail?

1. You wrote about text interpreter -- did you mean 'human' or Forth?
Forth won't have any problem, it'll find VT there.

Yes, but you then had to ask what I typed, showing the shortcoming, that a human can't tell. That was my point... unless you are not a human after all.

2. If you mean human: if you want the others to understand you, you
have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.

Yes, you don't understand. That's the point.

--

Rick C.

From minforth@arcor.de@21:1/5 to gnuarm.del...@gmail.com on Tue Jul 26 13:51:24 2022

gnuarm.del...@gmail.com schrieb am Dienstag, 26. Juli 2022 um 22:06:34 UTC+2:

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

Very helpful. ;o)

From minforth@arcor.de@21:1/5 to gnuarm.del...@gmail.com on Tue Jul 26 15:09:23 2022

gnuarm.del...@gmail.com schrieb am Mittwoch, 27. Juli 2022 um 00:02:55 UTC+2:

On Tuesday, July 26, 2022 at 4:51:25 PM UTC-4, minf...@arcor.de wrote:

gnuarm.del...@gmail.com schrieb am Dienstag, 26. Juli 2022 um 22:06:34 UTC+2:

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

Very helpful. ;o)

So if you had a few spaces (not vertical tabs) in your coordinate, 38° 17′ 10″ N 76° 24′ 42″ W, I believe Forth would read the number 38, then treat ° as a word, no? I suppose ' would be a problem, since that is already in use. ", however, is not in use, so that would be good. I suppose if you were looking for coordinates in text, you could redefine ' for a bit, then restore it to mean "tick". Or do I not understand how numbers are read?

The issue is that real world number inputs often require a real parser.
Single hidden or visible separators won't do the job.

Modern Forths offer recognizers or something similar to do it.

From Zbig@21:1/5 to All on Tue Jul 26 15:14:32 2022

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.

That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

Why can't you grasp this fail?

1. You wrote about text interpreter -- did you mean 'human' or Forth? Forth won't have any problem, it'll find VT there.

Yes, but you then had to ask what I typed, showing the shortcoming, that a human can't tell. That was my point... unless you are not a human after all.

If you write something like this: 001_002_003 004 -- I'll also have to ask you what you actually typed.
It doesn't depend on the selected separator character.

2. If you mean human: if you want the others to understand you, you
have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

Maybe now it's the time for me to ask a question — you have already made
a fair use out of your question quota: does your Forth interpreter — and/or your computer screen — „compress” spaces like Google News interface?
Or it doesn't?

I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.

Yes, you don't understand. That's the point.

Never understood the people that insist on looking for the problems where
there aren't any. I'm not a psychologist, you know, so I don't have to.

From Rick C@21:1/5 to minf...@arcor.de on Tue Jul 26 15:02:54 2022

On Tuesday, July 26, 2022 at 4:51:25 PM UTC-4, minf...@arcor.de wrote:

gnuarm.del...@gmail.com schrieb am Dienstag, 26. Juli 2022 um 22:06:34 UTC+2:

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

Very helpful. ;o)

So if you had a few spaces (not vertical tabs) in your coordinate, 38° 17′ 10″ N 76° 24′ 42″ W, I believe Forth would read the number 38, then treat ° as a word, no? I suppose ' would be a problem, since that is already in use. ", however,
is not in use, so that would be good. I suppose if you were looking for coordinates in text, you could redefine ' for a bit, then restore it to mean "tick". Or do I not understand how numbers are read?

--

Rick C.

From Zbig@21:1/5 to All on Tue Jul 26 15:27:03 2022

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

Maybe this will explain some more to you:

DPUSH: PUSH DX
APUSH: PUSH AX
NEXT: LODSW
MOV BX,AX
NEXT1: MOV DX,BX
INC DX
JMP [BX]

Pretty deformatted, right?
So I guess you'll insist on using underscore by macroassemblers,
instead of spaces and tabs — while I don't care.

From Rick C@21:1/5 to Zbig on Tue Jul 26 20:13:25 2022

On Tuesday, July 26, 2022 at 6:14:33 PM UTC-4, Zbig wrote:

There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.

001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.

That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!

Why can't you grasp this fail?

1. You wrote about text interpreter -- did you mean 'human' or Forth? Forth won't have any problem, it'll find VT there.

Yes, but you then had to ask what I typed, showing the shortcoming, that a human can't tell. That was my point... unless you are not a human after all.

If you write something like this: 001_002_003 004 -- I'll also have to ask you
what you actually typed.
It doesn't depend on the selected separator character.

I don't follow. If the convention in use is to separate thousands with the underscore, it is clear what the numbers are, 1002003 and 4. I don't follow your thinking here.

2. If you mean human: if you want the others to understand you, you
have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

Ok, how many spaces did I type to separate these digits?

0123 4567 8901 2345

Maybe now it's the time for me to ask a question — you have already made
a fair use out of your question quota: does your Forth interpreter — and/or
your computer screen — „compress” spaces like Google News interface? Or it doesn't?

It has been more than once I've copied programs from Google Groups. I also use a text editor that will replace spaces with tab characters when the alignment is right. That's why I mentioned previously that special editors would be needed. I've seen
few editors that will allow you to enter a vertical tab character.

I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.

Yes, you don't understand. That's the point.

Never understood the people that insist on looking for the problems where there aren't any. I'm not a psychologist, you know, so I don't have to.

Your "solution" is a problem in solution clothing.

--

Rick C.

From dxforth@21:1/5 to Zbig on Wed Jul 27 14:22:46 2022

On 27/07/2022 02:31, Zbig wrote:

I honestly don't understand why you are putting so much effort into creating a problem out of nothing.

My thoughts too. Underscore in numbers as a *programmer* convenience is
on the increase and causes no compatibility issue in Forth (AFAIK).
The only control characters I ever want to see in source code are line-ends
and TABs. I'd rather not have to deal with TABs but I'll put up with them.

From Jan Coombs@21:1/5 to Zbig on Mon Aug 1 11:42:18 2022

XPost: Jan Coombs <jan4comp.lang.forth@murray-microft.co.uk>

On Mon, 25 Jul 2022 14:29:03 -0700 (PDT)
Zbig <zbigniew2011@gmail.com> wrote:

Great. So how should a Forth text interpreter know that this is one number, not four? And how should a human reading this as Forth code
know that?

That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
On the screen it looks like „ordinary” spaces — exactly, like in case of
„ordinary space” and „non-breakable space” (in case of text editor).

Entering shift-space into gforth and python3:

$ gforth
Gforth 0.7.9_20201217
Authors: Anton Ertl, Bernd Paysan, Jens Wilke et al., for more type `authors'
Copyright © 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `help' for basic help
key . 32 ok
ekey . 32 ok

$ python3
Python 3.7.3 (default, Jan 22 2021, 20:04:44)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> i=input(); i, ord(i)

(' ', 32)

So it seems that more work is involved in demonstrating this proposal.
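
For comparison, here is a minimal Python sketch of what a text interpreter would have to do if an editor really did deliver VT (code 11) rather than 32. The names `split_on_space_only` and `vt_number` are made up for this illustration; nothing here describes how Gforth or any existing Forth parses its input.

```python
VT = "\x0b"  # vertical tab, code 11; the Shift-Space test above produced 32


def split_on_space_only(line):
    """Split a line on the ASCII space only, so a VT stays inside its token."""
    return [tok for tok in line.split(" ") if tok]


def vt_number(token):
    """Drop VT group separators, then convert the remaining digits."""
    return int(token.replace(VT, ""))


line = "284" + VT + "985" + VT + "000 234"

# Python's default split() treats VT as whitespace, so it sees four items.
print(line.split())

# Splitting on the space only keeps the grouped number together: two items.
print([vt_number(tok) for tok in split_on_space_only(line)])
```

The parser therefore has to stop treating VT as a delimiter before the proposal can work at all, which is a change to the interpreter on top of the editor and keyboard changes.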

Jan Coombs


From Jan Coombs@21:1/5 to Zbig on Mon Aug 1 11:51:56 2022

On Mon, 25 Jul 2022 09:10:22 -0700 (PDT)
Zbig <zbigniew2011@gmail.com> wrote:

Like this:
284 985 000 234,23

or '284 985 000 234.23' depending on locale?

'284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.

Jan Coombs


From Zbig@21:1/5 to All on Mon Aug 1 04:48:29 2022

Like this:
284 985 000 234,23

or '284 985 000 234.23' depending on locale?

'284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.

I was trying to explain, that there are EXACTLY THE SAME „problems
to resolve” whether you connect the 3-digits groups with underscore,
or with VT — but in latter case it just... looks better.


From S Jack@21:1/5 to All on Mon Aug 1 10:03:26 2022

go
( note: DCX is alias for DECIMAL )
--
-- Formatted numeric output
--
+ofmt \ disable format of numeric output
i. dcx 256 hex . ==> 100
i. dcx 256. hex d. ==> 100
i. dcx -1 hex . ==> -1
i. dcx -1 hex u. ==> FFFFFFFF
i. dcx -1. hex ud. ==> FFFFFFFFFFFFFFFF
-ofmt \ enabled format of numeric output
i. dcx 256 hex . ==> 0x0100
i. dcx 256. hex d. ==> 0x0100
i. dcx -1 hex . ==> -0x0001
i. dcx -1 hex u. ==> 0xFFFF_FFFF
i. dcx -1. hex ud. ==> 0xFFFF_FFFF_FFFF_FFFF
--
-- Comma, underscore, and/or period separators in numeric input
--
+ofmt
dcx
i. 123456789 . ==> 123456789
i. 1,234,567,89 . ==> 123456789
i. 1_234,567_89 . ==> 123456789
i. 1,234_567,89 . ==> 123456789
-ofmt
i. 123456789 . ==> 123,456,789
i. 1,234,567,89 . ==> 123,456,789
i. 1_234,567_89 . ==> 123,456,789
i. 1,234_567,89 . ==> 123,456,789
+ofmt
i. 1234567.89 d. ==> 123456789
i. 1,234,567.89 d. ==> 123456789
i. 1_234_567.89 d. ==> 123456789
i. 1,234_567.89 d. ==> 123456789
i. 1.234.567.89 d. ==> 123456789
-ofmt
i. 1234567.89 d. ==> 123,456,789
i. 1,234,567.89 d. ==> 123,456,789
i. 1_234_567.89 d. ==> 123,456,789
i. 1,234_567.89 d. ==> 123,456,789
i. 1.234.567.89 d. ==> 123,456,789
+ofmt
i. dcx 123456789 hex . ==> 75BCD15
i. dcx 1,234,567,89 hex . ==> 75BCD15
i. dcx 1_234,567_89 hex . ==> 75BCD15
i. dcx 1,234_567,89 hex . ==> 75BCD15
-ofmt
i. dcx 123456789 hex . ==> 0x075B_CD15
i. dcx 1,234,567,89 hex . ==> 0x075B_CD15
i. dcx 1_234,567_89 hex . ==> 0x075B_CD15
i. dcx 1,234_567,89 hex . ==> 0x075B_CD15
+ofmt
i. dcx 1234567.89 hex d. ==> 75BCD15
i. dcx 1,234,567.89 hex d. ==> 75BCD15
i. dcx 1_234_567.89 hex d. ==> 75BCD15
i. dcx 1,234_567.89 hex d. ==> 75BCD15
i. dcx 1.234.567.89 hex d. ==> 75BCD15
-ofmt
i. dcx 1234567.89 hex d. ==> 0x075B_CD15
i. dcx 1,234,567.89 hex d. ==> 0x075B_CD15
i. dcx 1_234_567.89 hex d. ==> 0x075B_CD15
i. dcx 1,234_567.89 hex d. ==> 0x075B_CD15
i. dcx 1.234.567.89 hex d. ==> 0x075B_CD15
--
-- Field and format
--
25 fld ! \ field width 25 characters
+ofmt
i. dcx 123456789 x. ==> 123456789
i. dcx 123456789 hex x. ==> 75BCD15
i. dcx 123456789. hex dx. ==> 75BCD15.
i. dcx 12345.6789 hex dx. ==> 75B.CD15
i. dcx -1 hex ux. ==> FFFFFFFF
i. dcx -1. hex udx. ==> FFFFFFFFFFFFFFFF.
-ofmt
i. dcx 123456789 x. ==> 123,456,789
i. dcx 123456789 hex x. ==> 0x075B_CD15
i. dcx 123456789. hex dx. ==> 0x075B_CD15.
i. dcx 12345.6789 hex dx. ==> 0x075B.CD15
i. dcx -1 hex ux. ==> 0xFFFF_FFFF
i. dcx -1. hex udx. ==> 0xFFFF_FFFF_FFFF_FFFF.

-fin-
--
me
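
The input half of the log above can be approximated in a few lines. This is a Python sketch only, assuming that comma and underscore are simply ignored and that a period anywhere marks a double number (as the `d.` lines suggest). `parse_punctuated` is a hypothetical name, not a word from the poster's system.

```python
def parse_punctuated(text, base=10):
    """Return (value, is_double) for a number that may contain , _ or . marks.

    Assumptions of this sketch: ',' and '_' are ignored; any '.' is ignored
    as well but flags the result as a double number.
    """
    is_double = "." in text
    digits = text.replace(",", "").replace("_", "").replace(".", "")
    return int(digits, base), is_double


for sample in ("1,234,567,89", "1_234,567_89", "1.234.567.89"):
    value, is_double = parse_punctuated(sample)
    print(sample, "->", value, format(value, "X"), "double" if is_double else "single")
```

Because the separators are discarded wherever they occur, oddly placed ones such as `1,234,567,89` are accepted, just as in the log.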


From Rick C@21:1/5 to Zbig on Mon Aug 1 11:50:40 2022

On Monday, August 1, 2022 at 7:48:30 AM UTC-4, Zbig wrote:

Like this:
284 985 000 234,23

or '284 985 000 234.23' depending on locale?

'284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.

I was trying to explain, that there are EXACTLY THE SAME „problems
to resolve” whether you connect the 3-digits groups with underscore,
or with VT — but in latter case it just... looks better.

There are two HUGE differences in the two proposals. When you use an underscore, every editor in the world will work with that, while some editors may not work with the VT character. The other is that when looking at code, humans can't tell the difference between multiple numbers and a single number. Since VT gives the appearance of a space, there's no way for a human to tell what is in the code.

This would make Forth the ultimate write only language.

--

Rick C.


From P Falth@21:1/5 to dxforth on Mon Aug 22 06:59:52 2022

On Saturday, 23 July 2022 at 08:24:09 UTC+2, dxforth wrote:

32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?

Usual suspects pre-answered.

Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even
XPL0 has it.

A. ANS didn't see the need for it.
Q. Are you married?

Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.

Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.

Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.

Q. Won't the HOLD buffer need to be larger to hold the punctuation?
A. Assuming worst case and one underscore per 4 characters, 20% larger.

Q. Is all this just c.l.f. speculation - or have you implemented it?
A. Implemented

Q. Has it broken anything?
A. Not AFAIK

Q. What did it cost?
A. 34 bytes on 8086, 39 bytes on 8080

Q. Can't it be done using recognizers?
A. If so, probably at more cost.

Q. Will you keep it?
A. Good question. For 16-bit integers its value may be marginal. How often do you enter values in binary?

I got interested in this suggestion and implemented it.
I thought the underscore was a bit ugly so implemented a word to set the grouping char

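: SET-GROUPING-CHAR ( xchar -- )
   0 grping !                  \ clear the stored grouping char
   dup 32 > and                \ keep xchar only if above 32 (space), else 0
   grping xc!+ drop ;          \ store it as an xchar, discard the end address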

I also set the grouping different based on BASE.
Decimal and octal group 3 digits
Hex 4 and binary 8.
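
The base-dependent grouping just described (3 digits for decimal and octal, 4 for hex, 8 for binary) can be sketched in Python as follows. `GROUP_SIZE` and `group_digits` are illustrative names, and the sketch covers non-negative values only; it is not the code used in the system described here.

```python
GROUP_SIZE = {10: 3, 8: 3, 16: 4, 2: 8}   # digits per group for each base
FORMAT_SPEC = {10: "d", 8: "o", 16: "X", 2: "b"}


def group_digits(value, base=10, sep="_"):
    """Format a non-negative integer in `base`, grouped from the right."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    digits = format(value, FORMAT_SPEC[base])
    size = GROUP_SIZE[base]
    groups = []
    while digits:
        groups.append(digits[-size:])   # peel one group off the right end
        digits = digits[:-size]
    return sep.join(reversed(groups))


print(group_digits(123456789))                  # 123_456_789
print(group_digits(123456789, sep="\u00b4"))    # 123´456´789
print(group_digits(0x75BCD15, base=16))         # 75B_CD15
print(group_digits(256, base=2))                # 1_00000000
```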

After that I started testing different chars. Today I use ´ ( $B4 acute accent)
I think that ties the numbers together while _ puts them apart

123´456´789 ok.
. 123´456´789 ok
'_' set-grouping-char ok
123_456_789 ok.
. 123_456_789 ok

I also tried out the space as suggested by Zbig but not using VT.
At codepoint $A0 there is a non breaking space char

$a0 set-grouping-char ok
123456789 ok.
. 123 456 789 ok

It gets more difficult to input without remapping a key.
´ is nice as it is (on my Swedish keyboard) next to the + key on the top row; no shift or alt key is needed to input it.

But using the non breaking space I can now make words with spaces in them!

: Hej Peter ." Ciao Peter" ; ok
Hej Peter Ciao Peter ok

This of course looks even more confusing than spaces in numbers!

For me this improves readability enormously! Thanks for the suggestion.

BR
Peter


From minforth@arcor.de@21:1/5 to P Falth on Mon Aug 22 07:44:00 2022

P Falth wrote on Monday, 22 August 2022 at 15:59:54 UTC+2:


Fine! I am just wondering if ´, i.e. $B4, is the same in most codepages/locales.


From Zbig@21:1/5 to All on Mon Aug 22 10:34:08 2022

: Hej Peter ." Ciao Peter" ; ok
Hej Peter Ciao Peter ok

This of course looks even more confusing than spaces in numbers!

It may look confusing in your simplistic example: pasted like this, it is
indeed difficult to tell what is a Forth word and what is the effect of its
execution. Still, it doesn't have to look confusing in a real program / Forth screen.


From P Falth@21:1/5 to minf...@arcor.de on Mon Aug 22 13:50:53 2022

On Monday, 22 August 2022 at 16:44:02 UTC+2, minf...@arcor.de wrote:


Fine! I am just wondering if ´, i.e. $B4, is the same in most codepages/locales.

My systems require input to be utf8 encoded Unicode and will output utf8 streams.
It has worked for over 20 years like that on both Windows and Linux.
´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
Is there any reason to not use Unicode and utf8 today on Windows and Linux?
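
Note that $B4 is the code point; the bytes a system actually sees depend on the encoding. A short Python check (the variable names are arbitrary):

```python
acute = "\u00b4"   # ACUTE ACCENT, the grouping character discussed above
nbsp = "\u00a0"    # NO-BREAK SPACE, the other candidate

for codec in ("latin-1", "cp1252", "utf-8"):
    print(codec, acute.encode(codec).hex(), nbsp.encode(codec).hex())
```

In Latin-1 and Windows-1252 each character is the single byte B4 or A0, whereas in UTF-8 it becomes the two-byte sequence C2 B4 or C2 A0. A source file is therefore only portable between the two worlds if the encoding is agreed on.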


From dxforth@21:1/5 to P Falth on Tue Aug 23 13:21:30 2022

On 23/08/2022 06:50, P Falth wrote:

...
My systems require input to be utf8 encoded Unicode and will output utf8 streams.
It has worked for over 20 years like that on both Windows and Linux.
´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
Is there any reason to not use Unicode and utf8 today on Windows and Linux?

String literals and comment fields excepted, there's not a lot of reason to
use UTF-8 in programming code.

Underscore in numbers is about convention. Several programming languages have adopted it as a programmer convenience. It might bemuse other languages to know Forth had no problem giving comma et al new meanings but drew the line at underscore.
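
Python is one example of that convention: since version 3.6 it accepts underscores both in numeric literals and in strings passed to `int()`, but only singly and between digits. A small check, in which `accepts` is a helper written for this illustration:

```python
def accepts(text, base=10):
    """Return True if int() parses `text` in the given base."""
    try:
        int(text, base)
        return True
    except ValueError:
        return False


print(284_985_000_234 == 284985000234)      # literal with underscores
print(int("FFFF_FFFF", 16))                 # string input, hex
print(accepts("1_000"), accepts("1__000"), accepts("1000_"))
```

A doubled or trailing underscore is rejected there, which is stricter than an approach that simply ignores the character wherever it appears.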


From none) (albert@21:1/5 to peter.m.falth@gmail.com on Tue Aug 23 12:23:18 2022

In article <8619d421-e8b5-4075-8dec-3813a60f6f8cn@googlegroups.com>,
P Falth <peter.m.falth@gmail.com> wrote:


My systems require input to be utf8 encoded Unicode and will output utf8 streams.
It has worked for over 20 years like that on both Windows and Linux.
´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
Is there any reason to not use Unicode and utf8 today on Windows and Linux?

There is a good reason to junk { BL WORD } in favor of TOKEN / NAME or whatever.

NAME ( -- addr n ) get a blank surrounded token from the input stream
with appropriate side effects on the input stream.

The only requirement that the lookup -- SEARCH-LIST -- has to meet is finding this string, whatever its content.
Then the encoding of the characters shouldn't be a concern of the Forth system.
In fact I have used escape sequences as Forth words. The action is whatever the function keys that generate those codes are supposed to do.

The talk about character encodings can be separated from dictionary and
word lookups.
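
A minimal Python sketch of that separation, assuming only the ASCII space acts as a delimiter. `parse_name` and the `words` table are hypothetical stand-ins for NAME and the dictionary, not code from any Forth system.

```python
def parse_name(buf, pos):
    """Return (token, new_pos): the next space-delimited token in `buf`.

    `buf` is a bytes object; only the ASCII space (32) is a delimiter here,
    which is a simplification made for this sketch.
    """
    while pos < len(buf) and buf[pos] == 32:
        pos += 1                      # skip leading spaces
    start = pos
    while pos < len(buf) and buf[pos] != 32:
        pos += 1                      # scan to the next space or end of buffer
    return buf[start:pos], pos


# Word list keyed by raw bytes; the lookup never decodes or inspects them.
words = {
    "Hej\u00a0Peter".encode("utf-8"): "greeting",
    b"\x1b[A": "cursor-up",           # an escape sequence used as a word name
}

source = b"  " + "Hej\u00a0Peter".encode("utf-8") + b" \x1b[A"
token, pos = parse_name(source, 0)
print(words[token])
token, pos = parse_name(source, pos)
print(words[token])
```

The lookup is a plain byte-string comparison, so a name containing a no-break space or an escape sequence is found like any other.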

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


From Anton Ertl@21:1/5 to albert@cherry. on Tue Aug 23 10:40:24 2022

albert@cherry.(none) (albert) writes:

There is a good reason to junk { BL WORD } in favor of TOKEN / NAME or whatever.

NAME ( -- addr n ) get a blank surrounded token from the input stream
with appropriate side effects on the input stream.

PARSE-NAME has been standardized: <https://forth-standard.org/standard/core/PARSE-NAME>

Then encoding of the characters shouldn't be a concern of the Forth system.

UTF-8 worked nicely in the systems I tried that were not designed for
it, with two exceptions: Editing on the command line did not work
properly; and pointing out the error on a line did not work properly.
Parsing worked fine.

The virtue of UTF-8 is that it works well for most code that is
written for handling ASCII, and that's what we see.
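
A plausible reason for the two exceptions (an assumption; the post does not give one) is that a byte-oriented system counts bytes where the terminal shows characters. A short Python illustration with an arbitrary sample line:

```python
line = "123\u00b4456 foo"            # one acute accent, then some word "foo"
raw = line.encode("utf-8")

byte_offset = raw.index(b"foo")      # position a byte-oriented parser would report
char_column = line.index("foo")      # column where the word appears on screen

print(byte_offset, char_column)
```

The two positions differ by one for every two-byte character to the left, so an error marker or cursor placed by byte count lands too far to the right, while the parsing itself is unaffected.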

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net


From Bernd Linsel@21:1/5 to dxforth on Tue Aug 23 17:46:52 2022

On 23.08.2022 05:21, dxforth wrote:

On 23/08/2022 06:50, P Falth wrote:

... My systems require input to be utf8 encoded Unicode and will
output utf8 streams. It has worked for over 20 years like that on
both Windows and Linux. 'at $B4 is present in Windows 1252 and
Linux Latin 1 codepages. Is there any reason to not use Unicode and
utf8 today on Windows and Linux?

String literals and comment fields excepted, there's not a lot of
reason to use UTF-8 in programming code.

Underscore in numbers is about convention. Several programming
languages have adopted it as a programmer convenience. It might
bemuse other languages to know Forth had no problem giving comma et
al new meanings but drew the line at underscore.

I really do like writing literals in sources in UTF-8, since my system
fully supports it and has not the faintest will to use antiquated or
strange things like CP1252, ISO-8859-xxx or UTF-16. But one quickly gets
used to writing sources in ASCII with hex escapes again when
collaborating with Windows people who are not willing or able to save
edited files as UTF-8, so that all your special characters are lost every
time one of these moron^H^H^H^H^Hfolks changes something. For me, these
are especially measurement units containing characters like U+00B0
(degree sign), U+00B5 (micro sign, the mu used as the micro prefix) and
U+202F (narrow no-break space between value and measurement unit).


From minforth@arcor.de@21:1/5 to Bernd Linsel on Tue Aug 23 09:50:12 2022

Bernd Linsel wrote on Tuesday, 23 August 2022 at 17:46:57 UTC+2:


Not wanting to contradict, but lots of Forth programs run on small systems where UTF-8 is not present, even when the programs are developed on feature-rich desktops.


From Marcel Hendrix@21:1/5 to none albert on Tue Aug 23 10:11:53 2022

On Tuesday, August 23, 2022 at 12:23:22 PM UTC+2, none albert wrote:
[..]

123麓456麓789 ok.
. 123麓456麓789 ok

[..]

麓at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
Is there any reason to not use Unicode and utf8 today on Windows and Linux?

According to the quoted stuff, quite a few :--)

-marcel
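
The 麓 in the quoted lines is consistent with the two UTF-8 bytes of ´ having been decoded as GBK somewhere along the way. That is an inference from the byte values, not something the thread states. A short Python check:

```python
acute = "\u00b4"                     # ACUTE ACCENT
raw = acute.encode("utf-8")          # the two bytes C2 B4

print(raw.hex())
print(raw.decode("gbk"))             # the same two bytes read as GBK
```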


From Anton Ertl@21:1/5 to minf...@arcor.de on Wed Aug 24 07:38:19 2022

"minf...@arcor.de" <minforth@arcor.de> writes:

Not wanting to contradict, but lots of Forth programs run on small systems where UTF-8 is not present, even when the programs are developed on feature-rich desktops.

UTF-8-encoded strings are just sequences of bytes for nearly all the
code that deals with it. That's why it works so well for code that
has been written for ASCII; that's also just bytes. So a small system
has no problem dealing with UTF-8, and a statement like "UTF-8 is not
present" makes little sense.
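
The property behind this can be checked directly. A Python sketch with an arbitrary sample string: every byte of a non-ASCII character has its high bit set, so byte-oriented code that looks for ASCII delimiters never trips over it.

```python
text = "123\u00b4456 \u00b5m \u00b0C"   # acute accent, micro sign, degree sign
raw = text.encode("utf-8")

# Collect the bytes that belong to non-ASCII characters.
non_ascii_bytes = [b for ch in text if ord(ch) > 127 for b in ch.encode("utf-8")]
print(all(b >= 0x80 for b in non_ascii_bytes))

# Splitting the raw bytes on the ASCII space yields the same tokens as
# splitting the decoded text.
byte_tokens = [tok.decode("utf-8") for tok in raw.split(b" ")]
print(byte_tokens == text.split(" "))
```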

- anton

