Q. Won't the HOLD buffer need to be larger to hold the punctuation?
A. Assuming worst case and one underscore per 4 characters, 20% larger.
On Saturday, July 23, 2022 at 8:24:09 AM UTC+2, dxforth wrote:
> 32/64-bit machines have increased the risk of entering numbers incorrectly.
> Should the Forth interpreter be allowed to ignore certain punctuation e.g.
> underscore in numbers? What would be the issues?[..]
> Usual suspects pre-answered.
> Q. Why the underscore character?
> A. It's not one of the characters Forth Inc uses to denote a double number.
> It's increasingly used in programming languages for this purpose. Even
> XPL0 has it.
> Q. Should >NUMBER process the underscore?
> A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.
> Q. Has it broken anything?[..]
> A. Not AFAIK

What exactly is your idea?
"... certain punctuation e.g. underscore ... "
I guess you are talking about integer single precision, i.e. you want
_1000, 1_000, 10_00, 100_0, 1000_, _1__0_0_0____ all to map to 1000
in the current BASE? This_is_dead_beef ?
When >NUMBER doesn't handle it, how does it get recognized as an
integer by the rest of the system? Why not have the application filter
it when it wants to support this?
>NUMBER is carefully designed to be interruptible.
It could handle extra characters, e.g. a traditional use of
-marcel
32/64-bit machines have increased the risk of entering numbers incorrectly.
Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers?
Usual suspects pre-answered.
Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number.
It's increasingly used in programming languages for this purpose. Even
XPL0 has it.
Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.
Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.
Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.
Q. Can't it be done using recognizers?
A. If so, probably at more cost.
Q. Will you keep it?
A. Good question. For 16-bit integers its value may be marginal. How often
do you enter values in binary?
> 32/64-bit machines have increased the risk of entering numbers incorrectly.
> Should the Forth interpreter be allowed to ignore certain punctuation e.g.
> underscore in numbers? What would be the issues?

No such risk in case of underscore; to enter an underscore character one has
to press Shift-Minus, so it can only be done on purpose.
I believe one character that could be ignored the way you propose is space.
When entering long numbers it may be comfortable to, for example, separate
thousands by adding a single space between them. It's easier to check the input
before the final Enter-press.

> I believe one character that could be ignored the way you propose is space.
> When entering long numbers it may be comfortable to, for example, separate
> thousands by adding a single space between them. It's easier to check the input
> before the final Enter-press.

You are timidly entering the gritty realm of locales ... ;-)
dxforth <dxforth@gmail.com> writes:
> ...
> Q. Then you'll need a routine to strip the underscores and a temporary buffer
> to hold the result. What do you suggest?
> A. The HOLD buffer.

No such buffer is needed. That's the beauty of >NUMBER, which has
been designed for a very similar use case:
: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    \ like >number, but ignores _
    begin
        >number
        dup 0> while
            over c@ '_' = while
                1 /string
    repeat then ;
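
As a usage sketch only: assuming a Forth-2012 system (for the '_' character-literal syntax and /STRING), the word above can be wrapped so that a whole string is converted in one call. The names S>UD_ and DEMO are hypothetical and do not come from the thread.

```forth
\ The definition from the post, repeated so the example is self-contained.
: >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    begin
        >number
        dup 0> while
            over c@ '_' = while
                1 /string
    repeat then ;

\ Hypothetical wrapper: flag is true if every character was consumed.
: s>ud_ ( c-addr u -- ud flag )
    0. 2swap >number_ nip 0= ;

: demo ( -- )
    s" 1_000_000" s>ud_ if d. else 2drop ." not a number" then ;

demo   \ prints 1000000
```

Note that this sketch accepts underscores anywhere, including leading, trailing and repeated ones, which is exactly the behaviour questioned earlier in the thread.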
> Q. Won't it interfere with numeric output?
> A. Input/output are usually mutually exclusive.

Says who?

> Q. Can't it be done using recognizers?
> A. If so, probably at more cost.

What makes you think so?
It's the potential for underscored
numbers to collide with dictionary entries that's the problem.
On 23/07/2022 9:24, dxforth wrote:
> 32/64-bit machines have increased the risk of entering numbers incorrectly.
> Should the Forth interpreter be allowed to ignore certain punctuation e.g.
> underscore in numbers? What would be the issues?
I implemented underscores-in-numbers a while back in 8th, at no
perceivable cost. Makes large numbers much easier to understand.
On 24/07/2022 14:26, Ron AARON wrote:
> On 23/07/2022 9:24, dxforth wrote:
>> 32/64-bit machines have increased the risk of entering numbers incorrectly.
>> Should the Forth interpreter be allowed to ignore certain punctuation e.g.
>> underscore in numbers? What would be the issues?
>
> I implemented underscores-in-numbers a while back in 8th, at no
> perceivable cost. Makes large numbers much easier to understand.

What about dictionary collisions - or does 8th handle numbers differently? Any class of number or just integers?
On 23/07/2022 21:50, Anton Ertl wrote:
> dxforth <dxforth@gmail.com> writes:
> ...
>> Q. Then you'll need a routine to strip the underscores and a temporary buffer
>> to hold the result. What do you suggest?
>> A. The HOLD buffer.
>
> No such buffer is needed. That's the beauty of >NUMBER, which has
> been designed for a very similar use case:
>
> : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
>     \ like >number, but ignores _
>     begin
>         >number
>         dup 0> while
>             over c@ '_' = while
>                 1 /string
>     repeat then ;

The idea was to avoid separate number converters.

>> Q. Won't it interfere with numeric output?
>> A. Input/output are usually mutually exclusive.
>
> Says who?

Humans - who use the same mouth to eat and speak.

>> Q. Can't it be done using recognizers?
>> A. If so, probably at more cost.
>
> What makes you think so?

The 30 odd bytes I spent would be hard to beat.
It's the potential for underscored
numbers to collide with dictionary entries that's the problem.
200x character literals have the same issue
Once you have decided that numbers should have an ignorable character
you might as well replace all occurrences of that literal by a variable. Once
you have a variable, you can choose the ignorable character at
run-time, e.g.
  ':' ign-char !
You can use a similar mechanism for the DP and FP separators.
Since we made this change there have been [...] no technical support issues.
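
A minimal sketch of that idea, assuming a Forth-2012 system. Only the name ign-char comes from the post; >NUMBER-IGN and DEMO are hypothetical names, and nothing here is claimed to match any vendor's actual implementation.

```forth
variable ign-char   '_' ign-char !      \ default ignorable character

\ Like >NUMBER, but skips whatever character is stored in ign-char.
: >number-ign ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
    begin
        >number
        dup while                       \ unconverted characters remain?
            over c@ ign-char @ = while  \ stopped on the ignorable char?
                1 /string               \ step over it and carry on
    repeat then ;

: demo ( -- )
    ':' ign-char !                      \ choose the character at run-time
    0. s" 12:34:56" >number-ign 2drop d.
    '_' ign-char ! ;                    \ restore the default

demo   \ prints 123456
```

Because the character is looked up on every call, the same source text can convert differently depending on the current setting, which is the portability concern raised in the replies.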
>> 32/64-bit machines have increased the risk of entering numbers incorrectly.
>> Should the Forth interpreter be allowed to ignore certain punctuation e.g.
>> underscore in numbers? What would be the issues?
>
> No such risk in case of underscore; to enter underscore character one has
> to press Shift-Minus - it can be done only on purpose.
> I believe one character that could be ignored the way you propose is space.
> When entering long numbers it may be comfortable to, for example, separate
> thousands by adding single space among them. It's easier to check the input
> before final Enter-press.

Using spaces in numbers? In Forth this is a bad idea.
Underscores, yes.
Who, apart from a Forth programmer, will use an underscore when entering
any number?
Zbig <zbigniew2011@gmail.com> writes:
> Who, apart from a Forth programmer, will use an underscore when entering
> any number?

According to
<https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:

|maritime "21_450"

and (more relevant):

|Ada, C# (from version 7.0[34]), D, Haskell (from GHC version 8.6.1),
|Java, Kotlin,[35] OCaml, Perl, Python (from version 3.6), PHP (from
|version 7.4[36]), Ruby, Go (from version 1.13), Rust, Julia, and
|Swift use the underscore (_) character for this purpose

- anton
dxforth <dxforth@gmail.com> writes:
> On 23/07/2022 21:50, Anton Ertl wrote:
>> dxforth <dxforth@gmail.com> writes:
>> ...
>>> Q. Then you'll need a routine to strip the underscores and a temporary buffer
>>> to hold the result. What do you suggest?
>>> A. The HOLD buffer.
>>
>> No such buffer is needed. That's the beauty of >NUMBER, which has
>> been designed for a very similar use case:
>>
>> : >number_ ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
>>     \ like >number, but ignores _
>>     begin
>>         >number
>>         dup 0> while
>>             over c@ '_' = while
>>                 1 /string
>>     repeat then ;
>
> The idea was to avoid separate number converters.

I have no idea what you mean with that.
>>> Q. Won't it interfere with numeric output?
>>> A. Input/output are usually mutually exclusive.
>>
>> Says who?
>
> Humans - who use the same mouth to eat and speak.

And the relevance to conversion from strings to numbers and numbers to strings is?

>>> Q. Can't it be done using recognizers?
>>> A. If so, probably at more cost.
>>
>> What makes you think so?
>
> The 30 odd bytes I spent would be hard to beat.

Moving the goalposts? I did not ask about beating.
What makes you think that the cost would be higher rather than just the
same if one applies the same change to a pluggable number recognizer
rather than a hardwired one?
> It's the potential for underscored
> numbers to collide with dictionary entries that's the problem.
That's no problem, just like the potential for other numbers to
collide with dictionary entries is no problem:
Dictionary entries are searched first, so if you have a word _ or __
or _1 or 1_ etc., it will be found before the number recognizer tries
to convert it into a number. The conventional way to avoid a number
being shadowed by a dictionary entry is to start the number with one
of the digits 0-9 (and avoiding dictionary entries that start with
these digits, albeit there are some exceptions that prove this rule).
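
The lookup order described here can be observed on a standard system without involving underscores at all. The following is only an illustration: it deliberately defines a word named 2, which is bad practice in real code.

```forth
\ A word whose name looks like a number shadows that number,
\ because the text interpreter searches the dictionary first.
: 2 ( -- n ) 3 ;

2 .    \ executes the word named 2, so this prints 3
```

The body of the definition is unaffected: there is no word named 3, so 3 still falls through to number conversion.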
> 200x character literals have the same issue
It's the same non-issue for the same reason. And, guess what, no
problems have been reported to us, neither for the Forth-2012
character literals ('a', implemented in Gforth since 0.7 (2008)), nor
for Gforth's older syntax ('a, implemented in Gforth since the first
public release (1996)).
- anton
Your example is especially nasty because ':' is a double indicator in SwiftForth. So someone following your suggestion would produce
programs that behave quite differently in SwiftForth.
Also bad ideas for language commonality. If you want to accept
decimal comma, accept it in addition to the decimal point. No need
for variables.
On 24/07/2022 19:35, Anton Ertl wrote:
> dxforth <dxforth@gmail.com> writes:
>> The idea was to avoid separate number converters.
>
> I have no idea what you mean with that.

You just created one. Will you create another for floats?

> And the relevance to conversion from strings to numbers and numbers to
> strings is?

I see no reason for them to collide.

>>>> Q. Can't it be done using recognizers?
>>>> A. If so, probably at more cost.
>>>
>>> What makes you think so?
>>
>> The 30 odd bytes I spent would be hard to beat.
>
> Moving the goalposts? I did not ask about beating.
> What makes you think that the cost would be higher rather than just the
> same if one applies the same change to a pluggable number recognizer
> rather than a hardwired one?

Feel free to show the code for the plug-in.
On 24 Jul 2022 at 12:05:23 CEST, "Anton Ertl" <Anton Ertl> wrote:
Your response is a typical "not invented here" response.
The DP and FP character definitions solve a *real* issue in that the
Forth standard approach cannot be used for real-world data entry.
The DP and FP char solution allows the double and FP data entry
routines to be used for data entry in various locales.
> Your example is especially nasty because ':' is a double indicator in
> SwiftForth. So someone following your suggestion would produce
> programs that behave quite differently in SwiftForth.

I have a dispute resolution protocol in a contract (yes, really) that includes
the line:
"Dispute resolution processes include the consumption of alcoholic
beverages, food and laughter."
Leon at Forth Inc and I are perfectly capable of finding a resolution.

> Also bad ideas for language commonality. If you want to accept
> decimal comma, accept it in addition to the decimal point. No need
> for variables.

I think that you do not understand locales.
dxforth <dxforth@gmail.com> writes:
> On 24/07/2022 19:35, Anton Ertl wrote:
>> dxforth <dxforth@gmail.com> writes:
>>> The idea was to avoid separate number converters.
>>
>> I have no idea what you mean with that.
>
> You just created one. Will you create another for floats?

I have no such plans. But now I know what you mean.

>> And the relevance to conversion from strings to numbers and numbers to
>> strings is?
>
> I see no reason for them to collide.

I do: Putting debugging output in string->number conversion words.

>> What makes you think that the cost would be higher rather than just the
>> same if one applies the same change to a pluggable number recognizer
>> rather than a hardwired one?
>
> Feel free to show the code for the plug-in.
This won't help at all, because it is not changed:

: rec-num ( addr u -- n/d table | notfound ) \ gforth-experimental
    \G converts a number to a single/double integer
    snumber?  dup
    IF
        0> IF  ['] recognized-dnum  ELSE  ['] recognized-num  THEN  EXIT
    THEN
    drop ['] notfound ;

The change is in s>unumber?, which is called (with one intermediate)
by snumber?. Both snumber? and s>unumber? already exist in
gforth-0.7, i.e., before recognizers. And the change consists of
replacing a call to >NUMBER with a call to >NUMBER_.
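Wanting to handle all numbers, my hardwired solution was:

; strip ( c-addr u c-addr2 -- c-addr3 u3 )
        hdr     x,'STRIP',,1
strip:  pop     di              ; c-addr2: end of destination buffer
        pop     cx              ; u: characters left to scan
        pop     bx              ; c-addr: source string
        add     bx,cx           ; start at end
        sub     dx,dx           ; count of characters kept
strip1: jcxz    strip3          ; source exhausted
        dec     bx              ; builds down
        mov     al,[bx]
        cmp     al,'_'
        jz      strip2          ; skip underscores
        dec     di
        mov     [di],al         ; copy everything else
        inc     dx
strip2: dec     cx
        jmp     strip1
strip3: push    di              ; c-addr3
        push    dx              ; u3
        nextt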
BTW: I think if „space” is too difficult to use as a „thousand separator”
ignored by Forth, I've got a better „candidate”: Vertical Tab (0Bh):
- it's practically unused anywhere
- it could be entered with, say, Shift-Space
- it could be displayed as, guess what, just a single space
> BTW: I think if „space” is too difficult to use it as „thousand separator”,
> ignored by Forth, I got a better „candidate”: Vertical Tab (0Bh):
> - it's practically unused anywhere
> - it could be entered with, say, Shift-Space
> - it could be displayed as, guess what, just a single space

Terrible bad idea, because it can't be visually discerned from a space.
Tab is not a glyph, but a control of mechanical type writers.
>> I got a better „candidate”: Vertical Tab (0Bh):
>> - it's practically unused anywhere
>> - it could be entered with, say, Shift-Space
>> - it could be displayed as, guess what, just a single space
>
> Terrible bad idea, because it can't be visually discerned from a space.
> Tab is not a glyph, but a control of mechanical type writers.

Mechanical typewriters haven't been used for a very long time,
so VT can be „misused” for more practical things than controlling non-existent hardware that is no longer available.
>> Terrible bad idea, because it can't be visually discerned from a space.
>> Tab is not a glyph, but a control of mechanical type writers.
>
> Mechanical type writers aren't used (since very long time) anymore,
> so VT can be „misused” for more practical things than controlling
> non-existent (and no longer available) hardware.

True enough; but the point that you can't see it unless a special font
is used is a valid one.
>> Mechanical type writers aren't used (since very long time) anymore,
>> so VT can be „misused” for more practical things than controlling
>> non-existent (and no longer available) hardware.
>
> True enough; but the point that you can't see it unless a special font
> is used, is a valid one.

Maybe my assumption is different, but actually I don't see any need to make
it visible. I treat it as a kind of „hard space” („non-breakable”) used sometimes
in text editors. I see its supposed invisibility rather as an advantage.
Of course, if for some particular reason that character should be visible, underscore
may be good enough.
>> The entire point of a thousands separator is to facilitate humans
>> reading large numbers or small fractions.
>
> Like this:
> 284 985 000 234,23

Great. So how should a Forth text interpreter know that this is one
number, not four? And how should a human reading this as Forth code
know that?
: strip ( c-addr u -- c-addr2 u2 )
    <# 2dup 1- over + do
        i c@ [char] _ over - if hold else drop then
    -1 +loop #> ;
Won't work on zero-length strings but irrelevant here.
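
As a usage sketch, assuming a standard system with the Double-Number word set: the stripped string lives in the pictured-numeric-output (HOLD) buffer, so it is passed to >NUMBER straight away. The names S>UD and DEMO are hypothetical.

```forth
\ STRIP as posted, repeated so the example is self-contained.
: strip ( c-addr u -- c-addr2 u2 )
    <# 2dup 1- over + do
        i c@ [char] _ over - if hold else drop then
    -1 +loop #> ;

\ Hypothetical helper: strip, then convert at once, before anything
\ else touches the HOLD buffer. Not valid for empty strings.
: s>ud ( c-addr u -- ud flag )
    strip 0. 2swap >number nip 0= ;

: demo ( -- )
    s" 12_345_678" s>ud if d. else 2drop ." not a number" then ;

demo   \ prints 12345678
```

If a word such as . or U. were called between STRIP and >NUMBER, it would use the same buffer. What happens to the stripped string then depends on the system, which is the collision debated elsewhere in the thread.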
> Great. So how should a Forth text interpreter know that this is one
> number, not four? And you should a human reading this as Forth code
> know that?

That's why I proposed VT for that. The operator, by pressing Shift-Space,
inserts VT between _groups_ of digits of the single number.
On the screen it looks like „ordinary” spaces, exactly as in the case of
„ordinary space” and „non-breakable space” (in case of text editor).
Actually employing VT could have another advantage: consider all
these „hyphenated words”. They wouldn't have to be hyphenated
any longer. Instead of „pseudo space” VT could „link” two strings
that comprise such word, making it look more natural.
So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?
Why can't you see the issues this would cause???
There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.
001 002 003 004
Again, how should a human see the difference between
unused-words
and
unused words
if you replace the "-" by something that looks like a space?
> Again, how should a human see the difference between
> unused-words
> and
> unused words
> if you replace the "-" by something that looks like a space?

Sometimes it may create a problem indeed, but taking a peek
into the glossary usually should help.

> So you want to limit the ability to write Forth code to the use of special editors, custom designed for this Forth?

No.

> Why can't you see the issues this would cause???

What issues, in particular?

> There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.
> 001 002 003 004

It depends on whether the groups of digits are separated by space or „connected” by VT.
>> There's still the problem of humans reading the code. Tell me how this will be interpreted by the text interpreter.
>> 001 002 003 004
>
> It depends on whether the groups of digits are separated by space or „connected” by VT.

That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!
Why can't you grasp this fail?
> That's the point, innit? YOU CAN'T TELL WHEN READING IT!!!
> Why can't you grasp this fail?

1. You wrote about text interpreter -- did you mean a human or Forth?
Forth won't have any problem, it'll find VT there.
2. If you mean human: if you want the others to understand you, you
have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while still keeping the groups of digits „connected” with a SINGLE VT (shown as a single space).
I honestly don't understand why you are putting so much effort into creating a problem out of nothing. You want to be properly understood? Be precise, that's all.
On Tuesday, July 26, 2022 at 4:51:25 PM UTC-4, minf...@arcor.de wrote:
> gnuarm.del...@gmail.com wrote on Tuesday, 26 July 2022 at 22:06:34 UTC+2:
>> Ok, how many spaces did I type to separate these digits?
>> 0123 4567 8901 2345
>
> At least there is a space between N and 7 in this geo coordinate: 38°17′10″N 76°24′42″W

Very helpful. ;o)
So if you had a few spaces (not vertical tabs) in your coordinate, 38° 17′ 10″ N 76° 24′ 42″ W, I believe Forth would read the number 38, then treat ° as a word, no? I suppose ' would be a problem, since that is already in use. ", however, is not in use, so that would be good. I suppose if you were looking for coordinates in text, you could redefine ' for a bit, then restore it to mean "tick". Or do I not understand how numbers are read?
> 1. You wrote about text interpreter -- did you mean 'human' of Forth?
> Forth won't have any problem, it'll find VT there.

Yes, but you then had to ask what I typed, showing the shortcoming, that a human can't tell. That was my point... unless you are not a human after all.

> 2. If you mean human: if you want the others to understand you, you
> have to be precise in your statements. So it's enough to separate two numbers with TWO (or more) spaces, while keeping the groups of digits „connected” still with SINGLE VT (shown as single space).

Ok, how many spaces did I type to separate these digits?
0123 4567 8901 2345

> I honestly don't understand why are you put so much effort into creating problem out of nothing. You want to be properly understood? Be precise, that's all.

Yes, you don't understand. That's the point.
> Yes, but you then had to ask what I typed, showing the short coming, that a human can't tell. That was my point... unless you are not a human after all.

If you write something like this: 001_002_003 004 -- I'll also have to ask you
a question, what actually you typed.
It doesn't depend on the selected separator character.

> Ok, how many spaces did I type to separate these digits?
> 0123 4567 8901 2345

Maybe now it's time for me to ask a question (you have already made
fair use of your question quota): does your Forth interpreter, and/or
your computer screen, „compress” spaces like the Google News interface? Or doesn't it?

> Yes, you don't understand. That's the point.

Never understood the people that insist on looking for problems where there aren't any. I'm not a psychologist, you know, so I don't have to.
>> Great. So how should a Forth text interpreter know that this is one number, not four? And you should a human reading this as Forth code
>> know that?
>
> That's why I proposed VT for that. The operator, by pressing Shift-Space inserts VT between _groups_ of digits of the single number.
> On the screen it looks like „ordinary” spaces - exactly, like in case of
> „ordinary space” and „non-breakable space” (in case of text editor).

i=input(); i, ord(i)
(' ', 32)
> Like this:
> 284 985 000 234,23

or '284 985 000 234.23' depending on locale?
'284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one Forth item.

> '284_985_000_234,23' has fewer problems to resolve. Ugly as it might look, it is clearly one forth item.

I was trying to explain that there are EXACTLY THE SAME „problems
to resolve” whether you connect the 3-digit groups with underscore
or with VT, but in the latter case it just... looks better.
32/64-bit machines have increased the risk of entering numbers incorrectly. Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?
Usual suspects pre-answered.
Q. Why the underscore character?
A. It's not one of the characters Forth Inc uses to denote a double number. It's increasingly used in programming languages for this purpose. Even
XPL0 has it.
A. ANS didn't see the need for it.
Q. Are you married?
Q. Should >NUMBER process the underscore?
A. No - for the same reason SCAN shouldn't handle TABs - it makes it weaker.
Q. Then you'll need a routine to strip the underscores and a temporary buffer
to hold the result. What do you suggest?
A. The HOLD buffer.
Q. Won't it interfere with numeric output?
A. Input/output are usually mutually exclusive.
Q. Won't the HOLD buffer need to be larger to hold the punctuation?
A. Assuming worst case and one underscore per 4 characters, 20% larger.
Q. Is all this just c.l.f. speculation - or have you implemented it?
A. Implemented
Q. Has it broken anything?
A. Not AFAIK
Q. What did it cost?
A. 34 bytes on 8086, 39 bytes on 8080
Q. Can't it be done using recognizers?
A. If so, probably at more cost.
Q. Will you keep it?
A. Good question. For 16-bit integers its value may be marginal. How often do you enter values in binary?
On Saturday, 23 July 2022 at 08:24:09 UTC+2, dxforth wrote:
32/64-bit machines have increased the risk of entering numbers incorrectly.
Should the Forth interpreter be allowed to ignore certain punctuation e.g. underscore in numbers? What would be the issues?
I got interested in this suggestion and implemented it.
I thought the underscore was a bit ugly so implemented a word to set the grouping char
: SET-GROUPING-CHAR ( xchar --)
    0 grping !
    dup 32 > and grping xc!+ drop ;
I also set the grouping differently based on BASE:
decimal and octal group 3 digits,
hex 4 and binary 8.
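
A rough sketch of how BASE-dependent grouping might look on the output side, using standard pictured numeric output. This is not the poster's implementation: GROUP-SIZE and U.GROUPED are hypothetical names, the separator is hard-wired to an underscore, and the default of 3 for bases other than 2 and 16 is an assumption.

```forth
decimal

\ Digits per group for the current BASE.
: group-size ( -- n )
    base @ case
        2  of 8 endof
        16 of 4 endof
        3 swap              \ default: decimal, octal and anything else
    endcase ;

\ Print an unsigned single with '_' between digit groups.
: u.grouped ( u -- )
    0 <#  0 >r              \ ud on the stack, digit count on the return stack
    begin
        #  r> 1+ >r         \ convert one digit and count it
        2dup d0= 0=         \ any digits left?
    while
        r@ group-size mod 0= if [char] _ hold then
    repeat
    r> drop  #> type space ;

123456789 u.grouped         \ prints 123_456_789
```

The separator is only inserted when more digits follow, so no leading underscore is produced when the digit count is an exact multiple of the group size.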
After that I started testing different chars. Today I use ´ ( $B4 acute accent)
I think that ties the numbers together while _ puts them apart
123´456´789 ok.
. 123´456´789 ok
'_' set-grouping-char ok
123_456_789 ok.
. 123_456_789 ok
I also tried out the space as suggested by Zbig but not using VT.
At codepoint $A0 there is a non breaking space char
$a0 set-grouping-char ok
123456789 ok.
. 123 456 789 ok
it gets more difficult to input without remapping a key.
´ is nice as it is (on my Swedish keyboard) next to the + key on the top row
no shift or alt key needed to input it.
But using the non breaking space I can now make words with spaces in them!
: Hej Peter ." Ciao Peter" ; ok
Hej Peter Ciao Peter ok
This of course looks even more confusing then spaces in numbers!
For me this improves readability enormously! Thanks for the suggestion.
P Falth wrote on Monday, 22 August 2022 at 15:59:54 UTC+2:
On Saturday, 23 July 2022 at 08:24:09 UTC+2, dxforth wrote:
[..]
Fine! I am just wondering if ´, i.e. $B4, is the same in most codepages/locales.
...
My systems require input to be utf8 encoded Unicode and will output utf8 streams.
It has worked for over 20 years like that on both Windows and Linux.
´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
Is there any reason to not use Unicode and utf8 today on Windows and Linux?
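One detail behind the codepage question: in Latin 1 and Windows 1252 the character ´ is the single byte $B4, while in UTF-8 the same code point takes two bytes. A minimal sketch of the two-byte UTF-8 encoding, valid only for code points $80 to $7FF (UTF8-2 is a hypothetical helper name, not a standard word):

```forth
\ Encode a code point in the range $80..$7FF as two UTF-8 bytes.
: utf8-2 ( xc -- b1 b2 )
  dup 6 rshift $C0 or                \ lead byte  110xxxxx
  swap $3F and $80 or ;              \ trail byte 10xxxxxx
```

`$B4 utf8-2` leaves $C2 $B4, and `$A0 utf8-2` (the non-breaking space) leaves $C2 $A0, so a grouping character above $7F only matches the input if source and system agree on the encoding.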
On Monday, 22 August 2022 at 16:44:02 UTC+2, minf...@arcor.de wrote:
[..]
There is a good reason to junk { BL WORD } in favor of TOKEN / NAME or whatever.
NAME ( -- addr n ) gets a blank-delimited token from the input stream,
with the appropriate side effects on the input stream.
Then the encoding of the characters shouldn't be a concern of the Forth system.
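On a Forth 2012 system, something like the NAME described above can be had from the standard PARSE-NAME, which returns the address and length of the next blank-delimited token without copying it into a counted string. A small sketch; TOKEN and TOKEN-BYTES are names chosen here for illustration:

```forth
\ Next blank-delimited token from the input stream, as address and length.
: token ( "name" -- c-addr u ) parse-name ;

\ Length of the next token in bytes, whatever the encoding.
: token-bytes ( "name" -- u ) token nip ;
```

The bytes of the token are passed through untouched, so `token-bytes 123´456` reports 8 when the source is UTF-8 and 7 when it is Latin 1; the parser itself never needs to know which.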
On 23/08/2022 06:50, P Falth wrote:
... My systems require input to be utf8 encoded Unicode and will
output utf8 streams. It has worked for over 20 years like that on
both Windows and Linux. ´ at $B4 is present in Windows 1252 and
Linux Latin 1 codepages. Is there any reason to not use Unicode and
utf8 today on Windows and Linux?
String literals and comment fields excepted, there's not a lot of
reason to use UTF-8 in programming code.
Underscore in numbers is about convention. Several programming
languages have adopted it as a programmer convenience. It might
bemuse other languages to know Forth had no problem giving comma et
al new meanings but drew the line at underscore.
On 23.08.2022 05:21, dxforth wrote:
[..]
I really do like writing literals in sources in UTF-8, since my system
fully supports it and has not the faintest wish to use antiquated or
strange things like CP1252, ISO-8859-xxx or UTF-16. But one quickly gets
used to writing sources in ASCII with hex escapes again when
collaborating with Windows people who are not willing or able to save
edited files as UTF-8, so that all your special characters (for me,
especially measurement units containing characters like U+00B0
(degree sign), U+00B5 (Greek mu for the micro prefix) and U+202F (narrow
no-break space between value and measurement unit)) are lost every time
one of these moron^H^H^H^H^Hfolks changes something.
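In Forth terms, "ASCII with hex escapes" could be done with the Forth 2012 word S\" and its \x escapes, assuming the system provides it. The sketch below spells out the UTF-8 bytes for U+202F (E2 80 AF) and U+00B5 (C2 B5) so that the source file itself stays pure ASCII; .UM is a hypothetical word, not taken from the post.

```forth
\ Print n followed by a narrow no-break space and the unit "µm".
\ The string is 6 bytes: E2 80 AF, C2 B5, and the letter m.
: .um ( n -- )
  0 .r  s\" \xE2\x80\xAF\xC2\xB5m" type ;
```

On a UTF-8 terminal `25 .um` shows 25 µm, and the source survives an editor that cannot save UTF-8.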
[..]123´456´789 ok.
. 123´456´789 ok
´ at $B4 is present in Windows 1252 and Linux Latin 1 codepages.
Is there any reason to not use Unicode and utf8 today on Windows and Linux?
Not wanting to contradict, but lots of Forth programs run on small systems
where UTF-8 is not present, even when the programs are developed on
feature-rich desktops.