DIGIT ( char base -- u true | char false )
I have not found Gforth's DIGIT? (like DIGIT) restrictive yet.
I thought I would try and see if it's really a PITA ;
attempt 1
: adjch ( ch -- n T | ch F )  \ digit value of ch, ignoring the base
  dup '0' '9' 1+ within if '0' - TRUE exit then
  dup 'A' 'Z' 1+ within if 'A' - 10 + TRUE exit then
  dup 'a' 'z' 1+ within if 'a' - 10 + TRUE exit then
  FALSE ;
: >dgt ( ch base -- n T | ch F )  \ succeed only if 0 <= n < base
  over adjch if dup >r
    0 rot within if drop r> true else r> drop FALSE then
  else drop drop FALSE then ;
On 19/06/2022 09:57, NN wrote:
> I thought I would try and see if its really a PITA ;
> ...
What did you conclude?
If the standard hadn't mandated base to be 2..36, we could
have gone all the way up to base 62 (treating 'A' and 'a' as
different).
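A case-sensitive digit mapper for bases up to 62 could be sketched as follows. The names ch>62 and >dgt62 and the digit order (0-9, then A-Z, then a-z) are assumptions made for illustration; they do not come from the standard or from any system mentioned in this thread.

```forth
\ Hypothetical sketch: case-sensitive digits,
\ '0'..'9' -> 0..9, 'A'..'Z' -> 10..35, 'a'..'z' -> 36..61.
: ch>62 ( ch -- n true | ch false )
  dup '0' '9' 1+ within if '0' -      true exit then
  dup 'A' 'Z' 1+ within if 'A' - 10 + true exit then
  dup 'a' 'z' 1+ within if 'a' - 36 + true exit then
  false ;

\ Accept the digit only if its value is below the base.
: >dgt62 ( ch base -- n true | ch false )
  over ch>62 if                     \ ch base n
    tuck u> if nip true exit then   \ n < base: drop ch, keep n
    drop false exit                 \ n >= base: drop n, keep ch
  then                              \ ch base ch
  2drop false ;

'z' 62 >dgt62 . .   \ prints -1 61
'a' 36 >dgt62 . .   \ prints 0 97 ('a' has value 36 here, so base 36 rejects it)
```

With this mapping 'a' is no longer a synonym for 'A', so existing hex source written in lowercase would stop converting as expected.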
NN <novembe...@gmail.com> writes:
> I thought I would try and see if its really a PITA ;
> attempt 1
> : adjch ( ch -- n T | ch F )
> dup '0' '9' 1+ within if '0' - TRUE exit then
> dup 'A' 'Z' 1+ within if 'A' - 10 + TRUE exit then
> dup 'a' 'z' 1+ within if 'a' - 10 + TRUE exit then
> FALSE ;

This is pretty good for BASE-36 conversion, if you do it the branchy
way. Only three branches. I thought about using binary search, but
it's not necessarily better.

> : >dgt ( ch base -- n T | ch F )
> over adjch if dup >r
> 0 rot within if drop r> true else r> drop False then
> else drop drop FALSE then ;

This is ugly, though.

One thing to note is that we don't need the upper bounds of ADJCH,
because that is handled in >DGT; let's see what we can do with that:
: adjch1 ( ch -- n1 | -n2 )
dup '0' '9' 1+ within 'A' '0' - and
over 'A' u>= 10 and +
over 'a' u>= 'A' 'a' - and + + 'A' - ;
This is somewhat intricate. The first and the third 'A' need to be
the same value and at least as large as 'A' (and not too large). It
results in all non-digits below 'A' being negative. The cool thing
about this implementation is that it is branchless and does not need a
table in memory. On VFX 4.72 it takes 51 bytes, 20 instructions, on
VFX64 66 bytes and 20 instructions.
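To make the arithmetic concrete, here are a few hand-computed values. The definition is repeated so the snippet loads on its own, and the guard for u>= is an addition for systems that lack that non-standard word. Negative results, and results of 36 or more, are what the unsigned comparison in >DGT is expected to reject.

```forth
\ u>= is not a standard word; define it only if the system lacks it.
[undefined] u>= [if] : u>= ( u1 u2 -- f ) u< 0= ; [then]

\ Definition repeated from the post above.
: adjch1 ( ch -- n1 | -n2 )
  dup '0' '9' 1+ within 'A' '0' - and
  over 'A' u>= 10 and +
  over 'a' u>= 'A' 'a' - and + + 'A' - ;

'7' adjch1 .   \ 55 + 17 - 65       =  7
':' adjch1 .   \ 58 - 65            = -7  (non-digit below 'A')
'A' adjch1 .   \ 65 + 10 - 65       = 10
'[' adjch1 .   \ 91 + 10 - 65       = 36  (too large for any base up to 36)
'z' adjch1 .   \ 122 + 10 - 32 - 65 = 35
```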
Let's see what we can do about >DGT:
: >DGT ( ch base -- n true | ch false )
over adjch1 tuck u> dup >r 0= select r> ;
Where SELECT can be defined as
: select ( u1 u2 f -- u )
  \ If f is false, u is u2, otherwise u1.
  IF swap THEN nip ;
Or it can be implemented with a conditional move instruction.
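Where no conditional move is available, a masking variant is another way to avoid the branch. This is a different technique from a conditional move, sketched here under the assumption that f is a well-formed flag (0 or -1); the name select-mask is made up for this example.

```forth
\ Hypothetical branch-free SELECT using bit masking.
\ Requires f to be exactly 0 or -1 (all bits set).
: select-mask ( u1 u2 f -- u )
  >r tuck xor      \ u2 u1^u2          R: f
  r> and xor ;     \ u2 ^ ((u1^u2) & f)

1 2 true  select-mask .   \ prints 1
1 2 false select-mask .   \ prints 2
```

Unlike the IF version, this gives wrong results for flags such as 1 that are nonzero but not all-bits-set. In >DGT the flag comes from 0=, so it is well-formed there.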
- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html
I like your implementation of adjch1. It's what I tried to do and failed, which is
why I ended up using the three WITHINs, which I wanted to avoid.
It reminded me of the old saying about writing dumb code vs. clever code.
I wrote dumb code, but clever code is better in this example.
NN <november.nihal@gmail.com> writes:
> If the standard hadn't mandated base to be 2..36 , we could
> have gone all the way up to base 62. ( treat 'A' and 'a' as
> different )
The standard does not guarantee that BASE>36 works, but systems are
free to support more. In Gforth you can sensibly use bases up to 42:
#42 base ! ok
#36 . [ ok
#37 . \ ok
#38 . ] ok
#39 . ^ ok
#40 . _ ok
#41 . ` ok
I just tried
#42 base !
#41 .
0` .
on iForth, lxf, sf, and vfx4, and they all output "`" for the second
and third lines, so it seems they can all manage base 42.
The next character after "`" is "a", so to support larger bases, a
more complicated implementation that skips a..z would be necessary on
case-insensitive Forth systems. The benefit is minuscule, so I doubt
that anybody went there.
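The session output above is consistent with the conventional mapping in which digit values from 10 upward simply continue through the ASCII table starting at 'A'. A minimal sketch of that mapping follows; the name digit>ch is hypothetical and the code is not taken from the source of any system mentioned here.

```forth
\ Hypothetical sketch of the conventional digit-to-character mapping.
: digit>ch ( u -- ch )
  dup 10 u< if '0' + else 10 - 'A' + then ;

#35 digit>ch emit   \ Z
#36 digit>ch emit   \ [
#41 digit>ch emit   \ `
#42 digit>ch emit   \ a  (already the digit for 10 on case-insensitive systems)
```

The last line shows where the collision starts: value 42 would print as "a", which a case-insensitive input routine reads back as 10.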
- anton
In article <2022Jun1...@mips.complang.tuwien.ac.at>,[..]
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> NN <novembe...@gmail.com> writes:
>> If the standard hadn't mandated base to be 2..36 , we could
>> have gone all the way up to base 62. ( treat 'A' and 'a' as
>> different )
> The standard does not guarantee that BASE>36 works, but systems are
> free to support more. In Gforth you can sensibly use bases up to 42:
> #42 base ! ok
> #36 . [ ok
> #37 . \ ok
ciforth went there. As soon as the base is not decimal, the exponent
sign becomes _.
S[ ] OK DECIMAL 1E1 FS.
9.999999999999999999E0
S[ ] OK HEX 1_1 FS.
On Sunday, June 19, 2022 at 8:59:52 PM UTC+2, none albert wrote:
> In article <2022Jun1...@mips.complang.tuwien.ac.at>,[..]
> Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> NN <novembe...@gmail.com> writes:
>>> If the standard hadn't mandated base to be 2..36 , we could
>>> have gone all the way up to base 62. ( treat 'A' and 'a' as
>>> different )
>> The standard does not guarantee that BASE>36 works, but systems are
>> free to support more. In Gforth you can sensibly use bases up to 42:
>> #42 base ! ok
>> #36 . [ ok
>> #37 . \ ok
> ciforth went there. As soon as the base is not decimal, the exponent
> sign becomes _.
> S[ ] OK DECIMAL 1E1 FS.
> 9.999999999999999999E0
> S[ ] OK HEX 1_1 FS.
iForth:
FORTH> 72 base ! ok
FORTH> A . A ok
FORTH> X . X ok
FORTH> Z . Z ok
FORTH> a . a ok
FORTH> z . z ok
FORTH> z decimal . 67 ok
-marcel
On Sunday, 19 June 2022 at 03:25:58 UTC+1, dxforth wrote:
> On 19/06/2022 09:57, NN wrote:
>> I thought I would try and see if its really a PITA ;
>> ...
> What did you conclude?
If the standard hadn't mandated base to be 2..36, we could
have gone all the way up to base 62 (treating 'A' and 'a' as
different).
ciforth went there. As soon as the base is not decimal, the exponent...
sign becomes _.
The _ is Algol68 compatible, going as far as 5F.
Hex numbers can use base 0x40; this is useful because
FP numbers can be represented exactly.
On Monday, June 20, 2022 at 10:06:38 AM UTC+2, Anton Ertl wrote:
> [..]
> If you want easy conversion to binary FP numbers, binary, octal, hex
> or base 32 will do, no need for base 64. The way other languages
> (apparently starting with C99) seem to have standardized on is hex
> mantissa digits, "p" (instead of "e") to start the exponent, and
> decimal exponent digits.
"Convert among IEEE 754 32-bit or 64-bit float, C99 mixed hex/decimal
string, and raw hex string formats" published by David N. Williams higher
up in this newsgroup.
albert@cherry.(none) (albert) writes:
> ciforth went there. As soon as the base is not decimal, the exponent...
> sign becomes _.
> The _ is algol68 compatible, going as far as 5F.
That's unfortunate, because _ is a common digit group separator, not
just in Ada, C#, D, Haskell, Java, Kotlin, OCaml, Perl, Python, PHP,
Ruby, Go, Rust, Julia, and Swift
<https://en.wikipedia.org/wiki/Decimal_separator#Data_versus_mask>,
but also outside computing:
<https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping>:
|For ease of reading, numbers with many digits may be divided into
|groups using a delimiter, such as [...] underbar "_" (as in maritime
|"21_450") [...]
> Base 0x40 is useful, because
> fp numbers can be represented exactly.
If you want easy conversion to binary FP numbers, binary, octal, hex
or base 32 will do, no need for base 64. The way other languages
(apparently starting with C99) seem to have standardized on is hex
mantissa digits, "p" (instead of "e") to start the exponent, and
decimal exponent digits.
- anton
In article <2022Jun20.093304@mips.complang.tuwien.ac.at>,
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
> albert@cherry.(none) (albert) writes:
>> ciforth went there. As soon as the base is not decimal, the exponent...
>> sign becomes _.
>> The _ is algol68 compatible, going as far as 5F.
> That's unfortunate, because _ is a common digit group separator, not
> just in Ada, C#, D, Haskell, Java, Kotlin, OCaml, Perl, Python, PHP,
> Ruby, Go, Rust, Julia, and Swift
Python?
~$ echo aap=123456E1 | python
~$ echo aap=1234_56E1 | python
File "<stdin>", line 1
aap=1234_56E1
^
SyntaxError: invalid syntax
> <https://en.wikipedia.org/wiki/Decimal_separator#Data_versus_mask>,
For ease of reading, the space largely suffices. It can be used this
way in Algol68, because white space has no meaning in Algol68, except
in strings.