Re: Hebrew: General question about location of shortcut indicator "&" a
From
c.buhtz@posteo.jp@21:1/5 to
All on Mon Feb 5 12:10:01 2024
Dear Stephen,
thanks a lot. Your explanation helped me a bit.
How is this represented as bytes on disk?
As a simple example, let's assume "ABC" is a word in a left-to-right
script. As a Hebrew word (e.g. via translation) it would be written
"CBA" because it is read from right to left, starting with "A", then
"B", with "C" at the end.
Am I right so far?
Now let's add such a shortcut indicator to the first letter (the "A").
Weblate and Qt seem to use the correct BIDI algorithm and display
it correctly like this:
"CB&A" (or an underlined "A" in a Qt GUI)
But a terminal that does not use the correct BIDI algorithm shows it
like this:
"&CBA"
I am aware that a Unicode character can consist of multiple bytes, and
that additional code points can follow it to form one displayed
character. I remember the emoji example of a Black astronaut:
human+rocket+black (or something like this).
But please let's keep it simple and not open the Unicode box too much. I
assume there is a hidden control character indicating the reading
direction?
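To make that assumption testable: as far as I can tell, Python's
standard unicodedata module exposes a per-character "bidirectional"
class. Below is a minimal sketch that prints it for each character.
The Hebrew letters alef, bet, gimel are my stand-ins for "A", "B", "C",
and the function name bidi_classes is made up for illustration.

```python
import unicodedata

# Stand-ins for "&", "A", "B", "C": ampersand plus alef, bet, gimel.
# Escapes are used so the source line itself is not reordered by an editor.
SAMPLE = "&\u05d0\u05d1\u05d2"


def bidi_classes(text):
    """Return (code point, Unicode name, bidi class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for row in bidi_classes(SAMPLE):
        print(*row)
```

If I understand it correctly, the letters should report class "R" and
the "&" a neutral class. That would suggest the direction is a property
of each letter rather than a hidden control character in the string.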
So what is in the file?
&ABC
or
&CBA
I guess it is the first (&ABC), right? Is it encoded in Unicode that
the A, the B and the C need to be read the "other way around"?
So the I/O algorithm reads something like this?
& reverted-A reverted-B reverted-C
Or even in Python:
myletters = ['&', 'A', 'B', 'C']
# but myletters[1:] are somehow coded as "other way around"
OK?
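Here is a slightly fuller sketch of how I would check the file contents
myself. It assumes UTF-8 encoding and that the string is stored in the
order it is read (logical order), which is exactly the assumption I am
asking about. Alef, bet, gimel again stand in for "A", "B", "C", and the
helper name roundtrip_bytes is hypothetical.

```python
import os
import tempfile

# "&" followed by alef, bet, gimel, in reading (logical) order.
LABEL = "&\u05d0\u05d1\u05d2"


def roundtrip_bytes(text):
    """Write text to a temporary UTF-8 file and return the raw bytes read back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        # Clean up the temporary file whether or not the round trip worked.
        os.remove(path)


if __name__ == "__main__":
    raw = roundtrip_bytes(LABEL)
    print(raw.hex(" "))                # raw bytes as stored in the file
    print(list(raw.decode("utf-8")))   # characters in stored order
```

If the first byte is 0x26 (the "&"), then the file holds "&ABC" in my
notation, and any reordering happens only when the text is displayed.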
Kind
Christian