• Re: Hebrew: General question about location of shortcut indicator "&" a

    From c.buhtz@posteo.jp@21:1/5 to All on Mon Feb 5 12:10:01 2024
    Dear Stephen,

    thanks a lot. Your explanation helped me a bit.

    How is this represented as bytes on disk?

    As a simple example, let's assume "ABC" is a word in a left-to-right
    script. As a Hebrew word (e.g. via translation) it would be written
    "CBA", because it is read from right to left, starting with "A", then
    "B", and "C" at the end.

    Am I right so far?

    Now let's add such a shortcut indicator to the first letter (the "A").
    Weblate and Qt seem to use the correct BIDI algorithm and will display
    it correctly like this:

    "CB&A" (or an underlined "A" in a Qt GUI)

    But a terminal that does not apply the BIDI algorithm shows it like
    this:

    "&CBA"

    I am aware that a Unicode character can consist of multiple bytes. In
    UTF-8 a Hebrew letter takes 2 bytes, and further code points can be
    attached to a character. I remember the emoji example of a black
    astronaut: human+rocket+black (or something like this).
    But please let's keep it simple and not open the Unicode box too much. I
    assume there is a hidden control character indicating the reading
    direction?
    So what is in the file?

    &ABC

    or

    &CBA

    I guess it is the first (&ABC), right? Is it encoded in Unicode that
    the A, the B and the C need to be read the "other way around"?
    So the I/O routine reads something like this?

    & reverted-A reverted-B reverted-C
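
    A minimal sketch of how this guess could be checked, assuming the file
    is UTF-8 encoded. The Hebrew letters alef, bet and gimel (U+05D0 to
    U+05D2) are my own stand-ins for "A", "B" and "C"; the string is a
    made-up example, not taken from a real translation file:

```python
import os
import tempfile

# Hypothetical stand-in for "&ABC": "&" followed by alef, bet, gimel,
# typed in reading (logical) order.
text = "&\u05d0\u05d1\u05d2"

# Write the string to a temporary UTF-8 file, then read the raw bytes back.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as f:
    f.write(text)
    path = f.name

with open(path, "rb") as f:
    raw = f.read()
os.remove(path)

# Show the bytes exactly as they sit in the file.
print(raw.hex(" "))  # 26 d7 90 d7 91 d7 92

# Split the byte sequence back into one chunk per character.
chunks = [ch.encode("utf-8") for ch in raw.decode("utf-8")]
print(chunks)
```

    In this sketch the file holds one byte for "&" followed by two bytes
    per Hebrew letter, in reading order, with no extra direction byte in
    between.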

    Or even in Python:

    myletters = ['&', 'A', 'B', 'C']

    # but myletters[1:] are somehow coded as "other way around"
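
    Python's unicodedata module reports the direction class that the
    Unicode database assigns to each character, so the "other way around"
    property could probably be inspected like this. This is only a sketch;
    the Hebrew letters alef, bet and gimel are my own stand-ins for "A",
    "B" and "C":

```python
import unicodedata

# "&" plus three Hebrew letters, in reading (logical) order.
myletters = ["&", "\u05d0", "\u05d1", "\u05d2"]

# Print code point, official name and bidirectional class per character.
for ch in myletters:
    name = unicodedata.name(ch)
    bidi = unicodedata.bidirectional(ch)
    print(f"U+{ord(ch):04X}  {name:20}  {bidi}")

# "R" means strong right-to-left, "ON" means other neutral.
classes = [unicodedata.bidirectional(ch) for ch in myletters]
print(classes)  # ['ON', 'R', 'R', 'R']
```

    So the direction appears to be a property of each letter itself rather
    than something stored next to it in the list.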

    OK?

    Kind
    Christian
