• [ANN] UXStrings package available (UXS_20220226).

    From Blady@21:1/5 to All on Tue Mar 1 21:47:49 2022
    Hello,

    The objective of UXStrings is Unicode and dynamic length support for
    strings in Ada.

    The UXStrings API is inspired by Ada.Strings.Unbounded in order to
    minimize the work of adapting existing Ada source code.

    Changes from last publication:
    - Ada.Strings.UTF_Encoding.Conversions fix is no longer needed with GNAT
    CE 2021
    - A few fixes

    Available on GitHub (https://github.com/Blady-Com/UXStrings) and also on
    Alire (https://alire.ada.dev/crates/uxstrings.html).

    Feedback is welcome on actual use cases.

    Regards, Pascal.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Vincent D.@21:1/5 to All on Wed Mar 23 02:42:40 2022
    On Tuesday, March 1, 2022 at 21:47:51 UTC+1, Blady wrote:
    Feedback is welcome on actual use cases.

    Hello Pascal,
    Thank you very much for this great improvement over Unbounded Strings!
    A short string optimization, such as the one implemented in GNATColl.XStrings, would certainly be appreciated.
    As a matter of personal taste, I would appreciate having a UXCharacter type that is a Wide_Wide_Character, and an ASCII_Character (or a Char) that is a subtype of it.
    I think that ASCII_String could be a type derived from UXString, since it is a proper subtype that restricts UXString to ASCII characters only. Some primitive operations could then be overridden to take advantage of the direct mapping between bytes and characters.
    Regards,
    Vincent

  • From Blady@21:1/5 to All on Thu Mar 24 22:13:31 2022
    On 23/03/2022 at 10:42, Vincent D. wrote:
    On Tuesday, March 1, 2022 at 21:47:51 UTC+1, Blady wrote:
    Feedback is welcome on actual use cases.

    Hello Pascal,
    Thank you very much for this great improvement over Unbounded Strings!
    A short string optimization, such as the one implemented in GNATColl.XStrings, would certainly be appreciated.
    As a matter of personal taste, I would appreciate having a UXCharacter type that is a Wide_Wide_Character, and an ASCII_Character (or a Char) that is a subtype of it.
    I think that ASCII_String could be a type derived from UXString, since it is a proper subtype that restricts UXString to ASCII characters only. Some primitive operations could then be overridden to take advantage of the direct mapping between bytes and characters.

    Hello Vincent,

    I had some thoughts about a "generic" character concept and felt that the
    user would mostly choose the Unicode representation (the same as
    Wide_Wide_Character), which brings the maximum available character set at
    a small cost of 4 bytes per character. The Ada standard library comes with
    all sorts of conversion subprograms to ASCII, Latin-1...
    Thus, for UXStrings, I chose the Unicode_Character type as the "generic"
    character (which renames Wide_Wide_Character), see for instance: https://github.com/Blady-Com/UXStrings/blob/master/src/uxstrings1.ads#L58

    Could you be more specific?
    What advantages would a UXCharacter type bring to the user?

    Regards, Pascal.
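
    As a rough illustration of the "small cost of 4 bytes" trade-off mentioned
    above, the sketch below compares how many bytes the same text occupies in a
    fixed-width 32-bit encoding and in UTF-8, UTF-16 and Latin-1. It is written
    in Python purely for illustration: it is not UXStrings or Ada code, and the
    function name encoded_sizes is hypothetical.

```python
# Illustrative only: a Python str is used here as a stand-in for a
# sequence of Unicode code points. Nothing below is UXStrings API.

def encoded_sizes(text):
    """Return the byte length of `text` in several encodings.

    The "latin-1" entry is present only when every character of `text`
    exists in Latin-1.
    """
    sizes = {
        "utf-8": len(text.encode("utf-8")),       # 1 to 4 bytes per code point
        "utf-16": len(text.encode("utf-16-le")),  # 2 or 4 bytes, no BOM
        "utf-32": len(text.encode("utf-32-le")),  # always 4 bytes, no BOM
    }
    try:
        sizes["latin-1"] = len(text.encode("latin-1"))
    except UnicodeEncodeError:
        # At least one character is outside Latin-1 (for example the euro sign).
        pass
    return sizes


if __name__ == "__main__":
    for sample in ("Ada", "é", "Ada é €"):
        print(repr(sample), encoded_sizes(sample))
```

    The fixed-width form uses more memory for mostly-ASCII text, but every
    code point can be indexed directly, which is the usual argument for a
    32-bit character type.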

  • From Vincent D.@21:1/5 to All on Wed Mar 30 06:02:55 2022
    On Thursday, March 24, 2022 at 22:13:40 UTC+1, Blady wrote:
    Thus, for UXStrings, I chose the Unicode_Character type as the "generic"
    character (which renames Wide_Wide_Character), see for instance: https://github.com/Blady-Com/UXStrings/blob/master/src/uxstrings1.ads#L58

    Could you be more specific?
    What advantages would a UXCharacter type bring to the user?

    Hello Pascal,
    We agree that a type for Unicode code points is mandatory. I find the name Wide_Wide_Character clumsy, and I would appreciate shorter names, so as a matter of personal taste I would simply prefer "Unicode" to "Unicode_Character".
    Then I realized (and here I contradict my own previous post) that the important concept for the user is the grapheme cluster. So in fact a UXCharacter should simply be a subtype of UXString storing one grapheme cluster.
    Just my 2 cents.
    Regards,

    Vincent
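
    To make the distinction between a code point and a grapheme cluster
    concrete, here is a minimal sketch in Python (chosen only for illustration;
    it is not Ada and not part of UXStrings). The helper names are hypothetical,
    and approx_grapheme_count is a deliberately crude approximation that only
    skips combining marks. It does not implement the full Unicode text
    segmentation rules.

```python
import unicodedata

# The same user-perceived character "é", spelled two ways.
PRECOMPOSED = "\u00e9"   # one code point: LATIN SMALL LETTER E WITH ACUTE
DECOMPOSED = "e\u0301"   # two code points: "e" + COMBINING ACUTE ACCENT


def code_points(text):
    """List the code points of `text` in U+XXXX notation."""
    return ["U+%04X" % ord(ch) for ch in text]


def same_after_nfc(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def approx_grapheme_count(text):
    """Count code points that are not combining marks.

    Crude approximation of the number of grapheme clusters: it ignores
    emoji sequences, Hangul jamo and other segmentation rules.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    for label, s in (("precomposed", PRECOMPOSED), ("decomposed", DECOMPOSED)):
        print(label, code_points(s), "graphemes ~", approx_grapheme_count(s))
    print("equal as code point sequences:", PRECOMPOSED == DECOMPOSED)
    print("equal after NFC normalization:", same_after_nfc(PRECOMPOSED, DECOMPOSED))
```

    A character type defined as a single code point can hold the first
    spelling but not the second, which is the motivation for treating a
    "character" as a short string holding one grapheme cluster.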

  • From Simon Wright@21:1/5 to Vincent D. on Wed Mar 30 20:53:03 2022
    "Vincent D." <vincent.diemunsch@gmail.com> writes:

    Then I realized (and here I contradict my own previous post) that
    the important concept for the user is the grapheme cluster. So in
    fact a UXCharacter should simply be a subtype of UXString storing one grapheme cluster.

    Personally I like the semantic - I know this[1] is a macOS problem, but
    it comes to something when you get a warning like

    páck3.ads:1:10: warning: file name does not match unit name,
    should be "páck3.ads"

    [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114#c1

  • From G.B.@21:1/5 to Simon Wright on Thu Mar 31 15:51:35 2022
    On 30.03.22 21:53, Simon Wright wrote:

    Personally I like the semantic - I know this[1] is a macOS problem, but
    it comes to something when you get a warning like

    páck3.ads:1:10: warning: file name does not match unit name,
    should be "páck3.ads"

    [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114#c1

    Slightly OT, varying the PR:

    "Right. And people should use sane floating point types
    (and sane CPUs to begin with)."

    7-bit people have created much work in the past,
    by insisting that human beings should succumb to ASCII.
    Learn how to type. Start at LE vs BE if flexibility vexes you.
    Even C was ahead of its time by defining plain char.
    See gcc's -funsigned-char vs libraries.
