Hello Pascal,
Thank you for this contribution. Here are some comments:
- Since UTFString is a class ("a tagged record type"), why don't you create an abstract root "UXString" and then derive specialized object types, such as UTF_8_XString, UTF_16_XString, ASCII_XString, Win_1252_XString, Latin_XString, etc.?
- The default format to convert between different encodings should be UTF-8 as it is now ubiquitous.
> [...] moreover in the case of strings accentuated in French and strings containing emojis the process times are also improved (factor 7 to 8 by compared to UXStrings1
- I find it quite astonishing to have a factor of 8 compared to UTF-8 encoding. Do you have an explanation? This looks like a poor implementation, because UTF-8 encoding is fast and allows direct manipulation in most cases. Maybe it is because random access is treated as sequential access for UTF-8 encoded strings, but that again would be a poor implementation.
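
To make the random-access point concrete, here is a minimal sketch of what "sequential access" means for a UTF-8 encoded string. It is written in Python rather than Ada for brevity, it is not taken from UXStrings, and the function names `utf8_byte_offset` and `utf8_char_at` are hypothetical. A naive lookup of the n-th code point has to scan from the start of the buffer, because code points occupy 1 to 4 bytes each:

```python
def utf8_byte_offset(data: bytes, n: int) -> int:
    """Return the byte offset of the n-th code point (0-based) in UTF-8 data.

    This is a linear scan: the cost grows with n, unlike indexing a
    fixed-width encoding.
    """
    count = 0
    for offset, byte in enumerate(data):
        # Continuation bytes look like 10xxxxxx; any other byte starts a code point.
        if byte & 0xC0 != 0x80:
            if count == n:
                return offset
            count += 1
    raise IndexError("code point index out of range")


def utf8_char_at(data: bytes, n: int) -> str:
    """Decode the n-th code point by locating its first byte, then its end."""
    start = utf8_byte_offset(data, n)
    end = start + 1
    while end < len(data) and data[end] & 0xC0 == 0x80:
        end += 1
    return data[start:end].decode("utf-8")


# Sample text with accented letters and an emoji (illustrative data only).
text = "déjà vu 😀!"
encoded = text.encode("utf-8")
```

Calling `utf8_char_at(encoded, i)` inside a loop over every index repeats the scan each time, so a full traversal becomes quadratic. Whether this is what explains the measured factor is only a guess on my side; an implementation that iterates with a cursor, or caches the last offset, would avoid it.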