set sz="ß",SZ="ẞ" write $ascii(sz)," ",$ascii(SZ)
223 7838
set umch="äëïöüÿÄËÏÖÜŸ" for i=1:1:$length(umch) write $ascii($extract(umch,i))," "
228 235 239 246 252 255 196 203 207 214 220 376
write "Öhman"]"Pfaff"," ","Ohman"]"Pfaff"
1 0
write "Öhman"]]"Pfaff"," ","Ohman"]]"Pfaff"
1 0
I don't have any code for this "on the shelf", but I'd start by going through the strings and replacing all the compound characters with their components, i.e. translating "ä" into "ae", "ß" into "sz", etc., and then comparing the results in the "old-fashioned" way.
(That would work for most cases. My German isn't too good, but I am aware that "ß" should sometimes become "sz" and sometimes "ss"...)
Hope this works as a starting point,
Ed
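A minimal sketch of the replace-then-compare idea, assuming YottaDB runs in UTF-8 mode so that $extract and $length work on characters rather than bytes. The labels cmpde and expand are hypothetical names chosen for illustration, and "ß" is mapped to "ss" here; change the table entry if "sz" is wanted.

```mumps
cmpde(a,b) ; hypothetical helper: 1 if a follows b after expansion, else 0
 quit $$expand(a)]$$expand(b)
 ;
expand(s) ; replace each compound character with its components
 new i,c,out,map
 set map("ä")="ae",map("ö")="oe",map("ü")="ue"
 set map("Ä")="Ae",map("Ö")="Oe",map("Ü")="Ue"
 set map("ß")="ss" ; assumption: use "sz" here if that convention is wanted
 set out=""
 ; copy s character by character, substituting mapped characters
 for i=1:1:$length(s) set c=$extract(s,i),out=out_$select($data(map(c)):map(c),1:c)
 quit out
```

With this, $$cmpde("Öhman","Pfaff") compares "Oehman" with "Pfaff" and returns 0, where the plain "Öhman"]"Pfaff" shown earlier returns 1.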
I'm German, but I wasn't sure about the correct sort order.
It seems that there are two options:
1. ä=a, ö=o, ü=u, ß=ss ------ Example: Bäcker->Bader->Bäder->Busse->Buße
2. ä=ae,ö=oe,ü=ue, ß=ss ---- Example: Bader->Bäcker->Bäder->Busse->Buße
Both versions are in use. MS Word uses option 1; phone books are sorted according to option 2.
Hope this helps.
Jens
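The two options above can be tried out with a small sketch that builds a comparison key per word and lets local array subscripts do the ordering. It assumes YottaDB in UTF-8 mode with default (M standard) collation of local variables; the labels desort and key are hypothetical names, not an existing utility.

```mumps
desort ; hypothetical demo: order the example words under both options
 new opt,w,k,idx,i,words
 set words="Bäcker,Bader,Bäder,Busse,Buße"
 for opt=1,2 do
 . kill idx
 . ; first subscript is the sort key, second keeps the original word
 . for i=1:1:$length(words,",") set w=$piece(words,",",i),idx($$key(w,opt),w)=""
 . write "option ",opt,":"
 . set k="" for  set k=$order(idx(k)) quit:k=""  do
 . . set w="" for  set w=$order(idx(k,w)) quit:w=""  write " ",w
 . write !
 quit
 ;
key(s,opt) ; opt=1: ä=a, ö=o, ü=u; opt=2: ä=ae, ö=oe, ü=ue; ß=ss in both
 new i,c,e,out
 set e=$select(opt=2:"e",1:"")
 set out=""
 for i=1:1:$length(s) do
 . set c=$extract(s,i)
 . if "äöüÄÖÜ"[c set c=$translate(c,"äöüÄÖÜ","aouAOU")_e
 . else  if c="ß" set c="ss"
 . set out=out_c
 quit out
```

Words whose keys are equal (Bader/Bäder, Busse/Buße) are ordered by the second subscript, i.e. by the original spelling. Running do desort should list the words in the two orders given in the examples above.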
On Tuesday, November 23, 2021 at 8:13:10 AM UTC-5, Jens wrote:
> I'm German, but I wasn't sure about the correct sort order.
> It seems that there are two options:
> 1. ä=a, ö=o, ü=u, ß=ss ------ Example: Bäcker->Bader->Bäder->Busse->Buße
> 2. ä=ae,ö=oe,ü=ue, ß=ss ---- Example: Bader->Bäcker->Bäder->Busse->Buße
> Both versions are in use. MS Word uses option 1; phone books are sorted according to option 2.
> Hope this helps.
> Jens
Thanks Jens. I was trying to help a German friend who uses YottaDB to index some text as a personal, post-retirement project. It seems that things are not as simple as I thought! Out of curiosity, what sorting order do dictionaries use?
Regards
– Bhaskar
K.S. Bhaskar wrote on Tuesday, 23 November 2021 at 16:16:16 UTC+1:
> Thanks Jens. I was trying to help a German friend who uses YottaDB to index some text as a personal, post-retirement project. It seems that things are not as simple as I thought! Out of curiosity, what sorting order do dictionaries use?
> Regards
> – Bhaskar
I just looked into a German/English dictionary, and it is sorted like option 1.
Regards, Jens
PS: If I can help your friend in any way, I will gladly do so. I still like coding in M.
PPS: I'm currently working on a Visual Studio Code extension to check correct NEWing of M local variables. :-)