How likely is it that that there's a faster way than doing just a binary lookup, i.e. something like the current
mov edx, in_plate
bswap edx
mov esi, 1
mov edi, type plates / type plate
@01:
lea eax, [esi + edi]
and eax, -2
shl eax, 5
mov ebx, dword ptr plates[eax - type plate]
bswap ebx
shr eax, 6
cmp ebx, edx
je @04
jl @02
lea edi, [eax - 1]
jmp @03
@02:
lea esi, [eax + 1]
@03:
cmp esi, edi
jle @01
dw op_ud2
@04:
ret
where plates is an array of plate:
//------------------------------------------------
// List of all countries / nationalities
//
// - sign : license plate sign
//
// - stat : status
// ' ': official
// '*': unofficial
// - visit:
// '!': visited (non-hitching, comment only)
// 'V': visited
// ' ': unvisited
// - hitch:
// 'H': hitched-in
// '*': hitched through
// - local:
// '+': in-country ride with local
// '-': out-of-country ride with nationality //------------------------------------------------
type
plate = record
sign : tycona; // array [1..4] of char
stat : char;
visit: char;
hitch: char;
local: char;
cntyl: byte;
cntys: array [1..55] of char;
end;
Thanks,
Robert
This one cries out for a hash table!
Since you have a fixed set of codes, you can spend a little time finding a fast
hash that converts this into a single byte value with zero collisions, at which
point the hash entry is your country code?
Try something like c[0]+c[1]*2+c[2]*4 as a first attempt, then mix in a few XORsWhich would handsomely beat a binary lookup...
or rotations instead of those SHL only ops.
It should be doable in well under 10 cycles and maybe under 5...
If a near-perfect hash is too slow or hard to find, then you could allow a fixed
maximal number of collisions (like max two entries) and scan those, but that needs room for the original values in the hash table.
Robert Prins wrote:
How likely is it that that there's a faster way than doing just a binary
lookup, i.e. something like the current
mov edx, in_plate
bswap edx
mov esi, 1
mov edi, type plates / type plate
@01:
lea eax, [esi + edi]
and eax, -2
shl eax, 5
mov ebx, dword ptr plates[eax - type plate]
bswap ebx
shr eax, 6
cmp ebx, edx
je @04
jl @02
lea edi, [eax - 1]
jmp @03
@02:
lea esi, [eax + 1]
@03:
cmp esi, edi
jle @01
dw op_ud2
@04:
ret
where plates is an array of plate:
//------------------------------------------------
// List of all countries / nationalities
//
// - sign : license plate sign
//
// - stat : status
// ' ': official
// '*': unofficial
// - visit:
// '!': visited (non-hitching, comment only)
// 'V': visited
// ' ': unvisited
// - hitch:
// 'H': hitched-in
// '*': hitched through
// - local:
// '+': in-country ride with local
// '-': out-of-country ride with nationality
//------------------------------------------------
type
plate = record
sign : tycona; // array [1..4] of char
stat : char;
visit: char;
hitch: char;
local: char;
cntyl: byte;
cntys: array [1..55] of char;
end;
On 2023-02-15 08:14, Terje Mathisen wrote:
This one cries out for a hash table!
Yes, but with the 1/2/3 letter codes being the (un)official car licence
plate codes of all 195 independent countries, with some extra
odd-man-out ones like GBJ/GBG, and the oddest, SMA, I've given this a
try, but GNU's GPERF doesn't come up with anything. And yes, I've also
read <http://courses.cs.vt.edu/~cs3114/Fall10/Notes/Supplemental/p17-cichelli.pdf> ;)
Since you have a fixed set of codes, you can spend a little time
finding a fast hash that converts this into a single byte value with
zero collisions, at which point the hash entry is your country code?
I've tried I don't know how many variations, but nothing has come close,
and although it will be slower than hashing, it might well be that a
simple lookup of the first three of four most common
countries/nationalities (D/B/LT/GB) might be a decent compromise, I did
use this technique at a Belgian employer where the three most common nationalities were B, NL, and F, accounting for more than 90% of the
insured. The best I got out of GPERF was a table of 425 entries, but the email from 2016 didn't include the hash function. :( But being someone
who rarely, if ever, deletes something, it might still be around.
Try something like c[0]+c[1]*2+c[2]*4 as a first attempt, then mix inWhich would handsomely beat a binary lookup...
a few XORs or rotations instead of those SHL only ops.
It should be doable in well under 10 cycles and maybe under 5...
If a near-perfect hash is too slow or hard to find, then you could
allow a fixed maximal number of collisions (like max two entries) and
scan those, but that needs room for the original values in the hash
table.
I've got the code that Borland used in TP (4?/5?6/7) to do hashing, and
that should be (relatively) easy to adapt.
Robert
Can you please post the full list? comma, space or newline separated?
Terje
Robert Prins wrote:
On 2023-02-15 08:14, Terje Mathisen wrote:
This one cries out for a hash table!
Yes, but with the 1/2/3 letter codes being the (un)official car licence plate
codes of all 195 independent countries, with some extra odd-man-out ones like
GBJ/GBG, and the oddest, SMA, I've given this a try, but GNU's GPERF doesn't >> come up with anything. And yes, I've also read
<http://courses.cs.vt.edu/~cs3114/Fall10/Notes/Supplemental/p17-cichelli.pdf> ;)
Since you have a fixed set of codes, you can spend a little time finding a >>> fast hash that converts this into a single byte value with zero collisions, >>> at which point the hash entry is your country code?
I've tried I don't know how many variations, but nothing has come close, and >> although it will be slower than hashing, it might well be that a simple lookup
of the first three of four most common countries/nationalities (D/B/LT/GB) >> might be a decent compromise, I did use this technique at a Belgian employer >> where the three most common nationalities were B, NL, and F, accounting for >> more than 90% of the insured. The best I got out of GPERF was a table of 425 >> entries, but the email from 2016 didn't include the hash function. :( But
being someone who rarely, if ever, deletes something, it might still be around.
Try something like c[0]+c[1]*2+c[2]*4 as a first attempt, then mix in a few >>> XORs or rotations instead of those SHL only ops.Which would handsomely beat a binary lookup...
It should be doable in well under 10 cycles and maybe under 5...
If a near-perfect hash is too slow or hard to find, then you could allow a >>> fixed maximal number of collisions (like max two entries) and scan those, but
that needs room for the original values in the hash table.
I've got the code that Borland used in TP (4?/5?6/7) to do hashing, and that >> should be (relatively) easy to adapt.
Robert
On 2023-02-15 21:15, Terje Mathisen wrote:
Can you please post the full list? comma, space or newline separated?
(sign: 'A '; cnty: 'Austria ' ),
(sign: 'AFG '; cnty: 'Afghanistan ' ),
(sign: 'AG '; cnty: 'Antigua and Barbuda ' ),
(sign: 'AL '; cnty: 'Albania ' ),
(sign: 'AM '; cnty: 'Armenia ' ),
(sign: 'AND '; cnty: 'Andorra ' ),
(sign: 'ANG '; cnty: 'Angola ' ),
(sign: 'ARK '; cnty: 'Antarctica ' ),
(sign: 'ARU '; cnty: 'Aruba ' ),
(sign: 'AS '; cnty: 'Asturias ' ),
(sign: 'AUA '; cnty: 'Aruba ' ),
(sign: 'AUS '; cnty: 'Australia ' ),
(sign: 'AX '; cnty: 'Aland Islands ' ),
(sign: 'AXA '; cnty: 'Anguilla ' ),
(sign: 'AZ '; cnty: 'Azerbaijan ' ),
(sign: 'B '; cnty: 'Belgium ' ),
(sign: 'BD '; cnty: 'Bangladesh ' ),
(sign: 'BDS '; cnty: 'Barbados ' ),
(sign: 'BF '; cnty: 'Burkina Faso ' ),
(sign: 'BG '; cnty: 'Bulgaria ' ),
(sign: 'BH '; cnty: 'Belize ' ),
(sign: 'BHT '; cnty: 'Bhutan ' ),
(sign: 'BIH '; cnty: 'Bosnia and Herzegovina ' ),
(sign: 'BOL '; cnty: 'Bolivia ' ),
(sign: 'BR '; cnty: 'Brazil ' ),
(sign: 'BRN '; cnty: 'Bahrain ' ),
(sign: 'BRU '; cnty: 'Brunei ' ),
(sign: 'BS '; cnty: 'Bahamas ' ),
(sign: 'BUR '; cnty: 'Myanmar ' ),
(sign: 'BVI '; cnty: 'British Virgin Islands ' ),
(sign: 'BW '; cnty: 'Botswana ' ),
(sign: 'BY '; cnty: 'Belarus ' ),
(sign: 'BZ '; cnty: 'Belize ' ),
(sign: 'C '; cnty: 'Cuba ' ),
(sign: 'CAM '; cnty: 'Cameroon ' ),
(sign: 'CDN '; cnty: 'Canada ' ),
(sign: 'CGO '; cnty: 'Democratic Republic of the Congo ' ),
(sign: 'CH '; cnty: 'Switzerland ' ),
(sign: 'CHN '; cnty: 'People''s Republic of China ' ),
(sign: 'CI '; cnty: 'Ivory Coast (Cote d''Ivoire) ' ),
(sign: 'CL '; cnty: 'Sri Lanka ' ),
(sign: 'CO '; cnty: 'Colombia ' ),
(sign: 'COM '; cnty: 'Comoros ' ),
(sign: 'CR '; cnty: 'Costa Rica ' ),
(sign: 'CV '; cnty: 'Cape Verde ' ),
(sign: 'CY '; cnty: 'Cyprus ' ),
(sign: 'CYM '; cnty: 'Wales ' ),
(sign: 'CZ '; cnty: 'Czech Republic ' ),
(sign: 'D '; cnty: 'Germany ' ),
(sign: 'DJI '; cnty: 'Djibouti ' ),
(sign: 'DK '; cnty: 'Denmark ' ),
(sign: 'DOM '; cnty: 'Dominican Republic ' ),
(sign: 'DY '; cnty: 'Benin ' ),
(sign: 'DZ '; cnty: 'Algeria ' ),
(sign: 'E '; cnty: 'Spain ' ),
(sign: 'EAK '; cnty: 'Kenya ' ),
(sign: 'EAT '; cnty: 'Tanzania ' ),
(sign: 'EAU '; cnty: 'Uganda ' ),
(sign: 'EAZ '; cnty: 'Zanzibar ' ),
(sign: 'EC '; cnty: 'Ecuador ' ),
(sign: 'ENG '; cnty: 'England ' ),
(sign: 'ER '; cnty: 'Eritrea ' ),
(sign: 'ES '; cnty: 'El Salvador ' ),
(sign: 'EST '; cnty: 'Estonia ' ),
(sign: 'ET '; cnty: 'Egypt ' ),
(sign: 'ETH '; cnty: 'Ethiopia ' ),
(sign: 'F '; cnty: 'France ' ),
(sign: 'FIN '; cnty: 'Finland ' ),
(sign: 'FJI '; cnty: 'Fiji ' ),
(sign: 'FL '; cnty: 'Liechtenstein ' ),
(sign: 'FO '; cnty: 'Faroe Islands ' ),
(sign: 'FSM '; cnty: 'Federated States of Micronesia ' ),
(sign: 'G '; cnty: 'Gabon ' ),
(sign: 'GB '; cnty: 'United Kingdom (UK) ' ),
(sign: 'GBA '; cnty: 'Alderney ' ),
(sign: 'GBG '; cnty: 'Guernsey ' ),
(sign: 'GBJ '; cnty: 'Jersey ' ),
(sign: 'GBM '; cnty: 'Isle of Man ' ),
(sign: 'GBZ '; cnty: 'Gibraltar ' ),
(sign: 'GCA '; cnty: 'Guatemala ' ),
(sign: 'GE '; cnty: 'Georgia ' ),
(sign: 'GH '; cnty: 'Ghana ' ),
(sign: 'GQ '; cnty: 'Equatorial Guinea ' ),
(sign: 'GR '; cnty: 'Greece ' ),
(sign: 'GUY '; cnty: 'Guyana ' ),
(sign: 'GW '; cnty: 'Guinea-Bissau ' ),
(sign: 'H '; cnty: 'Hungary ' ),
(sign: 'HK '; cnty: 'Hong Kong ' ),
(sign: 'HKJ '; cnty: 'Jordan ' ),
(sign: 'HN '; cnty: 'Honduras ' ),
(sign: 'HR '; cnty: 'Croatia ' ),
(sign: 'I '; cnty: 'Italy ' ),
(sign: 'IL '; cnty: 'Israel ' ),
(sign: 'IND '; cnty: 'India ' ),
(sign: 'IR '; cnty: 'Iran ' ),
(sign: 'IRL '; cnty: 'Ireland ' ),
(sign: 'IRQ '; cnty: 'Iraq ' ),
(sign: 'IS '; cnty: 'Iceland ' ),
(sign: 'J '; cnty: 'Japan ' ),
(sign: 'JA '; cnty: 'Jamaica ' ),
(sign: 'K '; cnty: 'Cambodia ' ),
(sign: 'KAN '; cnty: 'Saint Kitts and Nevis ' ),
(sign: 'KGZ '; cnty: 'Kyrgyzstan ' ),
(sign: 'KIR '; cnty: 'Kiribati ' ),
(sign: 'KN '; cnty: 'Greenland ' ),
(sign: 'KP '; cnty: 'Democratic People''s Republic of Korea ' ),
(sign: 'KS '; cnty: 'Kyrgyzstan ' ),
(sign: 'KSA '; cnty: 'Saudi Arabia ' ),
(sign: 'KWT '; cnty: 'Kuwait ' ),
(sign: 'KZ '; cnty: 'Kazakhstan ' ),
(sign: 'L '; cnty: 'Luxembourg ' ),
(sign: 'LAO '; cnty: 'Laos ' ),
(sign: 'LAR '; cnty: 'Libya ' ),
(sign: 'LB '; cnty: 'Liberia ' ),
(sign: 'LS '; cnty: 'Lesotho ' ),
(sign: 'LT '; cnty: 'Lithuania ' ),
(sign: 'LV '; cnty: 'Latvia ' ),
(sign: 'M '; cnty: 'Malta ' ),
(sign: 'MA '; cnty: 'Morocco ' ),
(sign: 'MAL '; cnty: 'Malaysia ' ),
(sign: 'MC '; cnty: 'Monaco ' ),
(sign: 'MD '; cnty: 'Moldova ' ),
(sign: 'MEX '; cnty: 'Mexico ' ),
(sign: 'MGL '; cnty: 'Mongolia ' ),
(sign: 'MH '; cnty: 'Marshall Islands ' ),
(sign: 'MK '; cnty: 'Macedonia (NMK) ' ),
(sign: 'MNE '; cnty: 'Montenegro ' ),
(sign: 'MO '; cnty: 'Macau ' ),
(sign: 'MOC '; cnty: 'Mozambique ' ),
(sign: 'MS '; cnty: 'Mauritius ' ),
(sign: 'MV '; cnty: 'Maldives ' ),
(sign: 'MW '; cnty: 'Malawi ' ),
(sign: 'N '; cnty: 'Norway ' ),
(sign: 'NA '; cnty: 'Netherlands Antilles ' ),
(sign: 'NAM '; cnty: 'Namibia ' ),
(sign: 'NAU '; cnty: 'Nauru ' ),
(sign: 'NC '; cnty: 'New Caledonia ' ),
(sign: 'NEP '; cnty: 'Nepal ' ),
(sign: 'NI '; cnty: 'Northern Ireland ' ),
(sign: 'NIC '; cnty: 'Nicaragua ' ),
(sign: 'NL '; cnty: 'Netherlands ' ),
(sign: 'NMK '; cnty: 'North Macedonia ' ),
(sign: 'NZ '; cnty: 'New Zealand ' ),
(sign: 'OM '; cnty: 'Oman ' ),
(sign: 'P '; cnty: 'Portugal ' ),
(sign: 'PA '; cnty: 'Panama ' ),
(sign: 'PAL '; cnty: 'Palau ' ),
(sign: 'PE '; cnty: 'Peru ' ),
(sign: 'PK '; cnty: 'Pakistan ' ),
(sign: 'PL '; cnty: 'Poland ' ),
(sign: 'PMR '; cnty: 'Transnistria ' ),
(sign: 'PNG '; cnty: 'Papua New Guinea ' ),
(sign: 'PR '; cnty: 'Puerto Rico ' ),
(sign: 'PS '; cnty: 'Palestine ' ),
(sign: 'PY '; cnty: 'Paraguay ' ),
(sign: 'Q '; cnty: 'Qatar ' ),
(sign: 'RA '; cnty: 'Argentina ' ),
(sign: 'RB '; cnty: 'Botswana ' ),
(sign: 'RC '; cnty: 'Republic of China (Taiwan) ' ),
(sign: 'RCA '; cnty: 'Central African Republic ' ),
(sign: 'RCB '; cnty: 'Republic of the Congo ' ),
(sign: 'RCH '; cnty: 'Chile ' ),
(sign: 'RG '; cnty: 'Guinea ' ),
(sign: 'RGB '; cnty: 'Guinea-Bissau ' ),
(sign: 'RH '; cnty: 'Haiti ' ),
(sign: 'RI '; cnty: 'Indonesia ' ),
(sign: 'RIM '; cnty: 'Mauritania ' ),
(sign: 'RKS '; cnty: 'Kosovo ' ),
(sign: 'RL '; cnty: 'Lebanon ' ),
(sign: 'RM '; cnty: 'Madagascar ' ),
(sign: 'RMM '; cnty: 'Mali ' ),
(sign: 'RN '; cnty: 'Niger ' ),
(sign: 'RO '; cnty: 'Romania ' ),
(sign: 'ROK '; cnty: 'Republic of Korea ' ),
(sign: 'ROU '; cnty: 'Uruguay (UY) ' ),
(sign: 'RP '; cnty: 'Philippines ' ),
(sign: 'RSM '; cnty: 'San Marino ' ),
(sign: 'RU '; cnty: 'Burundi ' ),
(sign: 'RUS '; cnty: 'Russia ' ),
(sign: 'RWA '; cnty: 'Rwanda ' ),
(sign: 'S '; cnty: 'Sweden ' ),
(sign: 'SCO '; cnty: 'Scotland ' ),
(sign: 'SCV '; cnty: 'Vatican City ' ),
(sign: 'SD '; cnty: 'Swaziland ' ),
(sign: 'SF '; cnty: 'Finland (FIN) ' ),
(sign: 'SGP '; cnty: 'Singapore ' ),
(sign: 'SK '; cnty: 'Slovakia ' ),
(sign: 'SLE '; cnty: 'Sierra Leone ' ),
(sign: 'SLO '; cnty: 'Slovenia ' ),
(sign: 'SME '; cnty: 'Suriname ' ),
(sign: 'SMO '; cnty: 'Sovereign Military Order of Malta ' ),
(sign: 'SN '; cnty: 'Senegal ' ),
(sign: 'SO '; cnty: 'Somalia ' ),
(sign: 'SOL '; cnty: 'Solomon Islands ' ),
(sign: 'SRB '; cnty: 'Serbia ' ),
(sign: 'STP '; cnty: 'S o Tom‚ and Pr¡ncipe ' ), (sign: 'SUD '; cnty: 'Sudan ' ),
(sign: 'SY '; cnty: 'Seychelles ' ),
(sign: 'SYR '; cnty: 'Syria ' ),
(sign: 'T '; cnty: 'Thailand ' ),
(sign: 'TCH '; cnty: 'Chad ' ),
(sign: 'TD '; cnty: 'Chad ' ),
(sign: 'TG '; cnty: 'Togo ' ),
(sign: 'TJ '; cnty: 'Tajikistan ' ),
(sign: 'TL '; cnty: 'Timor-Leste ' ),
(sign: 'TM '; cnty: 'Turkmenistan ' ),
(sign: 'TN '; cnty: 'Tunisia ' ),
(sign: 'TO '; cnty: 'Tonga ' ),
(sign: 'TR '; cnty: 'Turkey ' ),
(sign: 'TT '; cnty: 'Trinidad and Tobago ' ),
(sign: 'TUV '; cnty: 'Tuvalu ' ),
(sign: 'UA '; cnty: 'Ukraine ' ),
(sign: 'UAE '; cnty: 'United Arab Emirates ' ),
(sign: 'USA '; cnty: 'United States ' ),
(sign: 'UY '; cnty: 'Uruguay ' ),
(sign: 'UZ '; cnty: 'Uzbekistan ' ),
(sign: 'V '; cnty: 'Vatican City ' ),
(sign: 'VN '; cnty: 'Vietnam ' ),
(sign: 'VU '; cnty: 'Vanuatu ' ),
(sign: 'WAG '; cnty: 'Gambia ' ),
(sign: 'WAL '; cnty: 'Sierra Leone ' ),
(sign: 'WAN '; cnty: 'Nigeria ' ),
(sign: 'WD '; cnty: 'Dominica ' ),
(sign: 'WG '; cnty: 'Grenada ' ),
(sign: 'WL '; cnty: 'Saint Lucia ' ),
(sign: 'WS '; cnty: 'Samoa ' ),
(sign: 'WSA '; cnty: 'Western Sahara ' ),
(sign: 'WV '; cnty: 'Saint Vincent and the Grenadines ' ),
(sign: 'YAR '; cnty: 'Yemen ' ),
(sign: 'YU '; cnty: 'Yugoslavia ' ),
(sign: 'YV '; cnty: 'Venezuela ' ),
(sign: 'Z '; cnty: 'Zambia ' ),
(sign: 'ZA '; cnty: 'South Africa ' ),
(sign: 'ZW '; cnty: 'Zimbabwe ' ),
And as you can see, there are multiple entries for four countries,
GB/UK,
MK/NMK,
ROU/UY, and
SF/FIN
which have decided to change their country codes after I had a first
ride with a national of them, the few other users of my program might be using the new codes, but as long as only one of the two is used,it
should not matter, but returning the same hash for both would be magic.
The table should possibly also have included DDR, SU to name just two countries that no longer exist. Are there more, don't really know...
Robert
Robert, are you intentionally trying to annoy me? :-)
My own N, Norway is missing!
I'll take a look. In the meantime, have you considered an extremely simple single-entry cache: Last country used?
I am very confident that it is likely that any given user will tend to pick up
subsequent rides from another car in the same area, right?
If you want to be really fancy then you back that single entry up with 3 or 7 more most recently used, pushing the oldest out when you find a new country. You
resort on each new country
Robert Prins wrote:
On 2023-02-15 21:15, Terje Mathisen wrote:
Can you please post the full list? comma, space or newline separated?
(sign: 'A '; cnty: 'Austria ' ),
(sign: 'AFG '; cnty: 'Afghanistan ' ),
(sign: 'AG '; cnty: 'Antigua and Barbuda ' ),
(sign: 'AL '; cnty: 'Albania ' ),
(sign: 'AM '; cnty: 'Armenia ' ),
(sign: 'AND '; cnty: 'Andorra ' ),
(sign: 'ANG '; cnty: 'Angola ' ),
(sign: 'ARK '; cnty: 'Antarctica ' ),
(sign: 'ARU '; cnty: 'Aruba ' ),
(sign: 'AS '; cnty: 'Asturias ' ),
(sign: 'AUA '; cnty: 'Aruba ' ),
(sign: 'AUS '; cnty: 'Australia ' ),
(sign: 'AX '; cnty: 'ÂAland Islands ' ),
(sign: 'AXA '; cnty: 'Anguilla ' ),
(sign: 'AZ '; cnty: 'Azerbaijan ' ),
(sign: 'B '; cnty: 'Belgium ' ),
(sign: 'BD '; cnty: 'Bangladesh ' ),
(sign: 'BDS '; cnty: 'Barbados ' ),
(sign: 'BF '; cnty: 'Burkina Faso ' ),
(sign: 'BG '; cnty: 'Bulgaria ' ),
(sign: 'BH '; cnty: 'Belize ' ),
(sign: 'BHT '; cnty: 'Bhutan ' ),
(sign: 'BIH '; cnty: 'Bosnia and Herzegovina ' ),
(sign: 'BOL '; cnty: 'Bolivia ' ),
(sign: 'BR '; cnty: 'Brazil ' ),
(sign: 'BRN '; cnty: 'Bahrain ' ),
(sign: 'BRU '; cnty: 'Brunei ' ),
(sign: 'BS '; cnty: 'Bahamas ' ),
(sign: 'BUR '; cnty: 'Myanmar ' ),
(sign: 'BVI '; cnty: 'British Virgin Islands ' ),
(sign: 'BW '; cnty: 'Botswana ' ),
(sign: 'BY '; cnty: 'Belarus ' ),
(sign: 'BZ '; cnty: 'Belize ' ),
(sign: 'C '; cnty: 'Cuba ' ),
(sign: 'CAM '; cnty: 'Cameroon ' ),
(sign: 'CDN '; cnty: 'Canada ' ),
(sign: 'CGO '; cnty: 'Democratic Republic of the Congo ' ),
(sign: 'CH '; cnty: 'Switzerland ' ),
(sign: 'CHN '; cnty: 'People''s Republic of China ' ),
(sign: 'CI '; cnty: 'Ivory Coast (Cote d''Ivoire) ' ),
(sign: 'CL '; cnty: 'Sri Lanka ' ),
(sign: 'CO '; cnty: 'Colombia ' ),
(sign: 'COM '; cnty: 'Comoros ' ),
(sign: 'CR '; cnty: 'Costa Rica ' ),
(sign: 'CV '; cnty: 'Cape Verde ' ),
(sign: 'CY '; cnty: 'Cyprus ' ),
(sign: 'CYM '; cnty: 'Wales ' ),
(sign: 'CZ '; cnty: 'Czech Republic ' ),
(sign: 'D '; cnty: 'Germany ' ),
(sign: 'DJI '; cnty: 'Djibouti ' ),
(sign: 'DK '; cnty: 'Denmark ' ),
(sign: 'DOM '; cnty: 'Dominican Republic ' ),
(sign: 'DY '; cnty: 'Benin ' ),
(sign: 'DZ '; cnty: 'Algeria ' ),
(sign: 'E '; cnty: 'Spain ' ),
(sign: 'EAK '; cnty: 'Kenya ' ),
(sign: 'EAT '; cnty: 'Tanzania ' ),
(sign: 'EAU '; cnty: 'Uganda ' ),
(sign: 'EAZ '; cnty: 'Zanzibar ' ),
(sign: 'EC '; cnty: 'Ecuador ' ),
(sign: 'ENG '; cnty: 'England ' ),
(sign: 'ER '; cnty: 'Eritrea ' ),
(sign: 'ES '; cnty: 'El Salvador ' ),
(sign: 'EST '; cnty: 'Estonia ' ),
(sign: 'ET '; cnty: 'Egypt ' ),
(sign: 'ETH '; cnty: 'Ethiopia ' ),
(sign: 'F '; cnty: 'France ' ),
(sign: 'FIN '; cnty: 'Finland ' ),
(sign: 'FJI '; cnty: 'Fiji ' ),
(sign: 'FL '; cnty: 'Liechtenstein ' ),
(sign: 'FO '; cnty: 'Faroe Islands ' ),
(sign: 'FSM '; cnty: 'Federated States of Micronesia ' ),
(sign: 'G '; cnty: 'Gabon ' ),
(sign: 'GB '; cnty: 'United Kingdom (UK) ' ),
(sign: 'GBA '; cnty: 'Alderney ' ),
(sign: 'GBG '; cnty: 'Guernsey ' ),
(sign: 'GBJ '; cnty: 'Jersey ' ),
(sign: 'GBM '; cnty: 'Isle of Man ' ),
(sign: 'GBZ '; cnty: 'Gibraltar ' ),
(sign: 'GCA '; cnty: 'Guatemala ' ),
(sign: 'GE '; cnty: 'Georgia ' ),
(sign: 'GH '; cnty: 'Ghana ' ),
(sign: 'GQ '; cnty: 'Equatorial Guinea ' ),
(sign: 'GR '; cnty: 'Greece ' ),
(sign: 'GUY '; cnty: 'Guyana ' ),
(sign: 'GW '; cnty: 'Guinea-Bissau ' ),
(sign: 'H '; cnty: 'Hungary ' ),
(sign: 'HK '; cnty: 'Hong Kong ' ),
(sign: 'HKJ '; cnty: 'Jordan ' ),
(sign: 'HN '; cnty: 'Honduras ' ),
(sign: 'HR '; cnty: 'Croatia ' ),
(sign: 'I '; cnty: 'Italy ' ),
(sign: 'IL '; cnty: 'Israel ' ),
(sign: 'IND '; cnty: 'India ' ),
(sign: 'IR '; cnty: 'Iran ' ),
(sign: 'IRL '; cnty: 'Ireland ' ),
(sign: 'IRQ '; cnty: 'Iraq ' ),
(sign: 'IS '; cnty: 'Iceland ' ),
(sign: 'J '; cnty: 'Japan ' ),
(sign: 'JA '; cnty: 'Jamaica ' ),
(sign: 'K '; cnty: 'Cambodia ' ),
(sign: 'KAN '; cnty: 'Saint Kitts and Nevis ' ),
(sign: 'KGZ '; cnty: 'Kyrgyzstan ' ),
(sign: 'KIR '; cnty: 'Kiribati ' ),
(sign: 'KN '; cnty: 'Greenland ' ),
(sign: 'KP '; cnty: 'Democratic People''s Republic of Korea ' ), >> (sign: 'KS '; cnty: 'Kyrgyzstan ' ),
(sign: 'KSA '; cnty: 'Saudi Arabia ' ),
(sign: 'KWT '; cnty: 'Kuwait ' ),
(sign: 'KZ '; cnty: 'Kazakhstan ' ),
(sign: 'L '; cnty: 'Luxembourg ' ),
(sign: 'LAO '; cnty: 'Laos ' ),
(sign: 'LAR '; cnty: 'Libya ' ),
(sign: 'LB '; cnty: 'Liberia ' ),
(sign: 'LS '; cnty: 'Lesotho ' ),
(sign: 'LT '; cnty: 'Lithuania ' ),
(sign: 'LV '; cnty: 'Latvia ' ),
(sign: 'M '; cnty: 'Malta ' ),
(sign: 'MA '; cnty: 'Morocco ' ),
(sign: 'MAL '; cnty: 'Malaysia ' ),
(sign: 'MC '; cnty: 'Monaco ' ),
(sign: 'MD '; cnty: 'Moldova ' ),
(sign: 'MEX '; cnty: 'Mexico ' ),
(sign: 'MGL '; cnty: 'Mongolia ' ),
(sign: 'MH '; cnty: 'Marshall Islands ' ),
(sign: 'MK '; cnty: 'Macedonia (NMK) ' ),
(sign: 'MNE '; cnty: 'Montenegro ' ),
(sign: 'MO '; cnty: 'Macau ' ),
(sign: 'MOC '; cnty: 'Mozambique ' ),
(sign: 'MS '; cnty: 'Mauritius ' ),
(sign: 'MV '; cnty: 'Maldives ' ),
(sign: 'MW '; cnty: 'Malawi ' ),
(sign: 'N '; cnty: 'Norway ' ),
(sign: 'NA '; cnty: 'Netherlands Antilles ' ),
(sign: 'NAM '; cnty: 'Namibia ' ),
(sign: 'NAU '; cnty: 'Nauru ' ),
(sign: 'NC '; cnty: 'New Caledonia ' ),
(sign: 'NEP '; cnty: 'Nepal ' ),
(sign: 'NI '; cnty: 'Northern Ireland ' ),
(sign: 'NIC '; cnty: 'Nicaragua ' ),
(sign: 'NL '; cnty: 'Netherlands ' ),
(sign: 'NMK '; cnty: 'North Macedonia ' ),
(sign: 'NZ '; cnty: 'New Zealand ' ),
(sign: 'OM '; cnty: 'Oman ' ),
(sign: 'P '; cnty: 'Portugal ' ),
(sign: 'PA '; cnty: 'Panama ' ),
(sign: 'PAL '; cnty: 'Palau ' ),
(sign: 'PE '; cnty: 'Peru ' ),
(sign: 'PK '; cnty: 'Pakistan ' ),
(sign: 'PL '; cnty: 'Poland ' ),
(sign: 'PMR '; cnty: 'Transnistria ' ),
(sign: 'PNG '; cnty: 'Papua New Guinea ' ),
(sign: 'PR '; cnty: 'Puerto Rico ' ),
(sign: 'PS '; cnty: 'Palestine ' ),
(sign: 'PY '; cnty: 'Paraguay ' ),
(sign: 'Q '; cnty: 'Qatar ' ),
(sign: 'RA '; cnty: 'Argentina ' ),
(sign: 'RB '; cnty: 'Botswana ' ),
(sign: 'RC '; cnty: 'Republic of China (Taiwan) ' ),
(sign: 'RCA '; cnty: 'Central African Republic ' ),
(sign: 'RCB '; cnty: 'Republic of the Congo ' ),
(sign: 'RCH '; cnty: 'Chile ' ),
(sign: 'RG '; cnty: 'Guinea ' ),
(sign: 'RGB '; cnty: 'Guinea-Bissau ' ),
(sign: 'RH '; cnty: 'Haiti ' ),
(sign: 'RI '; cnty: 'Indonesia ' ),
(sign: 'RIM '; cnty: 'Mauritania ' ),
(sign: 'RKS '; cnty: 'Kosovo ' ),
(sign: 'RL '; cnty: 'Lebanon ' ),
(sign: 'RM '; cnty: 'Madagascar ' ),
(sign: 'RMM '; cnty: 'Mali ' ),
(sign: 'RN '; cnty: 'Niger ' ),
(sign: 'RO '; cnty: 'Romania ' ),
(sign: 'ROK '; cnty: 'Republic of Korea ' ),
(sign: 'ROU '; cnty: 'Uruguay (UY) ' ),
(sign: 'RP '; cnty: 'Philippines ' ),
(sign: 'RSM '; cnty: 'San Marino ' ),
(sign: 'RU '; cnty: 'Burundi ' ),
(sign: 'RUS '; cnty: 'Russia ' ),
(sign: 'RWA '; cnty: 'Rwanda ' ),
(sign: 'S '; cnty: 'Sweden ' ),
(sign: 'SCO '; cnty: 'Scotland ' ),
(sign: 'SCV '; cnty: 'Vatican City ' ),
(sign: 'SD '; cnty: 'Swaziland ' ),
(sign: 'SF '; cnty: 'Finland (FIN) ' ),
(sign: 'SGP '; cnty: 'Singapore ' ),
(sign: 'SK '; cnty: 'Slovakia ' ),
(sign: 'SLE '; cnty: 'Sierra Leone ' ),
(sign: 'SLO '; cnty: 'Slovenia ' ),
(sign: 'SME '; cnty: 'Suriname ' ),
(sign: 'SMO '; cnty: 'Sovereign Military Order of Malta ' ), >> (sign: 'SN '; cnty: 'Senegal ' ),
(sign: 'SO '; cnty: 'Somalia ' ),
(sign: 'SOL '; cnty: 'Solomon Islands ' ),
(sign: 'SRB '; cnty: 'Serbia ' ),
(sign: 'STP '; cnty: 'S o Tom‚ and Pr¡ncipe ' ),
(sign: 'SUD '; cnty: 'Sudan ' ),
(sign: 'SY '; cnty: 'Seychelles ' ),
(sign: 'SYR '; cnty: 'Syria ' ),
(sign: 'T '; cnty: 'Thailand ' ),
(sign: 'TCH '; cnty: 'Chad ' ),
(sign: 'TD '; cnty: 'Chad ' ),
(sign: 'TG '; cnty: 'Togo ' ),
(sign: 'TJ '; cnty: 'Tajikistan ' ),
(sign: 'TL '; cnty: 'Timor-Leste ' ),
(sign: 'TM '; cnty: 'Turkmenistan ' ),
(sign: 'TN '; cnty: 'Tunisia ' ),
(sign: 'TO '; cnty: 'Tonga ' ),
(sign: 'TR '; cnty: 'Turkey ' ),
(sign: 'TT '; cnty: 'Trinidad and Tobago ' ),
(sign: 'TUV '; cnty: 'Tuvalu ' ),
(sign: 'UA '; cnty: 'Ukraine ' ),
(sign: 'UAE '; cnty: 'United Arab Emirates ' ),
(sign: 'USA '; cnty: 'United States ' ),
(sign: 'UY '; cnty: 'Uruguay ' ),
(sign: 'UZ '; cnty: 'Uzbekistan ' ),
(sign: 'V '; cnty: 'Vatican City ' ),
(sign: 'VN '; cnty: 'Vietnam ' ),
(sign: 'VU '; cnty: 'Vanuatu ' ),
(sign: 'WAG '; cnty: 'Gambia ' ),
(sign: 'WAL '; cnty: 'Sierra Leone ' ),
(sign: 'WAN '; cnty: 'Nigeria ' ),
(sign: 'WD '; cnty: 'Dominica ' ),
(sign: 'WG '; cnty: 'Grenada ' ),
(sign: 'WL '; cnty: 'Saint Lucia ' ),
(sign: 'WS '; cnty: 'Samoa ' ),
(sign: 'WSA '; cnty: 'Western Sahara ' ),
(sign: 'WV '; cnty: 'Saint Vincent and the Grenadines ' ),
(sign: 'YAR '; cnty: 'Yemen ' ),
(sign: 'YU '; cnty: 'Yugoslavia ' ),
(sign: 'YV '; cnty: 'Venezuela ' ),
(sign: 'Z '; cnty: 'Zambia ' ),
(sign: 'ZA '; cnty: 'South Africa ' ),
(sign: 'ZW '; cnty: 'Zimbabwe ' ),
And as you can see, there are multiple entries for four countries,
GB/UK,
MK/NMK,
ROU/UY, and
SF/FIN
which have decided to change their country codes after I had a first ride with
a national of them, the few other users of my program might be using the new >> codes, but as long as only one of the two is used,it should not matter, but >> returning the same hash for both would be magic. The table should possibly >> also have included DDR, SU to name just two countries that no longer exist. >> Are there more, don't really know...
Robert
On 2023-02-23 20:59, Terje Mathisen wrote:
Robert, are you intentionally trying to annoy me? :-)
My own N, Norway is missing!
Please look again! My program would crash, in 1982 I hitched all the way
up to Nordkapp, that wonderful place where people only go to to wrongly
tell their friends that they've been at the northernmost point of
Europe. ;)
I'll take a look. In the meantime, have you considered an extremely
simple single-entry cache: Last country used?
That might very well be the best solution
I am very confident that it is likely that any given user will tend to
pick up subsequent rides from another car in the same area, right?
That's certainly true
If you want to be really fancy then you back that single entry up with
3 or 7 more most recently used, pushing the oldest out when you find a
new country. You resort on each new country
That's what I used in 1996 while working for KLM while building a
prototype payload prediction system (in PL/I). There I didn't sort, but
just added a counter and threw out the item with the lowest value, and
moved the new value into its location, and, obviously, in location 0 of
the array.
In this case the extra overhead might be more than doing the binary
lookup? Probably the only way to find out would be to do something in
pure DOS using the full set of data that's now going into the binary
lookup, an extra "writeln()" should produce it, or I could use the
"COUNT" option of the old z/OS PL/I compiler to give me the, what else, counts of executed PL/I statements, which would give at least a rough
idea as to what's the way to go.
I'll give it a try, once I've managed to figure out what I'm doing wrong
with this mainframe PL/I version. Give me some time, family things are
right now sadly taking up rather a lot of time, and don't spoil me with
a, if there's one, ready made solution!
Robert
Robert Prins wrote:
On 2023-02-23 20:59, Terje Mathisen wrote:
Robert, are you intentionally trying to annoy me? :-)
My own N, Norway is missing!
Please look again! My program would crash, in 1982 I hitched all the way up to
Nordkapp, that wonderful place where people only go to to wrongly tell their >> friends that they've been at the northernmost point of Europe. ;)
Yeah, by my guess at least 99.99+% of all visitors to Nordkapp think they have
been to said northernmost point in Europe:
If you want "continental Europe", then the old ferry, now undersea tunnel to the
Magerøya island would disqualify Nordkapp. If you otoh allow this, like I would
do as well since there is a small city on the island, then your map should show
you the 3-5 hour hike needed to get from the main road to the real tip of the island:
https://mapant.no/?#7/71.184/25.674 Knivskjærodden
You could also make a very good case forAnd after this touristic break, let's return to the lookup ;)
https://mapant.no/?#8/71.130/27.648 Kinnarodden which is on the actual mainland,
just a bit further east and slightly longer south.
The difference of 0.054 degrees corresponds to about 6 km.
I'll take a look. In the meantime, have you considered an extremely simple >>> single-entry cache: Last country used?
That might very well be the best solution
I am very confident that it is likely that any given user will tend to pick >>> up subsequent rides from another car in the same area, right?
That's certainly true
If you want to be really fancy then you back that single entry up with 3 or 7
more most recently used, pushing the oldest out when you find a new country.
You resort on each new country
That's what I used in 1996 while working for KLM while building a prototype >> payload prediction system (in PL/I). There I didn't sort, but just added a >> counter and threw out the item with the lowest value, and moved the new value
into its location, and, obviously, in location 0 of the array.
In this case the extra overhead might be more than doing the binary lookup? >> Probably the only way to find out would be to do something in pure DOS using >> the full set of data that's now going into the binary lookup, an extra
"writeln()" should produce it, or I could use the "COUNT" option of the old >> z/OS PL/I compiler to give me the, what else, counts of executed PL/I
statements, which would give at least a rough idea as to what's the way to go.
I'll give it a try, once I've managed to figure out what I'm doing wrong with
this mainframe PL/I version. Give me some time, family things are right now >> sadly taking up rather a lot of time, and don't spoil me with a, if there's >> one, ready made solution!
Robert
On 2023-02-25 09:33, Terje Mathisen wrote:
Robert Prins wrote:
On 2023-02-23 20:59, Terje Mathisen wrote:
Robert, are you intentionally trying to annoy me? :-)
My own N, Norway is missing!
Please look again! My program would crash, in 1982 I hitched all the way up >>> to Nordkapp, that wonderful place where people only go to to wrongly tell >>> their friends that they've been at the northernmost point of Europe. ;)
Yeah, by my guess at least 99.99+% of all visitors to Nordkapp think they have
been to said northernmost point in Europe:
If you want "continental Europe", then the old ferry, now undersea tunnel to >> the Magerøya island would disqualify Nordkapp. If you otoh allow this, like I
would do as well since there is a small city on the island, then your map
should show you the 3-5 hour hike needed to get from the main road to the real
tip of the island:
https://mapant.no/?#7/71.184/25.674 Knivskjærodden
I know, although didn't in 1982, but that year hiking would have been all but impossible, as the place was still covered in a thick layer of snow, even though
I got there on 20 June and left on 21 June, without having seen the midsummer sun, as the place was also covered in dense fog. :(
You could also make a very good case forAnd after this touristic break, let's return to the lookup ;)
https://mapant.no/?#8/71.130/27.648 Kinnarodden which is on the actual
mainland, just a bit further east and slightly longer south.
The difference of 0.054 degrees corresponds to about 6 km.
Here's what comes out of the current PL/I version, where I actually cache the last country/nationality, and still get a 90% cache-miss, ouch.
Anyway, I'll double(!) the cache, and see what difference that makes. ;)
On 2023-02-25 13:21, Robert Prins wrote:
On 2023-02-25 09:33, Terje Mathisen wrote:
Robert Prins wrote:
On 2023-02-23 20:59, Terje Mathisen wrote:
Robert, are you intentionally trying to annoy me? :-)
My own N, Norway is missing!
Please look again! My program would crash, in 1982 I hitched all the
way up to Nordkapp, that wonderful place where people only go to to
wrongly tell their friends that they've been at the northernmost
point of Europe. ;)
Yeah, by my guess at least 99.99+% of all visitors to Nordkapp think
they have been to said northernmost point in Europe:
If you want "continental Europe", then the old ferry, now undersea
tunnel to the Magerøya island would disqualify Nordkapp. If you otoh
allow this, like I would do as well since there is a small city on
the island, then your map should show you the 3-5 hour hike needed to
get from the main road to the real tip of the island:
https://mapant.no/?#7/71.184/25.674 Knivskjærodden
I know, although didn't in 1982, but that year hiking would have been
all but impossible, as the place was still covered in a thick layer of
snow, even though I got there on 20 June and left on 21 June, without
having seen the midsummer sun, as the place was also covered in dense
fog. :(
You could also make a very good case forAnd after this touristic break, let's return to the lookup ;)
https://mapant.no/?#8/71.130/27.648 Kinnarodden which is on the
actual mainland, just a bit further east and slightly longer south.
The difference of 0.054 degrees corresponds to about 6 km.
Here's what comes out of the current PL/I version, where I actually
cache the last country/nationality, and still get a 90% cache-miss, ouch.
<snip PL/I>
Anyway, I'll double(!) the cache, and see what difference that makes. ;)
Not surprisingly, that doubles the hits, but I still miss 80%.
Any suggestions as to how efficiently quadruple the cache and still make
it more efficient to look up and discard values than the current cache
of just one entry and binary lookup would be welcome, but please don't spend/waste a lot of time on it.
Robert
Since you have a way to measure hit rates, you obviously have some real (or synthetic?) access order test data, right?
In that case you should give us a useful sample of that (maybe 10-25%) and let
us try various alternatives, then we can measure average access time, along with
max, min and various percentiles.
I have a very hard time seeing how this could ever take more than maybe 20-30 clock cycles, using a combination of an initial table lookup to hit the first entry starting with a given letter and then a scan for the actual country:
The worst case is R with 24 entries, so naively
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
mov ecx,25
repne scasd
Assuming REPNE is too slow we can convert to simpler code:
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
next:
cmp eax,[edi]
lea edi,[edi+4]
jb next
je found
jmp not_found
This will normally run in a single cycle/iteration, but with two cycles/iteration plus a branch miss at the end, it could take 50-60 cycles to determine that RZZ does not exist.
Looking at the country codes I'm tempted to use all 5 bits from the first char
and augment that with the 3 higher bits from the second, for a 256-entry lookup
table where the correct entry will be found in a maximum of 6 tries (RA, RB, RC,
RCA, RCB, RCH). All other country codes seems to hit in less than this, so with
a 16-bit index into the country table I get something like this:
; EAX has country code in space-filled field
mov ebx,eax
mov ecx,eax
and ebx,31 ;; 5 bits
shr ecx,10 ;; Bottom char
and ecx,7 ;; 3 bits
lea ebx,[ecx+eax*8] ;; Combine them
movzx ebx,word ptr country_index[ebx*2] ;; 5-8 cycle prefix
next:
cmp eax,country_table[ebx] ;; 4-char country code
lea ebx,[ebx+COUNTRY_RECORD_SIZE]
jb next ;; 1-2 cycle iteration
je found ;; 4-10 cycle branch miss
jmp invalid_country_code
This looks like a guaranteed hit in less than 25 cycles, including a single branch miss, as long as the country table and country_index[] (512 bytes) are in
$L1.
Terje
Robert Prins wrote:
On 2023-02-25 13:21, Robert Prins wrote:
On 2023-02-25 09:33, Terje Mathisen wrote:
Robert Prins wrote:
On 2023-02-23 20:59, Terje Mathisen wrote:Yeah, by my guess at least 99.99+% of all visitors to Nordkapp think they >>>> have been to said northernmost point in Europe:
Robert, are you intentionally trying to annoy me? :-)
My own N, Norway is missing!
Please look again! My program would crash, in 1982 I hitched all the way up
to Nordkapp, that wonderful place where people only go to to wrongly tell >>>>> their friends that they've been at the northernmost point of Europe. ;) >>>>
If you want "continental Europe", then the old ferry, now undersea tunnel to
the Magerøya island would disqualify Nordkapp. If you otoh allow this, like
I would do as well since there is a small city on the island, then your map
should show you the 3-5 hour hike needed to get from the main road to the >>>> real tip of the island:
https://mapant.no/?#7/71.184/25.674 Knivskjærodden
I know, although didn't in 1982, but that year hiking would have been all but
impossible, as the place was still covered in a thick layer of snow, even >>> though I got there on 20 June and left on 21 June, without having seen the >>> midsummer sun, as the place was also covered in dense fog. :(
You could also make a very good case forAnd after this touristic break, let's return to the lookup ;)
https://mapant.no/?#8/71.130/27.648 Kinnarodden which is on the actual >>>> mainland, just a bit further east and slightly longer south.
The difference of 0.054 degrees corresponds to about 6 km.
Here's what comes out of the current PL/I version, where I actually cache the
last country/nationality, and still get a 90% cache-miss, ouch.
<snip PL/I>
Anyway, I'll double(!) the cache, and see what difference that makes. ;)
Not surprisingly, that doubles the hits, but I still miss 80%.
Any suggestions as to how efficiently quadruple the cache and still make it >> more efficient to look up and discard values than the current cache of just >> one entry and binary lookup would be welcome, but please don't spend/waste a >> lot of time on it.
Robert
Since you have a way to measure hit rates, you obviously have some real (or synthetic?) access order test data, right?
In that case you should give us a useful sample of that (maybe 10-25%) and let
us try various alternatives, then we can measure average access time, along with
max, min and various percentiles.
I have a very hard time seeing how this could ever take more than maybe 20-30 clock cycles, using a combination of an initial table lookup to hit the first entry starting with a given letter and then a scan for the actual country:
The worst case is R with 24 entries, so naively
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
mov ecx,25
repne scasd
Assuming REPNE is too slow we can convert to simpler code:
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
next:
cmp eax,[edi]
lea edi,[edi+4]
jb next
je found
jmp not_found
This will normally run in a single cycle/iteration, but with two cycles/iteration plus a branch miss at the end, it could take 50-60 cycles to determine that RZZ does not exist.
Looking at the country codes I'm tempted to use all 5 bits from the first char
and augment that with the 3 higher bits from the second, for a 256-entry lookup
table where the correct entry will be found in a maximum of 6 tries (RA, RB, RC,
RCA, RCB, RCH). All other country codes seems to hit in less than this, so with
a 16-bit index into the country table I get something like this:
; EAX has country code in space-filled field
mov ebx,eax
mov ecx,eax
and ebx,31 ;; 5 bits
shr ecx,10 ;; Bottom char
and ecx,7 ;; 3 bits
lea ebx,[ecx+eax*8] ;; Combine them
movzx ebx,word ptr country_index[ebx*2] ;; 5-8 cycle prefix
next:
cmp eax,country_table[ebx] ;; 4-char country code
lea ebx,[ebx+COUNTRY_RECORD_SIZE]
jb next ;; 1-2 cycle iteration
je found ;; 4-10 cycle branch miss
jmp invalid_country_code
This looks like a guaranteed hit in less than 25 cycles, including a single branch miss, as long as the country table and country_index[] (512 bytes) are in
$L1.
On 2023-02-27 20:51, Terje Mathisen wrote:
Since you have a way to measure hit rates, you obviously have some
real (or synthetic?) access order test data, right?
In that case you should give us a useful sample of that (maybe 10-25%)
and let us try various alternatives, then we can measure average
access time, along with max, min and various percentiles.
I have a very hard time seeing how this could ever take more than
maybe 20-30 clock cycles, using a combination of an initial table
lookup to hit the first entry starting with a given letter and then a
scan for the actual country:
The worst case is R with 24 entries, so naively
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
mov ecx,25
repne scasd
This will work, as it's a straight compare for equality
Assuming REPNE is too slow we can convert to simpler code:
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
next:
cmp eax,[edi]
lea edi,[edi+4]
jb next
je found
jmp not_found
I may be wrong, but should you not have included BSWAP instructions,
like in my original binary lookup? (I cannot use MOVBE)
This will normally run in a single cycle/iteration, but with two
cycles/iteration plus a branch miss at the end, it could take 50-60
cycles to determine that RZZ does not exist.
Looking at the country codes I'm tempted to use all 5 bits from the
first char and augment that with the 3 higher bits from the second,
for a 256-entry lookup table where the correct entry will be found in
a maximum of 6 tries (RA, RB, RC, RCA, RCB, RCH). All other country
codes seems to hit in less than this, so with a 16-bit index into the
country table I get something like this:
; EAX has country code in space-filled field
mov ebx,eax
mov ecx,eax
and ebx,31 ;; 5 bits
shr ecx,10 ;; Bottom char
and ecx,7 ;; 3 bits
lea ebx,[ecx+eax*8] ;; Combine them
lea ebx,[ecx+ebx*8] ; I presume?
movzx ebx,word ptr country_index[ebx*2] ;; 5-8 cycle prefix
next:
cmp eax,country_table[ebx] ;; 4-char country code
lea ebx,[ebx+COUNTRY_RECORD_SIZE]
jb next ;; 1-2 cycle iteration
je found ;; 4-10 cycle branch miss
jmp invalid_country_code
This looks like a guaranteed hit in less than 25 cycles, including a
single branch miss, as long as the country table and country_index[]
(512 bytes) are in $L1.
OK, now I just have to figure out what's in those two tables.
Robert
To make this work with 32-bit compares we need the most significant letter (i.e.
the first) to be located in the last byte, i.e. the 4-letter country codes must
be stored in BE order in the table! We could also use a secondary table consisting of just the reversed country codes and a pointer to the actual record: With ~400 countries this would be 3.2KB, then we add 512 bytes for the
~1.6 letter lookup table which contains 16-bit offsets into the country table.
We end up with less than 4 KB total plus the 32/64 bytes per country for the actual table.
It is much better to require a BSWAP when using it later for other stuff (or simply storing both forms as in that extra helper table) than to require a BSWAP
in the critical path.
BTW, if you actually implement this idea, then I would also compress the country
table by stripping out all the extra spaces: The inverted country lookup table
would contain the actual address of each packed record. Doing this means that you might halve the size of the table and avoid the potentially bad issue of having all records aligned on a larger pwoer of two, since this can easily lead
to cache trashing via limited associativity. (Normally not an issue for anything
that fits in $L1!)
Terje
Robert Prins wrote:
On 2023-02-27 20:51, Terje Mathisen wrote:
Since you have a way to measure hit rates, you obviously have some real (or >>> synthetic?) access order test data, right?
In that case you should give us a useful sample of that (maybe 10-25%) and >>> let us try various alternatives, then we can measure average access time, >>> along with max, min and various percentiles.
I have a very hard time seeing how this could ever take more than maybe 20-30
clock cycles, using a combination of an initial table lookup to hit the first
entry starting with a given letter and then a scan for the actual country: >>>
The worst case is R with 24 entries, so naively
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
mov ecx,25
repne scasd
This will work, as it's a straight compare for equality
Assuming REPNE is too slow we can convert to simpler code:
movzx edi,al
mov edx,firstletter[edi*4 - 'A'*4]
next:
cmp eax,[edi]
lea edi,[edi+4]
jb next
je found
jmp not_found
I may be wrong, but should you not have included BSWAP instructions, like in >> my original binary lookup? (I cannot use MOVBE)
This will normally run in a single cycle/iteration, but with two
cycles/iteration plus a branch miss at the end, it could take 50-60 cycles to
determine that RZZ does not exist.
Looking at the country codes I'm tempted to use all 5 bits from the first >>> char and augment that with the 3 higher bits from the second, for a 256-entry
lookup table where the correct entry will be found in a maximum of 6 tries >>> (RA, RB, RC, RCA, RCB, RCH). All other country codes seems to hit in less >>> than this, so with a 16-bit index into the country table I get something like
this:
; EAX has country code in space-filled field
mov ebx,eax
mov ecx,eax
and ebx,31 ;; 5 bits
shr ecx,10 ;; Bottom char
and ecx,7 ;; 3 bits
lea ebx,[ecx+eax*8] ;; Combine them
lea ebx,[ecx+ebx*8] ; I presume?
movzx ebx,word ptr country_index[ebx*2] ;; 5-8 cycle prefix
next:
cmp eax,country_table[ebx] ;; 4-char country code
lea ebx,[ebx+COUNTRY_RECORD_SIZE]
jb next ;; 1-2 cycle iteration
je found ;; 4-10 cycle branch miss
jmp invalid_country_code
This looks like a guaranteed hit in less than 25 cycles, including a single >>> branch miss, as long as the country table and country_index[] (512 bytes) are
in $L1.
OK, now I just have to figure out what's in those two tables.
Robert
OK, I've gone for what I think is a pretty good solution:
My first try?
I added an array['A'..'Z'] of array [0..1], which contained [low, high]
for each entry, reducing the entries in the "do-while low <= high" loop
from a current 40,364 to 14,909, or about 63%.
But then I thought, "Mm, some countries starting with letter "?" seem to occur a lot more than other countries starting with letter "?", but will
only be found after a few binary chops... So why not expand the
['A'..'Z'] array to an array of [0..3] arrays, containing [low, high, most-used, 0] and on a no-match first try the "most-used" index?
Result? Only 2,823 entries into the "do-while low <= high" loop, a
reduction of just over 93%.
Do I make sense, and does this come anywhere close to your (Terje)
"almost all programming can be viewed as an exercise in caching" mantra?
Robert
This is much better!
I still think you should pay more attention to the cost of branches, i.e. a sequential scan of a shortish segment of the country array is very cheap, on the
order of 2-3 clock cycles/country, while a single mispredicted branch cost 8-20
cycles depending upon the CPU you are using.
Assuming this really is so important, why not "waste" a bit more lookup table RAM for a full 2-char index?
This is still only 26x27 (including space)->722 entries, each needing 16 bits so
1444 bytes.
Looking at your country list I see the following long IDs that need a short scan
after the lookup:
AND,ANG
ARK,ARU
AUA,AUS
AX,AXA
BD,BDS
BH,BHT
BR,BRN,BRU
CH,CHN
CO,COM
CY,CYM
EAK,EAT,EAU,EAZ
ES,EST
ET,ETH
GB,GBA,GBG,GBJ,GBM,GBZ
HK,HKJ
IT,IRL,IRQ
LAO,LAR
MA,MAL
MO,MOC
NA,NAM,NAU
NI,NIC
PA,PAL
RC,RCA,RCB,RCH
RG,RGB
RI,RIM
RM,RMM
RO,ROK,ROU
RU,RUS
SLE,SLO
SME,SMO
SO,SOL
SY,SYR
UA,UAE
WAG,WAL,WAN
WS,WSA
Terje
Robert Prins wrote:
OK, I've gone for what I think is a pretty good solution:
My first try?
I added an array['A'..'Z'] of array [0..1], which contained [low, high] for >> each entry, reducing the entries in the "do-while low <= high" loop from a >> current 40,364 to 14,909, or about 63%.
But then I thought, "Mm, some countries starting with letter "?" seem to occur
a lot more than other countries starting with letter "?", but will only be >> found after a few binary chops... So why not expand the ['A'..'Z'] array to an
array of [0..3] arrays, containing [low, high, most-used, 0] and on a no-match
first try the "most-used" index?
Result? Only 2,823 entries into the "do-while low <= high" loop, a reduction >> of just over 93%.
Do I make sense, and does this come anywhere close to your (Terje) "almost all
programming can be viewed as an exercise in caching" mantra?
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 407 |
Nodes: | 16 (2 / 14) |
Uptime: | 11:27:54 |
Calls: | 8,554 |
Calls today: | 6 |
Files: | 13,219 |
Messages: | 5,925,264 |