• Fuzzy matching for autosuggest

    From groovee@cyberdude.com@21:1/5 to All on Tue Jun 30 18:55:45 2020
    Hi,
    I have a web form, where I'd like what's typed in the search box to be matched against MySQL.

    Assume the DB has 3 entries:
    1) St. Xavier's
    2) St. Joseph's
    3) Birla High

    I'd like 1) If someone types "x", the autosuggest should show "St. Xavier's". ie. it should show x*, then *x*, then *x, in that order. How do I do this?
    2) If someone types "xaviers", ie. without the apostrophe, they should get "St. Xavier's" - how?
    3) Someone could type "st" or "st." (ie. with the full stop) or "saint" - they should all suggest "St. Xavier's" and "St. Joseph's" - how to do this?
    4) there should be "synonyms" - if someone types "hindi high", they should get "birla high" (hindi high was the old name, which has since changed to birla high, but people might still be searching for the old name).
    and 5) Someone could type "birla high mumbai" when the DB has "birla high, mumbai", ie. WITH the comma, in the DB - they should match. How do I do this?

    How would this be done? I really don't have much clue how to effect stuff like this in MySQL, I just know a MATCH AGAINST, and I don't really know how that works either. So - can anyone provide any pointers?


    Thanks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kees Nuyt@21:1/5 to All on Wed Jul 1 19:45:14 2020
    On Tue, 30 Jun 2020 18:55:45 -0700 (PDT), groovee@cyberdude.com
    wrote:

    Hi,
    I have a web form, where I'd like what's typed in the search box to be matched against MySQL.

    Assume the DB has 3 entries:
    1) St. Xavier's
    2) St. Joseph's
    3) Birla High

    I'd like
    1) If someone types "x", the autosuggest should show "St. Xavier's".
    ie. it should show x*, then *x*, then *x, in that order.
    How do I do this?
    2) If someone types "xaviers", ie. without the apostrophe,
    they should get "St. Xavier's" - how?
    3) Someone could type "st" or "st." (ie. with the full stop)
    or "saint" - they should all suggest "St. Xavier's" and
    "St. Joseph's" - how to do this?
    4) there should be "synonyms" - if someone types "hindi high",
    they should get "birla high" (hindi high was the old name,
    which has since changed to birla high, but people might still be searching for the old name).
    and
    5) Someone could type "birla high mumbai" when the DB has "birla high, mumbai",
    ie. WITH the comma, in the DB - they should match. How do I do this?

    How would this be done? I really don't have much clue how to effect
    stuff like this in MySQL,
    I just know a MATCH AGAINST, and I don't really know how that works either. So - can anyone provide any pointers?

    This HTML solution comes close :

    <form action="index.php" method="post">
    <label for="membernm">Search</label>
    <input list="mnms" name="membernm" autofocus tabindex="11" placeholder="member name">
    <datalist id="mnms">
    <option value="Alice">
    <option value="Bob">
    <option value="Charles Brown">
    <option value="Charles Smith">
    </datalist>
    <input class="s" type="submit" name="butmnm" value="&#128270;"
    tabindex="19">
    </form>

    If that doesn't solve it, you are probably best off with
    normalizing the user's input using javascript
    and repopulate the datalist with matches from a normalized name
    column in the database. That's not easy to do, you may want to
    spend some time to find a framework that does the heavy lifting
    for you.

    You may need an aliases table with alternative spellings of the
    name and their normalized form.

    Normalization would remove spaces and punctuation, convert to
    lower case etc..

    Here is an example :
    CREATE TABLE Members (
    memberid INTEGER PRIMARY KEY
    ON CONFLICT IGNORE AUTOINCREMENT
    membernm VARCHAR NOT NULL -- canonical name
    , other columns, ....
    );

    CREATE TABLE IF NOT EXISTS MemberAliases (
    memberid INTEGER NOT NULL
    CONSTRAINT fk_mem_mid REFERENCES Members(memberid)
    ON UPDATE CASCADE ON DELETE CASCADE
    , memberalias VARCHAR NOT NULL
    , normalized VARCHAR NOT NULL INDEXED
    , CONSTRAINT pk_memnm PRIMARY KEY (memberid,memberalias)
    ON CONFLICT IGNORE
    );

    Notes:
    - The example happens to be about members, not schools
    - Add a row for the canonical name to MemberAliases
    - May not be 100% valid MySQL.

    --
    Regards,
    Kees Nuyt

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)