• Like vs Regexp

    From Piyush Nalawade@21:1/5 to All on Thu Jan 16 23:27:16 2020
    Which is more performant while pattern matching or filtering?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Axel Schwenke@21:1/5 to Piyush Nalawade on Fri Jan 17 10:58:28 2020
    On 17.01.2020 08:27, Piyush Nalawade wrote:
    Which is more performant while pattern matching or filtering?

    That question makes not much sense. Because

    1. if your query requires a REGEX match, then it requires a regex match.
    While you can write each LIKE expression as REGEX, this is not true in the other way

    2. even then it depends on the exact expression and the table (available indexes mostly)

    As a rule of thumb: if the search pattern starts at the beginning of the
    column (vulgo: if you have a constant prefix), MySQL can use an index to
    locate rows with that prefix. This will be much faster as the scan that is otherwise needed.

    Matches with LIKE do make use of that (WHERE `foo` LIKE 'bar%' _will_ use an index on column `foo`, but WHERE `foo` LIKE '%bar%' will not). But I'm not
    sure if the equivalent match with RLIKE will do the same. AFAIK a REGEX
    match is always executed as a scan. A quick experiment confirms this for MariaDB:


    MariaDB [sbtest]> select pad from sbtest1 where pad like '012456%'; +-------------------------------------------------------------+
    | pad | +-------------------------------------------------------------+
    | 01245634732-09451737819-64918947528-85335101638-90622724721 | +-------------------------------------------------------------+
    1 row in set (0.00 sec)

    MariaDB [sbtest]> explain select pad from sbtest1 where pad like '012456%'\G *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: sbtest1
    type: range
    possible_keys: pad
    key: pad
    key_len: 60
    ref: NULL
    rows: 1
    Extra: Using where; Using index
    1 row in set (0.00 sec)


    note the "type: range" here. The prefix is used to scan only an index range. "rows: 1" is the estimated number of rows to be scanned.


    MariaDB [sbtest]> select pad from sbtest1 where pad rlike '^012456.*'; +-------------------------------------------------------------+
    | pad | +-------------------------------------------------------------+
    | 01245634732-09451737819-64918947528-85335101638-90622724721 | +-------------------------------------------------------------+
    1 row in set (0.45 sec)

    MariaDB [sbtest]> explain select pad from sbtest1 where pad rlike '^012456.*'\G *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: sbtest1
    type: index
    possible_keys: NULL
    key: pad
    key_len: 60
    ref: NULL
    rows: 1232928
    Extra: Using where; Using index
    1 row in set (0.00 sec)

    Now it's "type: index" which denotes an index scan. This time "rows" shows
    the (estimated) number of rows in the table.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)