• count consecutive records/events

    From Ammammata@21:1/5 to All on Tue Feb 11 15:54:29 2020
    let me make an example

    consider a database with over one million of records, i.e. all nba
    boxscores since 1940-something

    consider just a few fields, i.e.
    playerID, gameID, date, points, rebounds
    records are sorted by gameID
    I apply a couple of filters, i.e. points >= 30 and rebounds >=20

    how would you make a query to get the "streaks" of consecutive games
    for one player that fit the filters?
    How can I check if between two filtered records there is at least one that doesn't match the criteria and breaks the streak?

    does anyone of you work for Elias Sports Bureau? ;) :P
    or for www.basketball-references.com ...

    --
    /-\ /\/\ /\/\ /-\ /\/\ /\/\ /-\ T /-\
    -=- -=- -=- -=- -=- -=- -=- -=- - -=-
    ........... [ al lavoro ] ...........

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From The Natural Philosopher@21:1/5 to Ammammata on Tue Feb 11 16:03:40 2020
    On 11/02/2020 15:54, Ammammata wrote:
    let me make an example

    consider a database with over one million of records, i.e. all nba
    boxscores since 1940-something

    consider just a few fields, i.e.
    playerID, gameID, date, points, rebounds
    records are sorted by gameID
    I apply a couple of filters, i.e. points >= 30 and rebounds >=20

    how would you make a query to get the "streaks" of consecutive games
    for one player that fit the filters?

    That is where my experience suggests that trying to make SQL do the job
    is a waste of time and my mysql servers all have 'C' compilers...

    Its fairly trivial to scan a huge number of records returned by a simple
    query that MySql handles efficiently (especially if well indexed) and
    build a solution set of - in this case - players with long runs of good
    scores.

    If this is built as a linked list, it can be scanned after completion
    pretty well.

    Another approach might be to scan the database and build up 'streaks' on
    a 'by player' basis and insert into a new table instead of a linked
    list. Once done that table would be ripe for quick analysis.

    These are practical approaches to solving the proiblem. If you want hand
    wavey theoretical SQL constructs that take a year to run in real time,
    you want Jerry Stucklehead, not me.

    How can I check if between two filtered records there is at least one that doesn't match the criteria and breaks the streak?

    does anyone of you work for Elias Sports Bureau? ;) :P
    or for www.basketball-references.com ...



    --
    The New Left are the people they warned you about.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From J.O. Aho@21:1/5 to Ammammata on Tue Feb 11 20:37:19 2020
    On 11/02/2020 16.54, Ammammata wrote:
    let me make an example

    consider a database with over one million of records, i.e. all nba
    boxscores since 1940-something

    consider just a few fields, i.e.
    playerID, gameID, date, points, rebounds
    records are sorted by gameID
    I apply a couple of filters, i.e. points >= 30 and rebounds >=20

    how would you make a query to get the "streaks" of consecutive games
    for one player that fit the filters?
    How can I check if between two filtered records there is at least one that doesn't match the criteria and breaks the streak?

    If you want to do it in the database instead of in the presentation
    layer, then you need to look at cursors and temp tables

    https://dev.mysql.com/doc/refman/8.0/en/cursors.html

    https://dev.mysql.com/doc/refman/8.0/en/create-temporary-table.html


    Store the longest streak in the temp table, if you find a longer one you
    just delete the current "longest one".
    Just keep in mind that this will take time to calculate, so do not do
    this for every request, you may make a job that executes a procedure
    that makes the calculation and then stores the result to a table, which
    then is used to display the longest streak.

    A popular method is to do an check if the current streak is longer than
    the longest one when adding a new entry for the "boxer", as this will
    happen less frequently than someone looking for the information.


    If you have MariaDB instead of MySQL, you can use the graphdatabase to
    make the calculation, as it would be faster on this kind of queries (the
    draw back with the MariaDB is that the graphdatabase is just in memory),
    for more permanent solution you would need to look at neo4j or similar,
    which stores the database to disk.


    Still simplest is to use the presentation layer for figuring things out
    and I doubt your teacher will deduct points from your assignment for that.

    --

    //Aho

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ammammata@21:1/5 to All on Thu Feb 13 10:18:21 2020
    Il giorno Tue 11 Feb 2020 04:54:29p, *Ammammata* ha inviato su comp.databases.mysql il messaggio news:XnsAB60AC046C83Cammammatatiscalineti@148.251.67.112. Vediamo cosa ha scritto:

    how would you make a query to get the "streaks" of consecutive games
    for one player that fit the filters?


    thank you for sharing your ideas

    just FYI please note that also the mentioned stats website informs the user that the search can take a long time, or force a limit in the data set
    (i.e. spanning a maximum of five years)

    and for sure they don't have just a personal computer like me but a much
    more performing battery of servers

    --
    /-\ /\/\ /\/\ /-\ /\/\ /\/\ /-\ T /-\
    -=- -=- -=- -=- -=- -=- -=- -=- - -=-
    ........... [ al lavoro ] ...........

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)