• Up to 50K inserts per second... smokin'!

    From DFS@21:1/5 to All on Wed Aug 25 14:07:11 2021
    Copying 30M rows (across 15 tables) from SQLite into a MariaDB store

    Original python DB-API code did single inserts, got 5.2K per second overall.

    Found some bulk insert ideas on stack exchange, finessed an
    implementation and got over 50K inserts per second on some data!

    "TABLEONE: 1.50 secs to post 75545 rows (50353 per sec)"

    With the new code the avg for all 30M rows across all tables was around
    16K per sec.
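
    The gist of the change, sketched below with mysql-connector-python and
    made-up table/column names (the real code and schema differ): pull rows
    out of SQLite in chunks and hand each chunk to executemany(), so the
    driver can send each batch as one multi-row INSERT instead of one
    statement per row.

        import sqlite3
        import mysql.connector  # assumes the mysql-connector-python driver

        BATCH = 10000

        src = sqlite3.connect("source.db")
        dst = mysql.connector.connect(host="localhost", user="loader",
                                      password="secret", database="target")
        cur_in = src.cursor()
        cur_out = dst.cursor()

        cur_in.execute("SELECT col1, col2, col3 FROM tableone")
        while True:
            rows = cur_in.fetchmany(BATCH)
            if not rows:
                break
            # executemany() on a plain INSERT ... VALUES lets the driver send
            # each batch as a single multi-row statement
            cur_out.executemany(
                "INSERT INTO tableone (col1, col2, col3) VALUES (%s, %s, %s)",
                rows)
        dst.commit()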

    Worked on the code for some hours to save 30-45 minutes in data loading
    time... but 50K inserts/sec is pretty darn good for my 11-year-old
    i5-750 8GB RAM system.

    Raise a glass!

  • From Jerry Stuckle@21:1/5 to DFS on Wed Aug 25 19:26:52 2021
    On 8/25/2021 2:07 PM, DFS wrote:
    Copying 30M rows (across 15 tables) from SQLite into a MariaDB store

    Original python DB-API code did single inserts, got 5.2K per second
    overall.

    Found some bulk insert ideas on stack exchange, finessed an
    implementation and got over 50K inserts per second on some data!

    "TABLEONE: 1.50 secs to post 75545 rows (50353 per sec)"

    With the new code the avg for all 30M rows across all tables was around
    16K per sec.

    Worked on the code for some hours to save 30-45 minutes in data loading time... but 50K inserts/sec is pretty darn good for my 11-year-old
    i5-750 8GB RAM system.

    Raise a glass!


    Very understandable. Each INSERT must be parsed and executed by the
    RDBMS. Parsing a text SQL statement and determining how to execute it is
    very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to
    preprocess static SQL statements and build access plans to do the work.
    The resultant code doesn't actually submit the SQL statement but
    instead calls a function which executes the pre-built access plan.

    Your inserting multiple rows with one SQL statement does the same
    thing: the SQL statement only needs to be parsed once, and the means of
    INSERTing the data is determined only once. From there on it's a simple
    matter of repeating the same code.

    And BTW - with so many rows you'd probably get even better response by
    DROPping any indices before the INSERT and recreating them after the INSERT.
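
    Roughly that recipe in MariaDB terms, with a hypothetical table and
    index name (a sketch only; the DROP/CREATE statements bracket the bulk
    load, and the multi-row VALUES list is what gets parsed once):

        import mysql.connector  # assumes the mysql-connector-python driver

        conn = mysql.connector.connect(host="localhost", user="loader",
                                       password="secret", database="target")
        cur = conn.cursor()

        # Drop the secondary index so the bulk INSERT doesn't maintain it row by row
        cur.execute("DROP INDEX idx_tableone_col2 ON tableone")

        # One multi-row INSERT: parsed once, then the engine just repeats the
        # row-insert step for each VALUES tuple
        cur.execute(
            "INSERT INTO tableone (col1, col2) VALUES "
            "(1, 'a'), (2, 'b'), (3, 'c')")

        # Rebuild the index once, over the finished table
        cur.execute("CREATE INDEX idx_tableone_col2 ON tableone (col2)")
        conn.commit()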



    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    jstucklex@attglobal.net
    ==================

  • From Axel Schwenke@21:1/5 to Jerry Stuckle on Thu Aug 26 09:39:27 2021
    On 26.08.2021 01:26, Jerry Stuckle wrote:

    Very understandable.  Each INSERT must be parsed and executed by the RDBMS. Parsing a text SQL statement and determining how to execute it is very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to preprocess static SQL statements and build access plans to do the work.  The resultant code doesn't actually submit the SQL statement but instead calls a function which executes the pre-built access plan.

    That's a typical Stuckle post. MySQL introduced prepared statements in
    version 4.1. That was somewhere around 2005. Whopping 16 years in the past.
    Yet still he tries to sell it as a feature specific to enterprise RDBMS...
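
    A minimal sketch of the same thing from Python, assuming the
    mysql-connector-python driver (which exposes server-side prepared
    statements through cursor(prepared=True)); connection details and the
    table are made up:

        import mysql.connector

        conn = mysql.connector.connect(host="localhost", user="loader",
                                       password="secret", database="target")
        cur = conn.cursor(prepared=True)

        # The INSERT is prepared on the server once; each execute() only ships
        # new parameter values for the already-parsed statement
        stmt = "INSERT INTO tableone (col1, col2) VALUES (?, ?)"
        for row in [(1, "a"), (2, "b"), (3, "c")]:
            cur.execute(stmt, row)

        conn.commit()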

  • From The Natural Philosopher@21:1/5 to Axel Schwenke on Thu Aug 26 09:44:59 2021
    On 26/08/2021 08:39, Axel Schwenke wrote:
    On 26.08.2021 01:26, Jerry Stuckle wrote:

    Very understandable. Each INSERT must be parsed and executed by the
    RDBMS. Parsing a text SQL statement and determining how to execute it
    is very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to
    preprocess static SQL statements and build access plans to do the work.
    The resultant code doesn't actually submit the SQL statement but
    instead calls a function which executes the pre-built access plan.

    That's a typical Stuckle post. MySQL introduced prepared statements in version 4.1. That was somewhere around 2005. Whopping 16 years in the past. Yet still he tries to sell it as a feature specific to enterprise RDBMS...

    I didn't realize stucklehead was still alive.

    --
    “A leader is best When people barely know he exists. Of a good leader,
    who talks little, When his work is done, his aim fulfilled, They will
    say, ‘We did this ourselves.’”

    ― Lao Tzu, Tao Te Ching

  • From Jerry Stuckle@21:1/5 to Axel Schwenke on Thu Aug 26 10:53:24 2021
    On 8/26/2021 3:39 AM, Axel Schwenke wrote:
    On 26.08.2021 01:26, Jerry Stuckle wrote:

    Very understandable. Each INSERT must be parsed and executed by the
    RDBMS. Parsing a text SQL statement and determining how to execute it
    is very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to
    preprocess static SQL statements and build access plans to do the work.
    The resultant code doesn't actually submit the SQL statement but
    instead calls a function which executes the pre-built access plan.

    That's a typical Stuckle post. MySQL introduced prepared statements in version 4.1. That was somewhere around 2005. Whopping 16 years in the past. Yet still he tries to sell it as a feature specific to enterprise RDBMS...


    MySQL's version of prepared statements is not at all like static SQL in enterprise level databases. But you don't know the difference.

    A short course. In DB2 the program is run through a preprocessor once.
    The preprocessor looks for EXEC DB2 statements. When it finds one, it analyzes the statement, builds an access plan and stores it in the
    database. It then comments out the EXEC DB2 statement and builds native language calls directly to the stored access plan to pass values and
    execute the plan. The SQL code is commented out and doesn't even appear
    in the compiled program.

    This is NOT the same as prepared statements, which still must be
    prepared every time the program hits the PREPARE statement. Static SQL
    is much faster.

    But you have no idea what a high-performance database is. Compared to
    DB2, MySQL is a toy.


    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    jstucklex@attglobal.net
    ==================

  • From DFS@21:1/5 to Jerry Stuckle on Tue Sep 7 19:20:38 2021
    On 8/25/2021 7:26 PM, Jerry Stuckle wrote:
    On 8/25/2021 2:07 PM, DFS wrote:
    Copying 30M rows (across 15 tables) from SQLite into a MariaDB store

    And BTW - with so many rows you'd probably get even better response by DROPping any indices before the INSERT and recreating them after the
    INSERT.


    Created an Oracle Express 18c database with 'skeleton' tables: no PKs,
    FKs or indexes of any kind. Just table name, column name, datatype and
    NOT|NULL spec.

    TBL 1: 14.88 secs to post 2799305 rows (188096 per sec)
    TBL 2: 21.93 secs to post 3568564 rows (162707 per sec)
    TBL 3: 110.49 secs to post 14898302 rows (134840 per sec)
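
    The load loop is the same executemany() pattern as before; a minimal
    sketch against Oracle with the cx_Oracle driver, using made-up
    connection details, table and columns:

        import cx_Oracle  # assumes the cx_Oracle driver is installed

        conn = cx_Oracle.connect(user="loader", password="secret",
                                 dsn="localhost/XEPDB1")
        cur = conn.cursor()

        # Oracle takes :1, :2, ... bind placeholders; executemany() posts the
        # whole batch against the statement in one call
        rows = [(1, "a"), (2, "b"), (3, "c")]
        cur.executemany("INSERT INTO tbl1 (col1, col2) VALUES (:1, :2)", rows)
        conn.commit()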
