• Up to 50K inserts per second... smokin'!

    From DFS@21:1/5 to All on Wed Aug 25 14:07:11 2021
    Copying 30M rows (across 15 tables) from SQLite into a MariaDB store

    Original python DB-API code did single inserts, got 5.2K per second overall.

    Found some bulk insert ideas on stack exchange, finessed an
    implementation and got over 50K inserts per second on some data!

    "TABLEONE: 1.50 secs to post 75545 rows (50353 per sec)"

    With the new code the avg for all 30M rows across all tables was around
    16K per sec.
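
    The gist of the change, sketched below with mysql-connector-python and
    made-up table/column names (the real code and schema differ): pull rows
    out of SQLite in chunks and hand each chunk to executemany(), so the
    driver can send each batch as one multi-row INSERT instead of one
    statement per row.

        import sqlite3
        import mysql.connector  # assumes the mysql-connector-python driver

        BATCH = 10000

        src = sqlite3.connect("source.db")
        dst = mysql.connector.connect(host="localhost", user="loader",
                                      password="secret", database="target")
        cur_in = src.cursor()
        cur_out = dst.cursor()

        cur_in.execute("SELECT col1, col2, col3 FROM tableone")
        while True:
            rows = cur_in.fetchmany(BATCH)
            if not rows:
                break
            # executemany() on a plain INSERT ... VALUES lets the driver send
            # each batch as a single multi-row statement
            cur_out.executemany(
                "INSERT INTO tableone (col1, col2, col3) VALUES (%s, %s, %s)",
                rows)
        dst.commit()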

    Worked on the code for some hours to save 30-45 minutes in data loading
    time... but 50K inserts/sec is pretty darn good for my 11-year-old
    i5-750 8GB RAM system.

    Raise a glass!

  • From Jerry Stuckle@21:1/5 to DFS on Wed Aug 25 19:26:52 2021
    On 8/25/2021 2:07 PM, DFS wrote:
    Copying 30M rows (across 15 tables) from SQLite into a MariaDB store

    Original python DB-API code did single inserts, got 5.2K per second
    overall.

    Found some bulk insert ideas on stack exchange, finessed an
    implementation and got over 50K inserts per second on some data!

    "TABLEONE: 1.50 secs to post 75545 rows (50353 per sec)"

    With the new code the avg for all 30M rows across all tables was around
    16K per sec.

    Worked on the code for some hours to save 30-45 minutes in data loading time... but 50K inserts/sec is pretty darn good for my 11-year-old
    i5-750 8GB RAM system.

    Raise a glass!


    Very understandable. Each INSERT must be parsed and executed by the
    RDBMS. Parsing a text SQL statement and determining how to execute it is
    very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to
    preprocess static SQL statements and build access plans to do the work.
    The resultant code doesn't actually submit the SQL statement but
    instead calls a function which executes the pre-built access plan.

    Your inserting multiple rows with one SQL statement does the same
    thing: the SQL statement only needs to be parsed once, and the means of
    INSERTing the data is determined only once. From there on it's a simple
    matter of repeating the same code.

    And BTW - with so many rows you'd probably get even better response by
    DROPping any indices before the INSERT and recreating them after the INSERT.
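
    Roughly that recipe in MariaDB terms, with a hypothetical table and
    index name (a sketch only; the DROP/CREATE statements bracket the bulk
    load, and the multi-row VALUES list is what gets parsed once):

        import mysql.connector  # assumes the mysql-connector-python driver

        conn = mysql.connector.connect(host="localhost", user="loader",
                                       password="secret", database="target")
        cur = conn.cursor()

        # Drop the secondary index so the bulk INSERT doesn't maintain it row by row
        cur.execute("DROP INDEX idx_tableone_col2 ON tableone")

        # One multi-row INSERT: parsed once, then the engine just repeats the
        # row-insert step for each VALUES tuple
        cur.execute(
            "INSERT INTO tableone (col1, col2) VALUES "
            "(1, 'a'), (2, 'b'), (3, 'c')")

        # Rebuild the index once, over the finished table
        cur.execute("CREATE INDEX idx_tableone_col2 ON tableone (col2)")
        conn.commit()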



    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    jstucklex@attglobal.net
    ==================

  • From Axel Schwenke@21:1/5 to Jerry Stuckle on Thu Aug 26 09:39:27 2021
    On 26.08.2021 01:26, Jerry Stuckle wrote:

    Very understandable.  Each INSERT must be parsed and executed by the RDBMS. Parsing a text SQL statement and determining how to execute it is very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to preprocess static SQL statements and build access plans to do the work.  The resultant code doesn't actually submit the SQL statement but instead calls a function which executes the pre-built access plan.

    That's a typical Stuckle post. MySQL introduced prepared statements in
    version 4.1. That was somewhere around 2005. Whopping 16 years in the past.
    Yet still he tries to sell it as a feature specific to enterprise RDBMS...
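
    A minimal sketch of the same thing from Python, assuming the
    mysql-connector-python driver (which exposes server-side prepared
    statements through cursor(prepared=True)); connection details and the
    table are made up:

        import mysql.connector

        conn = mysql.connector.connect(host="localhost", user="loader",
                                       password="secret", database="target")
        cur = conn.cursor(prepared=True)

        # The INSERT is prepared on the server once; each execute() only ships
        # new parameter values for the already-parsed statement
        stmt = "INSERT INTO tableone (col1, col2) VALUES (?, ?)"
        for row in [(1, "a"), (2, "b"), (3, "c")]:
            cur.execute(stmt, row)

        conn.commit()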

  • From The Natural Philosopher@21:1/5 to Axel Schwenke on Thu Aug 26 09:44:59 2021
    On 26/08/2021 08:39, Axel Schwenke wrote:
    On 26.08.2021 01:26, Jerry Stuckle wrote:

    Very understandable. Each INSERT must be parsed and executed by the
    RDBMS. Parsing a text SQL statement and determining how to execute it
    is very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to
    preprocess static SQL statements and build access plans to do the work.
    The resultant code doesn't actually submit the SQL statement but
    instead calls a function which executes the pre-built access plan.

    That's a typical Stuckle post. MySQL introduced prepared statements in version 4.1. That was somewhere around 2005. Whopping 16 years in the past. Yet still he tries to sell it as a feature specific to enterprise RDBMS...

    I didn't realize stucklehead was still alive.

    --
    “A leader is best When people barely know he exists. Of a good leader,
    who talks little, When his work is done, his aim fulfilled, They will
    say, ‘We did this ourselves.’”

    ― Lao Tzu, Tao Te Ching

  • From Jerry Stuckle@21:1/5 to Axel Schwenke on Thu Aug 26 10:53:24 2021
    On 8/26/2021 3:39 AM, Axel Schwenke wrote:
    On 26.08.2021 01:26, Jerry Stuckle wrote:

    Very understandable. Each INSERT must be parsed and executed by the
    RDBMS. Parsing a text SQL statement and determining how to execute it
    is very time consuming, much more than actually doing the SQL statement.

    That's why enterprise-level databases like IBM's DB2 have a means to
    preprocess static SQL statements and build access plans to do the work.
    The resultant code doesn't actually submit the SQL statement but
    instead calls a function which executes the pre-built access plan.

    That's a typical Stuckle post. MySQL introduced prepared statements in version 4.1. That was somewhere around 2005. Whopping 16 years in the past. Yet still he tries to sell it as a feature specific to enterprise RDBMS...


    MySQL's version of prepared statements is not at all like static SQL in enterprise level databases. But you don't know the difference.

    A short course. In DB2 the program is run through a preprocessor once.
    The preprocessor looks for EXEC DB2 statements. When it finds one, it analyzes the statement, builds an access plan and stores it in the
    database. It then comments out the EXEC DB2 statement and builds native language calls directly to the stored access plan to pass values and
    execute the plan. The SQL code is commented out and doesn't even appear
    in the compiled program.

    This is NOT the same as prepared statements, which still must be
    prepared every time the program hits the PREPARE statement. Static SQL
    is much faster.

    But you have no idea what a high-performance database is. Compared to
    DB2, MySQL is a toy.


    --
    ==================
    Remove the "x" from my email address
    Jerry Stuckle
    jstucklex@attglobal.net
    ==================

  • From DFS@21:1/5 to Jerry Stuckle on Tue Sep 7 19:20:38 2021
    On 8/25/2021 7:26 PM, Jerry Stuckle wrote:
    On 8/25/2021 2:07 PM, DFS wrote:
    Copying 30M rows (across 15 tables) from SQLite into a MariaDB store

    And BTW - with so many rows you'd probably get even better response by DROPping any indices before the INSERT and recreating them after the
    INSERT.


    Created an Oracle Express 18c database with 'skeleton' tables: no PKs,
    FKs or indexes of any kind. Just table name, column name, datatype and
    NOT|NULL spec.

    TBL 1: 14.88 secs to post 2799305 rows (188096 per sec)
    TBL 2: 21.93 secs to post 3568564 rows (162707 per sec)
    TBL 3: 110.49 secs to post 14898302 rows (134840 per sec)
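
    The load loop is the same executemany() pattern as before; a minimal
    sketch against Oracle with the cx_Oracle driver, using made-up
    connection details, table and columns:

        import cx_Oracle  # assumes the cx_Oracle driver is installed

        conn = cx_Oracle.connect(user="loader", password="secret",
                                 dsn="localhost/XEPDB1")
        cur = conn.cursor()

        # Oracle takes :1, :2, ... bind placeholders; executemany() posts the
        # whole batch against the statement in one call
        rows = [(1, "a"), (2, "b"), (3, "c")]
        cur.executemany("INSERT INTO tbl1 (col1, col2) VALUES (:1, :2)", rows)
        conn.commit()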
