Then, my wish-list: building those same indexes in parallel, starting
an individual thread for each database.
I don't know if this is possible, but if it is, it is worth a try.
Has anyone done this?
Dear CV:
On Thursday, July 15, 2021 at 4:56:55 PM UTC-7, CV wrote:
> ...
> Then, my wish-list: building those same indexes in parallel, starting
> an individual thread for each database.
> I don't know if this is possible, but if it is, it is worth a try.
Do you want one thread per index per database (will be slower, since the disk will be thrashing more), or just one thread per database (easier)?
> Has anyone done this?
Either sequence is a linear read of the database (potential time savings), but if they are not started at the "same time" the requested record might not still be in memory. And then sorting and production of the index will really start tearing things up if you have a plattered drive.
David A. Smith
On Friday, July 16, 2021 at 4:43:21 PM UTC+3, dlzc wrote:
...
Table indexing is about using as much RAM as possible, as intensively as possible, to build a tree structure and then save that tree to the hard disk.
When the index is too big to fit completely into the local RAM, the algorithm performs more cycles, and the users have the sensation that "the database has slowed down" (this applies both to indexing and to read/write operations).
IMHO indexing more than one table at once would force the two programs to compete for the available RAM, and each would slow down the other.
Ella, David
Thank you for your answers.
The computer acting as server runs Windows Server and is a really fast one (12 or 16 cores); the disk is a solid-state drive and it has 32 GB of memory, so there are almost no hardware limits.
Keep in mind that the xHarbour application is a 32-bit one, which uses no more than 50 MB of RAM in the worst case so far.
The indexing process will run at night; this application is used 24 hours a day (with remote users at home, async data downloads, etc.). I don't want to run out of time when running this automated process; that is the reason for my message/need.
It should be one thread per database: the database is opened "exclusive" for indexing (I'm sure no other process will interfere) and its indexes are built one after the other, but with many processes running in parallel, each recreating a set of indexes (many orders, all of them structural := cdx and dbf with same name).
Regards,
©
Suggestion: explain to the admin that
- the indexing process needs a dedicated physical machine (NOT a VM) with a minimal Windows OS image (32-bit version), a processor with a lot of L1/L2 cache and a high clock speed, 3 GB RAM, no Internet access, no end-user access, and your executable, which is doing ONLY the indexing, and nothing else
- before starting your executable, the necessary .DBF tables are copied onto that machine
- after the indexing completes successfully, the tables and indexes are picked up from that machine
Database server engines like Oracle and MS SQL run on dedicated servers with no connection to end users, and each version is tied to specific OS versions and hardware, because they have some multi-threading features which require advanced RAM and IO management.
Python and NodeJS receive user requests on different threads, but all those threads use the RAM via time-sharing (one by one).
As I've mentioned, in case of indexing the critical resource is the RAM.
HTH
Ella
On Saturday, July 17, 2021 at 05:40:42 UTC-3, Ella Stern wrote:
...
Thank you for your explanation.
No VMs on that server, plenty of RAM, enough speed in disk access... almost no limits in hardware.
The only limit is the time frame available to rebuild the indexes when that is needed.
How about a piece of code to do what I need to implement (or test)?
Is xHarbour able to do that without errors?
Regards
Claudio Voskian
On 17/07/2021 15:50, CV wrote:
...
Try it yourself adapting this pseudocode:
#ifdef __XHARBOUR__
#xtranslate hb_threadStart( <x,...> ) => StartThread( <x> )
#endif
#include "hbthread.ch"

procedure main
   local start_time, elap_time
   ...
   // test monothread: run both indexing routines one after the other
   start_time := seconds()
   ? "Start single thread: " + time()
   index1()
   index2()
   elap_time := seconds()
   ? "End: " + time() + " seconds: " + ltrim( str( elap_time - start_time ) )

   // test multithread: run each indexing routine in its own thread
   ? "Start multithread: " + time()
   ? "Start thread 1: " + time()
   hb_threadStart( HB_THREAD_INHERIT_PUBLIC, @index1() )
   ? "Start thread 2: " + time()
   hb_threadStart( HB_THREAD_INHERIT_PUBLIC, @index2() )
   // keep main alive until a key is pressed, so the threads can finish
   wait ""
   return

func index1()
   local start_time := seconds(), elap_time
   ferase( ... )   // delete the old index file first
   use ... exclusive
   index on ...
   use
   elap_time := seconds()
   ? "End thread 1: " + time() + " seconds: " + ltrim( str( elap_time - start_time ) )
   return nil

func index2()
   ...the same
   return nil
Let us know
Dan
use ... exclusive
Anyway I will try to adapt it to my needs.
StartThread(@index1())
return
On 19/07/2021 03:39, CV wrote:
StartThread(@index1())
return
So you start the thread and then exit. How can it work?
StartThread(@index1())
wait "Press a key"
return
Anyway, such a code is just for testing, eh.
Dan
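Dan's point can also be handled without a keypress: the main routine can block until the worker threads have returned. What follows is only a minimal sketch for Harbour built with multithreading support (for example `hbmk2 -mt`), using Harbour's `hb_threadStart()` and `hb_threadJoin()`. The worker `IndexWorker()`, the table names and the sleep that stands in for the indexing are hypothetical placeholders, not code from this thread. Under xHarbour the thread is started with `StartThread()` as shown above; the matching wait call there is an assumption to check against the xHarbour documentation.

```harbour
// Minimal sketch: start one worker per table and wait for all of them.
// Assumes Harbour built with MT support (hbmk2 -mt).
// IndexWorker() and the table names are hypothetical placeholders.

PROCEDURE Main()

   LOCAL aTables  := { "table1", "table2" }   // hypothetical names
   LOCAL aThreads := {}
   LOCAL cTable

   FOR EACH cTable IN aTables
      // Workareas are per thread, so each worker opens its own table
      AAdd( aThreads, hb_threadStart( @IndexWorker(), cTable ) )
   NEXT

   // Block until every worker has returned, instead of waiting for a key
   AEval( aThreads, {| pThread | hb_threadJoin( pThread ) } )

   ? "All workers finished"

   RETURN

PROCEDURE IndexWorker( cTable )

   LOCAL nStart := Seconds()

   // The real work would go here, for example:
   //    USE ( cTable ) EXCLUSIVE
   //    INDEX ON ...
   //    USE
   hb_idleSleep( 0.1 )   // stand-in for the indexing work

   ? cTable + " done in " + hb_ntos( Seconds() - nStart ) + " s"

   RETURN
```

Joining each thread means the program ends only after the last index has been rebuilt, which suits an unattended night job better than `wait`.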
On Monday, July 19, 2021 at 09:46:27 UTC-3, Daniele wrote:
...
It was a copy and paste with missing lines; I do have the wait "" before the end of the main routine, and there are 2 indexing functions for different databases (I just copied one for the example; the other is identical).
When I start the 2 threads *sometimes* the error message occurs.
Other times it just does nothing at all, and I have to close the application with the [X] control at the upper right.
Regards
Claudio Voskian
Hi everyone, Dan especially
I don't know why, but the very same program that previously DIDN'T work now works properly.
I didn't change a line; I tried to test it yesterday and ... it WORKS.
A mystery.
Thank you for the code!
Regards
Claudio Voskian
On 21/07/2021 15:05, CV wrote:
> I don't know why, but the very same program that previously DIDN'T work now works properly.
Well, I did not believe the two threads indexing the same file would have succeeded. I was thinking of 2 different files!
I learned something. :-)
> Thank you for the code!
You are welcome.
Dan