On Oct 27, 2023 at 10:05:42 PM CDT, "Ray Banana" <rayban@raybanana.net> wrote:
> Hi,
>
> the current flood of spam from Google Groups and the corresponding
> amount of NoCeM messages made me consider an overview method that
> does not rely on expireover to remove the overview data for cancelled
> articles. From what I have seen on a test server, ovsqlite seems to
> remove overview data immediately. I have rebuilt the overview database of a
> test server with ~1.5 million articles in just a couple of minutes and am now
> considering a migration of my main reader server (~50 million articles)
> to ovsqlite. Does anyone have experience and an estimate how long such a
> migration might take on ordinary SATA disks with EXT4 filesystems?
Not quite the same use case, but I'm rebuilding history and moving to ovsqlite
on 1.5TB / 500 million articles on NVMe storage and it is taking over a week.
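The immediate-removal behaviour Ray describes can be pictured with a plain SQLite table. This is only an illustrative sketch; the table and column names below are invented, not ovsqlite's actual schema. The point is that processing a cancel is a single DELETE, leaving nothing behind for a later expireover pass to clean up:

```python
import sqlite3

# Toy model of an SQLite-backed overview store (invented schema, not
# ovsqlite's real one): one row per article per newsgroup.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE overview (
                  newsgroup TEXT,
                  artnum    INTEGER,
                  msgid     TEXT,
                  ovdata    TEXT,
                  PRIMARY KEY (newsgroup, artnum))""")
db.execute("INSERT INTO overview VALUES ('news.software.nntp', 1, '<a@x>', '...')")
db.execute("INSERT INTO overview VALUES ('news.software.nntp', 2, '<b@x>', '...')")

# A cancel (or NoCeM notice) for <a@x> removes its overview data at once.
db.execute("DELETE FROM overview WHERE msgid = '<a@x>'")
remaining = db.execute("SELECT COUNT(*) FROM overview").fetchone()[0]
print(remaining)  # 1
```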
On Oct 28, 2023 at 3:32:52 AM CDT, "Jesse Rehmer" <jesse.rehmer@blueworldhosting.com> wrote:

> On Oct 27, 2023 at 10:05:42 PM CDT, "Ray Banana" <rayban@raybanana.net> wrote:
>
>> [...]
>
> Not quite the same use case, but I'm rebuilding history and moving to ovsqlite
> on 1.5TB / 500 million articles on NVMe storage and it is taking over a week.
I should add more flavor: initially I tried rebuilding history+overview as-is
(tradspool + tradindexed) and ran into a filesystem issue (I filled the ZFS
pool too full and performance suffered greatly). That attempt ran from
October 14th through the 24th before I ended it.

I added storage, rebalanced the ZFS pool, and decided to move to ovsqlite,
since others had commented that it is faster to rebuild than tradindexed.
That rebuild has been running since 10/23 and appears to be about half done,
based on the size of the new history file.

I've been able to rsync and ZFS send/receive the entire data set to other
servers four times during this period, so this slow behavior isn't a
limitation of my system or storage. Jacking up the number of lines makehistory
processes at a time does not seem to make any difference for me either.

If you aren't rebuilding the history file, you'll have a better experience. I
spent some time watching system calls and filesystem activity, and the
majority of what makehistory is spending time on in my case is dbz-related.
Hi Jesse,

> If you aren't rebuilding the history file, you'll have a better experience. I
> spent some time watching system calls and filesystem activity, and the
> majority of what makehistory is spending time on in my case is dbz-related.

Just to be sure, did you run makehistory with the -s flag to provide the
estimated number of articles?

It reminds me of the issue we recently discussed in this newsgroup about
makedbz. I then reworded INSTALL this way:
"""
Next, you need to create an empty history database. To do this, type:
cd <pathdb in inn.conf>
touch history
makedbz -i -o
makedbz will then create a database optimized for handling about
6,000,000 articles (or 500,000 if the slower tagged hash format is
used). If you expect to inject more articles than that, use the "-s"
flag to specify the number of entries to size the initial history file
for. To pre-size it for 100,000,000 articles, type:
makedbz -i -o -s 100000000
This initial size does not limit the number of articles the news server
will accept. It will just get slower when that size is exceeded, until
the next run of news.daily which will appropriately resize it.
"""
I'm wondering whether you're running into that issue with makehistory.
I should also update its manual page to emphasize the use of the "-s"
flag :)
I've had mixed results sizing the history file with makedbz or using the -s
flag with makehistory. It seems that, appropriately sized or not, once the
history file is around 10 GB, performance suffers dramatically.
Hi Jesse,

> I've had mixed results sizing the history file with makedbz or using the -s
> flag with makehistory. It seems that, appropriately sized or not, once the
> history file is around 10 GB, performance suffers dramatically.
Thanks for the feedback.

Unfortunately, I do not have any other setting in mind to test, nor the
time to audit and try to improve the performance of dbz. I guess the
best next step would be to implement a second storage method for the
history file, for instance based on SQLite, but that's quite a lot of
work too...
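As a hypothetical sketch of that idea (nothing like this exists in INN today), an SQLite-backed history could in the simplest case be little more than a table keyed by message-ID, carrying roughly the fields of a history file line:

```python
import sqlite3

# Hypothetical sketch of an SQLite-backed history store (not part of
# INN): one row per article, keyed by message-ID, with roughly the
# fields a history line carries.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE history (
                  msgid   TEXT PRIMARY KEY,
                  arrived INTEGER,
                  expires INTEGER,
                  token   TEXT)""")

def remember(msgid, arrived, token):
    # INSERT OR IGNORE mirrors history's duplicate-suppression role.
    db.execute("INSERT OR IGNORE INTO history VALUES (?, ?, 0, ?)",
               (msgid, arrived, token))

def seen(msgid):
    return db.execute("SELECT 1 FROM history WHERE msgid = ?",
                      (msgid,)).fetchone() is not None

remember("<1@example.net>", 1698444342, "@AAA@")
print(seen("<1@example.net>"), seen("<2@example.net>"))  # True False
```

Whether such a backend would actually beat dbz at the 10 GB+ scale discussed above is an open question; the appeal is that a B-tree index degrades gracefully with size instead of depending on an up-front size estimate.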