• INN makehistory oddities

    From Jesse Rehmer@21:1/5 to All on Wed Oct 25 01:46:53 2023
    I performed manual manipulation of a spool (deleting lots of articles from tradspool directories and deleting a CNFS buffer containing binaries and junk). I removed the history files and overview directory contents, and have been running makehistory since 10/12/2023 and it is *still* running...

    The previous overview directory was 67GB and the new one is at 107GB, which seems odd because I did not add articles, and I know of no previous corruption or overview issues.

    While watching lsof output against the makehistory PID, I am finding that it
    is scanning the same tradspool folders more than once, but I am not sure if that is expected?

    I took a look at makehistory.c but do not understand the code well enough to tell whether this is normal. I would expect that once it has opened a tradspool directory that contains a group, it would scan all of the articles/files in that directory and move on, but it seems that is not the case?

    I also have a couple of "Bad article handle" messages that don't make sense:

    makehistory: tradspool: can't determine class of @0500000087B5000000000001390000000000@: Bad article handle

    It seems okay though?

    $ sm -c '@0500000087B5000000000001390000000000@'
    @0500000087B5000000000001390000000000@ method=tradspool class=0 ngnum=34741 artnum=0 file=/usr/local/news/spool/articles/misc/health/alternative/80128

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julien ÉLIE@21:1/5 to All on Sat Oct 28 22:52:57 2023
    Hi Jesse,

    I performed manual manipulation of a spool (deleting lots of articles from tradspool directories and deleting a CNFS buffer containing binaries and junk).
    I removed the history files and overview directory contents, and have been running makehistory since 10/12/2023 and it is *still* running...

    It takes a bloody long time... I unfortunately do not have any advice
    to make it run faster except for using the "-s" flag. I'll respond on
    another message you sent about makehistory and dbz operations.


    The previous overview directory was 67GB and the new one is at 107GB, which seems odd because I did not add articles

    Maybe the first run of expireover (news.daily) will shrink it a bit, and
    expire articles which should no longer be in the overview data
    (cancelled or removed by a NoCeM notice).


    While watching lsof output against the makehistory PID, I am finding that it is scanning the same tradspool folders more than once, but I am not sure if that is expected?

    When dealing with tradspool, it should indeed treat directories in order (tradspool_next() method). Maybe the behaviour you see comes from
    crossposted articles? makehistory opens the (hard-linked) file present
    in each directory an article has been crossposted to, and looks at its
    Xref header field to find the first newsgroup mentioned, which is
    considered to be the master. It only treats the article when dealing with that first newsgroup.
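
As a rough illustration of the crosspost rule described above, here is a minimal Python sketch (not INN's actual code). It assumes the usual Xref layout of a server name followed by "group:number" entries. The function names, the host news.example.org and the uk.d-i-y article number are hypothetical.

```python
def xref_groups(xref_value):
    """Return the newsgroup names listed in an Xref header field body.

    Assumed layout: "server group.one:123 group.two:456".
    """
    parts = xref_value.split()
    # parts[0] is the server name; the rest are "group:number" entries.
    return [entry.rsplit(":", 1)[0] for entry in parts[1:]]


def is_master_group(xref_value, current_group):
    """True when current_group is the first group named in the Xref body."""
    groups = xref_groups(xref_value)
    return bool(groups) and groups[0] == current_group


# Hypothetical crossposted article, seen once per directory it is linked into.
xref = "news.example.org misc.health.alternative:80128 uk.d-i-y:4242"
for group in ("misc.health.alternative", "uk.d-i-y"):
    action = "index" if is_master_group(xref, group) else "skip"
    print(group, action)
```

Under this rule the file is opened in every directory it is linked into, but it is indexed only once, which might account for some of the repeated directory activity seen in lsof.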


    I also have a couple of "Bad article handle" messages that don't make sense:

    makehistory: tradspool: can't determine class of @0500000087B5000000000001390000000000@: Bad article handle

    I don't think the "Bad article handle" error is related to this article.
    This error is set when SMgetsub() in storage/interface.c is called with
    an article length of 0, whereas the "can't determine class of" error
    occurs only with articles whose length is > 0 in
    storage/tradspool/tradspool.c.

    So I bet the "Bad article handle" error corresponds to a previous error,
    and not the one for this article. Which implies that SMgetsub() did not
    return NULL and therefore that the type of the token does not correspond
    to tradspool.
    I am very unsure why it would happen during a rebuild of the history
    file. Looking at a comment, a modification of the storage.conf file is mentioned. Would it happen that the article stored in tradspool
    according to the rules of your initial storage.conf file has now a
    different rule?


    [storage/tradspool/tradspool.c]

    if ((sub = SMgetsub(*art)) == NULL || sub->type != TOKEN_TRADSPOOL) {
        /* maybe storage.conf is modified, after receiving article */
        token = MakeToken(priv.ngtp->ngname, artnum, 0);

        if (art->len > 0)
            warn("tradspool: can't determine class of %s: %s",
                 TokenToText(token), SMerrorstr);
    }



    FWIW, I think the code could be improved this way so that it does not
    display an inappropriate error.

    -    if (art->len > 0)
    +    if (art->len > 0 && sub == NULL)
             warn("tradspool: can't determine class of %s: %s",
                  TokenToText(token), SMerrorstr);
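
To make the effect of the proposed condition easier to see, here is a small Python model of the branch (a sketch of the logic only, not the C code itself). The function and parameter names are hypothetical. sub_is_null stands for SMgetsub() returning NULL, and sub_is_tradspool for sub->type == TOKEN_TRADSPOOL.

```python
def warns_before_patch(art_len, sub_is_null, sub_is_tradspool):
    # The branch is entered when no storage.conf entry matches,
    # or when the matching entry is not a tradspool one.
    entered = sub_is_null or not sub_is_tradspool
    return entered and art_len > 0


def warns_after_patch(art_len, sub_is_null, sub_is_tradspool):
    entered = sub_is_null or not sub_is_tradspool
    # Proposed change: only warn when SMgetsub() really failed.
    return entered and art_len > 0 and sub_is_null


# (article length, SMgetsub() returned NULL, matched entry is tradspool)
cases = [
    (100, True, False),   # no matching entry at all
    (100, False, False),  # entry found, but for another storage method
    (100, False, True),   # normal tradspool match, branch not entered
]
for case in cases:
    print(case, warns_before_patch(*case), warns_after_patch(*case))
```

Only the second case differs: a non-tradspool match with a non-empty article warns before the change and stays silent after it, which is the situation suspected in this thread.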



    It seems okay though?

    $ sm -c '@0500000087B5000000000001390000000000@'
    @0500000087B5000000000001390000000000@ method=tradspool class=0 ngnum=34741 artnum=0 file=/usr/local/news/spool/articles/misc/health/alternative/80128

    When the discussed error occurs, the token is computed on the fly with a
    forced class of 0, and belonging to tradspool. Decoding it with "sm -c"
    will naturally work (and give you the tradspool method, class 0).
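
For readers who want to inspect such a token by hand, here is a minimal Python sketch of a decoder. The layout is an assumption inferred only from the tokens and "sm -c" output quoted in this thread, not from INN documentation: one byte for the method, one for the class, then two big-endian 32-bit numbers at payload offsets 0 and 8, with zero padding elsewhere. It may differ on other platforms or INN versions. The function name is hypothetical.

```python
import struct


def decode_tradspool_token(text):
    """Decode the textual form of a tradspool token such as '@05...@'."""
    if len(text) != 38 or text[0] != "@" or text[-1] != "@":
        raise ValueError("expected '@', 36 hex digits, '@'")
    raw = bytes.fromhex(text[1:-1])
    method, storage_class = raw[0], raw[1]
    payload = raw[2:]
    # Assumed layout (inferred from this thread, not documented here):
    # group number at payload offset 0, article number at offset 8.
    (ngnum,) = struct.unpack_from(">I", payload, 0)
    (artnum,) = struct.unpack_from(">I", payload, 8)
    return {
        "method": method,
        "class": storage_class,
        "ngnum": ngnum,
        "artnum": artnum,
    }


print(decode_tradspool_token("@0500000087B5000000000001390000000000@"))
```

Under this assumed layout, the second number matches the file name in the "sm -c" output, even though sm itself prints artnum=0.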

    The good news is that the token is properly computed, and integrated
    into your history file.
    The error of "can't determine class of" should not be displayed in your
    case (if I am right in my understanding of what happened). It should be displayed only when there's a real error, that is to say when SMgetsub() returns NULL. Otherwise, I don't think it matters much.

    --
    Julien ÉLIE

    « Dans toute statistique, l'inexactitude du nombre est compensée par la
    précision des décimales. » (Alfred Sauvy)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Tue Oct 31 18:56:11 2023
    On Oct 28, 2023 at 3:52:57 PM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    When dealing with tradspool, it should indeed treat directories in order (tradspool_next() method). Maybe the behaviour you see comes from crossposted articles? makehistory opens the (hard-linked) file present
    in each directory an article has been crossposted to, and looks at its
    Xref header field to seek the first newsgroup mentioned (considered to
    be the master, and only treats the article when dealing with that first newsgroup).

    With lsof it is hard to catch individual open files/links, but what I am noticing is that it will open a directory, say /usr/local/news/spool/articles/uk/d-i-y, and appears to iterate through that directory building history and overview. Now that I'm several days into the process, I see that this morning it currently has /usr/local/news/spool/articles/uk/d-i-y open again and seems to be doing the same thing I noticed days ago. I assume it can't be duplicating history/overview records. There are some groups that have over a million articles, and I notice it keeping those open for a long time, so I
    remember them. I didn't expect to see those same groups being opened later in the same run for long periods of time.

    I am very unsure why it would happen during a rebuild of the history
    file. Looking at a comment, a modification of the storage.conf file is mentioned. Would it happen that the article stored in tradspool
    according to the rules of your initial storage.conf file has now a
    different rule?

    Thanks for the explanations. It is possible that the storage class changed in the past for a small number of groups.

    I now see a few of these errors:

    makehistory: cannot write overview data "@050000005106000000000000001B00000000@"
    makehistory: cannot write overview data "@050000005106000000000000001C00000000@"
    makehistory: cannot write overview data "@050000005106000000000000001D00000000@"
    makehistory: cannot write overview data "@050000005106000000000000002200000000@"
    makehistory: cannot write overview data "@050000005106000000000000002300000000@"

    I'm not really sure "why" it can't write the overview information as I don't see anything wrong with the articles and they are relatively recent.

    $ sm -c "@050000005106000000000000002300000000@"
    @050000005106000000000000002300000000@ method=tradspool class=0 ngnum=20742 artnum=0 file=/usr/local/news/spool/articles/de/alt/dateien/misc/35
    $ sm -c "@050000005106000000000000002200000000@"
    @050000005106000000000000002200000000@ method=tradspool class=0 ngnum=20742 artnum=0 file=/usr/local/news/spool/articles/de/alt/dateien/misc/34
    $ sm -c "@050000005106000000000000001B00000000@"
    @050000005106000000000000001B00000000@ method=tradspool class=0 ngnum=20742 artnum=0 file=/usr/local/news/spool/articles/de/alt/dateien/misc/27
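
When many of these errors accumulate, it can help to tally them per group instead of running "sm -c" on each token. The Python sketch below does that for makehistory messages like the ones above. It assumes, based only on the tokens shown in this thread, that hex digits 4 to 11 of a tradspool token hold the group number. The function names are hypothetical.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"@([0-9A-Fa-f]+)@")


def ngnum_of(token_hex):
    # Assumption: digits 0-3 are method and class, digits 4-11 the group number.
    return int(token_hex[4:12], 16)


def tally_overview_errors(lines):
    """Count 'cannot write overview data' messages per group number."""
    counts = Counter()
    for line in lines:
        if "cannot write overview data" not in line:
            continue
        match = TOKEN_RE.search(line)
        if match:
            counts[ngnum_of(match.group(1))] += 1
    return counts


log = [
    'makehistory: cannot write overview data "@050000005106000000000000001B00000000@"',
    'makehistory: cannot write overview data "@050000005106000000000000001C00000000@"',
    'makehistory: cannot write overview data "@050000005106000000000000001D00000000@"',
    'makehistory: cannot write overview data "@050000005106000000000000002200000000@"',
    'makehistory: cannot write overview data "@050000005106000000000000002300000000@"',
]
print(tally_overview_errors(log))
```

All five messages quoted above fall under group number 20742, the ngnum that "sm -c" reports for de.alt.dateien.misc.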

    They seem like normal articles to me and nothing stands out at first glance:

    $ cat /usr/local/news/spool/articles/de/alt/dateien/misc/35
    Path: spool1.usenet.blueworldhosting.com!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
    From: Marco Moock <mo01@posteo.de>
    Newsgroups: de.alt.dateien.misc
    Subject: Re: Wanze
    Date: Sun, 9 Jul 2023 19:03:18 +0200
    Organization: A noiseless patient Spider
    Lines: 10
    Message-ID: <u8ep8m$268n1$1@dont-email.me>
    References: <u8einn$25bdq$6@dont-email.me>
     <u8emh7$25vtg$1@dont-email.me>
    MIME-Version: 1.0
    Content-Type: text/plain; charset=UTF-8
    Content-Transfer-Encoding: quoted-printable
    Injection-Date: Sun, 9 Jul 2023 17:03:18 -0000 (UTC)
    Injection-Info: dont-email.me;
     posting-host="51a4c3270535c7dbb88d2b01f4ce400a";
     logging-data="2302689"; mail-complaints-to="abuse@eternal-september.org";
     posting-account="U2FsdGVkX1+JOa8skqOo+KzFXXog0DXO"
    Cancel-Lock: sha1:05wryGJ+LjAr7HuAwOtpktFjSLY
    Xref: spool1.usenet.blueworldhosting.com de.alt.dateien.misc:35

  • From Julien ÉLIE@21:1/5 to All on Wed Nov 1 09:09:36 2023
    Hi Jesse,

    With lsof it is hard to catch individual open files/links, but what I am noticing is that it will open a directory, say /usr/local/news/spool/articles/uk/d-i-y, and appears to iterate through that directory building history and overview. Now that I'm several days into the process, I see that this morning it currently has /usr/local/news/spool/articles/uk/d-i-y open again and seems to be doing the same thing I noticed days ago.

    Strange. I don't understand why makehistory would re-process the same newsgroup twice.
    Is it also listed several times in /usr/local/news/spool/tradspool.map?
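
A quick way to answer that question on a large map file is to count the names. This Python sketch assumes each line of tradspool.map has the form "group-name number", which is an assumption about the file format. The function name and the sample numbers for uk.d-i-y are hypothetical.

```python
from collections import Counter


def duplicated_groups(map_lines):
    """Return the group names that appear on more than one line."""
    names = []
    for line in map_lines:
        fields = line.split()
        if fields:  # skip blank lines
            names.append(fields[0])
    counts = Counter(names)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Hypothetical sample; a real check would read the file instead, e.g.
# open("/usr/local/news/spool/tradspool.map").
sample = [
    "uk.d-i-y 101",
    "misc.health.alternative 34741",
    "uk.d-i-y 202",
]
print(duplicated_groups(sample))
```

An empty result would mean that every group appears only once in the map.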


    makehistory: cannot write overview data

    They seem like normal articles to me and nothing stands out at first glance:

    $ cat /usr/local/news/spool/articles/de/alt/dateien/misc/35

    The headers look normal. I also do not know what's happening :(

    Looking at the possible reasons for this error, I assume you're not
    using ovgrouppat (inn.conf), as it may cause that in some cases.
    Otherwise, I don't see why the write could not be done. Is this error
    only appearing now, after makehistory has been running for almost 20 days? (seems an eternity)

    --
    Julien ÉLIE

    « Non omnia possumus omnes. » (Virgile)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Wed Nov 1 12:30:17 2023
    On Nov 1, 2023 at 3:09:36 AM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    Hi Jesse,

    With lsof it is hard to catch individual open files/links, but what I am
    noticing is that it will open a directory, say
    /usr/local/news/spool/articles/uk/d-i-y, and appears to iterate through that directory building history and overview. Now that I'm several days into the process, I see that this morning it currently has
    /usr/local/news/spool/articles/uk/d-i-y open again and seems to be doing the same thing I noticed days ago.

    Strange. I don't understand why makehistory would re-process the same newsgroup twice.
    Is it also listed several times in /usr/local/news/spool/tradspool.map?

    They are only listed once.

    makehistory: cannot write overview data

    They seem like normal articles to me and nothing stands out at first glance:

    $ cat /usr/local/news/spool/articles/de/alt/dateien/misc/35

    The headers look normal. I also do not know what's happening :(

    Looking at the possible reasons for this error, I assume you're not
    using ovgrouppat (inn.conf), as it may cause that in some cases.
    Otherwise, I don't see why the write could not be done. Is this error
    only appearing now, after makehistory has been running for almost 20 days? (seems an eternity)

    That option is commented out in my config. I've had to run makehistory twice due to storage issues the first time. This second run has been going for a little over 7 days and I think it is getting close to completing.

    I have about 20 of these errors now. They are all from the same group, de.alt.dateien.misc, and a few are of the same thread but otherwise I see nothing out of place.

  • From Julien ÉLIE@21:1/5 to All on Thu Nov 2 21:42:05 2023
    Hi Jesse,

    makehistory: cannot write overview data

    They seem like normal articles to me and nothing stands out at first glance:

    $ cat /usr/local/news/spool/articles/de/alt/dateien/misc/35

    I have about 20 of these errors now. They are all from the same group, de.alt.dateien.misc, and a few are of the same thread but otherwise I see nothing out of place.

    To investigate further, if you are able to run the test, you could
    install a separate instance of INN, with the paths in inn.conf naturally
    pointing to different locations than your production system, and
    patharticles pointing to a directory containing only the spool of de.alt.dateien.misc. Then run "makehistory -O -x" to see whether you
    still get the "cannot write overview data" error. If you do, the
    error is reproducible.
    Then try "makehistory -S -O -x" (-S instructs makehistory to write the
    overview data to stdout). Maybe we'll then understand what's wrong in the
    overview data generated by makehistory.

    That is only if you have a bit of time for it (no obligation), and in the hope that the
    issue is reproducible (it may not be...).

    --
    Julien ÉLIE

    « Aliud est celare, aliud tacere. »
