• Using innfeed to send a batch instead of innxmit?

    From Jesse Rehmer@21:1/5 to All on Sat May 13 06:03:04 2023
    It seems like it should be possible to use the same batch file that you generate for innxmit (when feeding all articles to another server, for
    example) with innfeed, but I can't seem to get this to work as expected. I was hoping to leverage innfeed's multiple connections to get through this process faster. Streaming 290,000,000 articles via a single connection is s-l-o-w.

    I created a custom innfeed configuration file (a copy of the stock innfeed.conf with the PID file changed and the remote peer added) and pointed innfeed at it. I placed the batch file in /usr/local/news/spool/innfeed/peername and started innfeed using the custom configuration file. It seems to read through the entire batch file, but it only offers a small number of articles to the remote server, and then the input file is removed.

    I'm not sure why it isn't going through the entire batch. Any ideas?

    This is what is logged:

    May 13 00:29:27 spool1 innfeed[28888]: ME starting at Sat May 13 00:29:27 2023 (INN 2.7.0)
    May 13 00:29:27 spool1 innfeed[28888]: loading /usr/local/news/etc/spoolfeed.conf
    May 13 00:29:30 spool1 innfeed[28888]: gatekeeper new hand-prepared backlog file
    May 13 00:29:30 spool1 innfeed[28888]: gatekeeper grabbing external tape file
    May 13 00:29:30 spool1 innfeed[28888]: gatekeeper:0 connected
    May 13 00:29:30 spool1 innfeed[28888]: gatekeeper remote MODE STREAM
    May 13 00:31:58 spool1 innfeed[15156]: ME time 600511 idle 600511(26) blstats 0(26) stsfile 0(0) write 0(0)
    May 13 00:36:28 spool1 innfeed[28888]: ME received shutdown signal
    May 13 00:36:28 spool1 innfeed[28888]: gatekeeper checkpoint seconds 418 offered 17 accepted 0 refused 17 rejected 0 missing 0 accsize 0 rejsize 0 spooled 0 on_close 0 unspooled 17 deferred 0/0.0 requeued 0 queue 0.0/10:100,0,0,0,0,0
    May 13 00:36:28 spool1 innfeed[28888]: gatekeeper final seconds 418 offered 17 accepted 0 refused 17 rejected 0 missing 0 accsize 0 rejsize 0 spooled 0 on_close 0 unspooled 17 deferred 0/0.0 requeued 0 queue 0.0/10:100,0,0,0,0,0
    May 13 00:36:28 spool1 innfeed[28888]: gatekeeper:0 checkpoint seconds 418 offered 17 accepted 0 refused 17 rejected 0 accsize 0 rejsize 0
    May 13 00:36:28 spool1 innfeed[28888]: gatekeeper:0 final seconds 418 offered 17 accepted 0 refused 17 rejected 0 accsize 0 rejsize 0
    May 13 00:36:28 spool1 innfeed[28888]: gatekeeper global seconds 418 offered 17 accepted 0 refused 17 rejected 0 missing 0 accsize 0 rejsize 0 spooled 0 unspooled 17
    May 13 00:36:28 spool1 innfeed[28888]: ME global seconds 421 offered 17 accepted 0 refused 17 rejected 0 missing 0 accsize 0 rejsize 0 spooled 0 unspooled 17
    May 13 00:36:28 spool1 innfeed[28888]: ME finishing at Sat May 13 00:36:28 2023

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jesse Rehmer@21:1/5 to jesse.rehmer@blueworldhosting.com on Sat May 13 06:59:24 2023
    So I see that innfeed wants:

    @token@ <message-id>

    The history file does not contain the Message-ID. I am assuming that when I cancelled innxmit, it rewrote the batch file that I then used for innfeed, which is why the first 17 lines have a Message-ID: I believe this comes from innxmit doing the lookup prior to sending and writing the result out to the batch when it exits...

    How would I go about creating a batch in the format innfeed needs from my history file?

  • From Julien ÉLIE@21:1/5 to All on Sun May 14 21:44:42 2023
    Hi Jesse,

    I was hoping to leverage innfeed's multiple connections to get through this process faster. Streaming 290,000,000 articles via a single connection is s-l-o-w.

    Couldn't you run several innxmit instances in parallel?
    If you're worried about keeping the feed chronological, you can split the file containing the tokens to feed into several interleaved parts. For
    instance, with 4 innxmit instances, one file gets tokens 1, 5, 9, etc.,
    another file gets tokens 2, 6, 10, etc.
    We may assume they will be fed at a similar pace.
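
    The interleaved split described above could be sketched as follows. This is only an illustration, assuming a batch file with one entry per line; the function name split_interleaved and the .0, .1, ... shard suffixes are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: split a batch file into N interleaved shards.
# Line 1 goes to BATCH.0, line 2 to BATCH.1, ..., line N+1 to BATCH.0 again.
split_interleaved() {
    batch=$1
    shards=$2
    awk -v n="$shards" -v prefix="$batch" '
        { out = prefix "." ((NR - 1) % n); print > out }
    ' "$batch"
}

# Example: split_interleaved /path/to/batch 4
# then feed each of the four shards with its own innxmit instance.
```

    Each shard keeps the original relative order of its lines, so parallel feeders working at a similar pace stay roughly chronological.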


    So I see that innfeed wants:

    @token@ <message-id>

    How would I go about creating a batch in the format innfeed needs from my history file?

    Last time we spoke about that, running "sm -H '@token@'" to retrieve the Message-ID was not fast enough (especially if you have to run it on
    millions of articles). I don't know how else it could be done from the command line.
    Alternatively, innfeed would need to be modified to look up the Message-ID,
    when it is not given, in a similar way to innxmit.
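
    For a small batch, the per-token lookup mentioned above might look like the sketch below. It assumes a file with one storage token per line and that sm is in the PATH; the helper names extract_msgid and add_real_msgids are invented here. It starts one sm process per token, so it remains far too slow for hundreds of millions of articles.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first Message-ID header field
# found in the article headers read from standard input.
extract_msgid() {
    awk '
        { sub(/\r$/, "") }          # tolerate CRLF line endings
        /^$/ { exit }               # blank line: end of headers
        tolower($1) == "message-id:" { print $2; exit }
    '
}

# Hypothetical helper: turn "@token@" lines into "@token@ <message-id>" lines.
# Tokens whose article cannot be retrieved are skipped.
add_real_msgids() {
    while IFS= read -r token; do
        mid=$(sm -H "$token" 2>/dev/null | extract_msgid)
        if [ -n "$mid" ]; then
            printf '%s %s\n' "$token" "$mid"
        fi
    done < "$1"
}

# Example: add_real_msgids tokens-only-batch > innfeed-batch
```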

    innfeed is normally run from a newsfeeds entry, and innd already has the
    Message-ID at that point, so passing it along with the storage token saves
    a lookup. innfeed needs it because of the NNTP protocol (IHAVE <mid>). That's
    certainly why innfeed expects that format.

    --
    Julien ÉLIE

    « One must never speak curtly to a Numidian. » (Astérix)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Tue May 16 18:13:27 2023
    On May 14, 2023 at 2:44:42 PM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    Hi Jesse,

    Hey Julien!


    Couldn't you run several innxmit instances in parallel?
    If you're worrying about a chronological feed, you may split the file containing your tokens to feed into several interleaved parts. For
    instance with 4 innxmit instances, one file with tokens 1, 5, 9, etc.
    another file with tokens 2, 6, 10, etc.
    We may assume they will be fed at a similar pace.

    I am thinking about how to do this sanely, but I see two problems. The primary one is that
    I'm not sure how to generate the interleaved parts (shards).

    The second is that innxmit sometimes dies, either at random or for various
    reasons (the server being fed closes the connection, etc.), and it is difficult to monitor multiple instances and ensure they stay running to keep the chronology relatively sane.

    Last time we spoke about that, running "sm -H '@token@'" to retrieve the Message-ID was not fast enough (especially if you have to run it on
    millions of articles). I don't know how it could be done otherwise in command-line.
    Otherwise, innfeed should be modified to lookup the Message-ID, when not given, in a similar way as innxmit does.

    As innfeed is called in a newsfeeds feed, and innd has the Message-ID at
    that time, it saves the lookup to give it along with the storage token. innfeed needs it because of the NNTP protocol (IHAVE <mid>). That's certainly why innfeed expects that format.

    Sorry if we've discussed this and I forgot. Since I got COVID in 2021 my
    memory is like Swiss cheese: full of holes. Everything prior to that seems relatively intact and easy to retrieve, but everything from the past couple of years
    is a mess. :)

  • From Julien ÉLIE@21:1/5 to All on Tue May 16 20:53:02 2023
    Hi Jesse,

    So I see that innfeed wants:

    @token@ <message-id>

    And the history file does not contain the Message-ID, and I am assuming when I
    cancelled innxmit and it rewrote the batch file that I used for innfeed, the first 17 lines have the Message-ID, but I believe this is coming from innxmit doing the lookup prior to sending and writing this out in the batch when it exits...

    How would I go about creating a batch in the format innfeed needs from my history file?

    Hmm, I'm wondering whether innfeed really makes use of the <message-id>
    innd gives, apart from the IHAVE <message-id> exchange with the remote peer. I've not tested it, but maybe you could try a batch with:

    @token1@ <1@dumbid>
    @token2@ <2@dumbid>
    @token3@ <3@dumbid>

    where @tokenX@ are the tokens you want to send, each followed by an
    arbitrary Message-ID (but a different one for each token).
    Are the 3 articles fed OK?

    If the remote server is INN, it won't mind receiving an article with a Message-ID different than the one in the NNTP command used to send it.

    This should really be tried, because if it works I think your issue is
    almost solved :)

    --
    Julien ÉLIE

    « – By Poseidon! What a marvel!!!
    – By Neptune! What cheek! » (Astérix)

  • From Russ Allbery@21:1/5 to iulius@nom-de-mon-site.com.invalid on Tue May 16 12:22:58 2023
    Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> writes:

    Hmm, I'm wondering whether innfeed really makes use of the <message-id>
    innd gives, apart from the IHAVE <message-id> exchange with the remote
    peer. I've not tested it, but maybe you could try a batch with:

    @token1@ <1@dumbid>
    @token2@ <2@dumbid>
    @token3@ <3@dumbid>

    where @tokenX@ are the tokens you want to send, followed with an
    arbitrary Message-ID (yet different for each token). Are the 3 articles
    fed OK?

    I haven't checked the source code, but I would expect innfeed to use the message ID in the CHECK command to avoid reading the whole article from
    disk in the (very common) case that the remote server declines the CHECK.

    If the remote server is INN, it won't mind receiving an article with a Message-ID different than the one in the NNTP command used to send it.

    This remains true, of course. You'll put a bunch of bogus message IDs in
    the remote server's conflict cache, so it's not exactly the friendliest
    thing to do, but for cooperating servers it might work.

    That said, the message ID is also given on the TAKETHIS command line, and
    I'm not sure if innfeed gets that from the article or the input batch. If
    from the input batch, a server would be entirely within its rights to
    reject a message where the message ID on the TAKETHIS command line was different than the message ID in the article. (I'm not sure if any do.)

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Julien ÉLIE@21:1/5 to All on Tue May 16 22:12:37 2023
    Hi Russ,

    Hmm, I'm wondering whether innfeed really makes use of the <message-id>
    innd gives, apart from the IHAVE <message-id> exchange with the remote
    peer. I've not tested it, but maybe you could try a batch with:

    @token1@ <1@dumbid>
    @token2@ <2@dumbid>
    @token3@ <3@dumbid>

    where @tokenX@ are the tokens you want to send, followed with an
    arbitrary Message-ID (yet different for each token). Are the 3 articles
    fed OK?

    I haven't checked the source code, but I would expect innfeed to use the message ID in the CHECK command to avoid reading the whole article from
    disk in the (very common) case that the remote server declines the CHECK.

    I've just quickly had a look, and I do not see any parsing of headers.
    For CHECK, IHAVE and TAKETHIS, the Message-ID used is taken from artMsgId(article) where article->msgid has been set with newArticle()
    using the input from innd.

    So this kludge may really work for Jesse. I look forward to reading
    your results!



    If the remote server is INN, it won't mind receiving an article with a
    Message-ID different than the one in the NNTP command used to send it.

    This remains true, of course. You'll put a bunch of bogus message IDs in
    the remote server's conflict cache, so it's not exactly the friendliest
    thing to do, but for cooperating servers it might work.

    Yes. Naturally, I would not recommend doing that in any use case other than
    the one Jesse described here.



    That said, the message ID is also given on the TAKETHIS command line, and
    I'm not sure if innfeed gets that from the article or the input batch.

    It seems it does not get it from the article: the Message-ID comes from the input batch.



    If from the input batch, a server would be entirely within its rights
    to reject a message where the message ID on the TAKETHIS command line
    was different than the message ID in the article. (I'm not sure if
    any do.)

    Yes, it would totally be within its rights.

    FWIW, RFC 4644 (streaming extension) explicitly says that the first
    parameter of the responses to CHECK and TAKETHIS *MUST* be the
    message-id provided by the client as the parameter to these commands.
    So for TAKETHIS, even if a different Message-ID is found in the article,
    it is not the one to use in the response. This lets the sending site
    correctly clear the bogus IDs it sent.

    --
    Julien ÉLIE

    « A vacation is having nothing to do and all day to do it in. »
    (Robert Orben)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Tue May 16 22:05:02 2023
    On May 16, 2023 at 3:12:37 PM CDT, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    Hi Russ,

    Hmm, I'm wondering whether innfeed really makes use of the <message-id>
    innd gives, apart from the IHAVE <message-id> exchange with the remote
    peer. I've not tested it, but maybe you could try a batch with:

    @token1@ <1@dumbid>
    @token2@ <2@dumbid>
    @token3@ <3@dumbid>

    where @tokenX@ are the tokens you want to send, followed with an
    arbitrary Message-ID (yet different for each token). Are the 3 articles fed OK?

    I haven't checked the source code, but I would expect innfeed to use the
    message ID in the CHECK command to avoid reading the whole article from
    disk in the (very common) case that the remote server declines the CHECK.

    I've just quickly had a look, and I do not see any parsing of headers.
    For CHECK, IHAVE and TAKETHIS, the Message-ID used is taken from artMsgId(article) where article->msgid has been set with newArticle()
    using the input from innd.

    So this kludge may really work for Jesse. I look forward to reading
    your results!

    Indeed it does with a small test:

    Batch file:

    @05000000940900000000001C496C00000000@ <1@dumbid>
    @0500000021AE0000000000020A2100000000@ <2@dumbid>
    @0500000002D8000000000006802500000000@ <3@dumbid>

    Messages accepted on the remote server with actual Message-IDs:

    May 16 17:03:08.665 + news.blueworldhosting.com <u3n1li$20fql$3@dont-email.me> 3641
    May 16 17:03:08.787 + news.blueworldhosting.com <%RD7M.3041094$vBI8.1207177@fx15.iad> 2474
    May 16 17:03:08.791 + news.blueworldhosting.com <ac766a8d-a700-4b72-80d4-6fd11376fe9dn@googlegroups.com> 6717

  • From Jesse Rehmer@21:1/5 to jesse.rehmer@blueworldhosting.com on Wed May 17 03:21:33 2023
    I ended up using this to add bogus Message-IDs to the batch file:

    awk '{print $0" <" NR "@1>"}' original-batch > innfeed-batch

    This seems to be working great, moving around 95,000 articles every ten
    minutes according to the innfeed logs, but I'm seeing a small number of occurrences of the following message in /var/log/news/news on the destination server:

    (null) 439 Bad "Message-ID" header field

    There are 101 occurrences of this error out of a few hundred thousand articles fed. Should I be concerned about that, and is there any way for me to find out which articles are problematic?

    Source server is INN 2.7.0 and destination is 2.7.1. Apart from the overview method, the servers have nearly identical configurations.

  • From Julien ÉLIE@21:1/5 to All on Wed May 17 20:03:39 2023
    Hi Jesse,

    Messages accepted on the remote server with actual Message-IDs

    That's good news :)


    This seems to be working great, moving around 95,000 articles every ten minutes according to innfeed logs, but I'm seeing some low occurrence of the following message in /var/log/news/news on the destination server:

    (null) 439 Bad "Message-ID" header field

    There are 101 occurrences of this error out of a few hundred thousand articles
    fed. Should I be concerned about that, and is there any way for me to find out
    which articles are problematic?

    Source server is INN 2.7.0 and destination is 2.7.1. Besides difference in overview method, the servers have a pretty identical configuration.

    Strange.
    It would be worth investigating a few of these Message-IDs.

    If your remote peer has the following logs:

    <mid1> accepted
    (null) 439 Bad "Message-ID" header field
    <mid3> accepted

    You may try to run on your feeding peer:

    grephistory '<mid1>'

    It will give the storage token of that article.
    Then look in your innfeed-batch file for that token, and retrieve the
    articles for the next few tokens after it with an "sm '@token@'" command.
    One of them should be the rejected article (the one that would have been
    <mid2>), with <mid3> following it.
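
    The lookup procedure above might be scripted roughly as follows. This is a sketch: tokens_after is a made-up helper name, the batch is assumed to hold one "@token@ <message-id>" entry per line, and the number of following tokens to inspect is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: print the COUNT tokens that follow TOKEN in BATCH.
# Only the first field of each line (the storage token) is printed.
tokens_after() {
    batch=$1
    token=$2
    count=$3
    awk -v t="$token" -v c="$count" '
        found && c > 0  { print $1; c-- }
        found && c == 0 { exit }
        $1 == t         { found = 1 }
    ' "$batch"
}

# Example, on the feeding server, where <mid1> is the last Message-ID
# accepted before the 439 error:
#   token=$(grephistory '<mid1>')
#   for t in $(tokens_after innfeed-batch "$token" 5); do sm -H "$t"; done
```

    With several connections feeding the same batch, the rejected article is not necessarily the very next token, which is why a handful of following tokens are inspected rather than just one.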

    Do you see anything special with that article?

    Were they first received with INN 2.7.0 or a previous version?

    --
    Julien ÉLIE

    « The real voyage of discovery consists not in seeking new
    landscapes, but in having new eyes. » (Marcel Proust)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Thu May 18 18:34:20 2023
    Now that I have a repeatable process, I'm going to clean up the target host, run a fresh batch, and take the time to dig into this more. I'm not sure whether the articles will be received in exactly the same order, since I'm using 20 connections against the batch.

    Good news though, things are really moving now:

    May 18 13:14:02 spool1 innfeed[83665]: gatekeeper checkpoint seconds 601 offered 220218 accepted 220223 refused 0 rejected 0 missing 0 accsize 564395089 rejsize 0 spooled 0 on_close 0 unspooled 220223 deferred 0/0.0 requeued 0 queue 0.0/10:100,0,0,0,0,0
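
    To turn a checkpoint line like this one into a throughput figure, something along these lines could be used. It is a rough sketch that relies only on the "seconds N" and "accepted N" pairs visible in the log line; the function name checkpoint_rate is made up.

```shell
#!/bin/sh
# Hypothetical helper: read innfeed log lines on standard input and print,
# for each "checkpoint" line, the number of accepted articles per second.
checkpoint_rate() {
    awk '
        / checkpoint / {
            secs = 0; acc = 0
            for (i = 1; i < NF; i++) {
                if ($i == "seconds")  secs = $(i + 1)
                if ($i == "accepted") acc  = $(i + 1)
            }
            if (secs > 0) printf "%.1f\n", acc / secs
        }
    '
}

# Example: grep 'gatekeeper checkpoint' /path/to/news.notice | checkpoint_rate
```

    For the line above (220223 articles accepted in 601 seconds) this prints 366.4, that is, roughly 366 articles per second.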

    I do have a question though, I see some entries are being made to /usr/local/news/spool/innfeed/batch.output. Innfeed's manpage makes a short reference to this file in that it is where entries go that could not be processed for some reason, but it does not explain what innfeed does with this .output file. Does it ever get reprocessed automatically?

  • From Julien ÉLIE@21:1/5 to All on Thu May 18 21:55:01 2023
    Hi Jesse,

    Now I have a repeatable process

    That's great!


    I'm not sure if the ordering of articles received will be exact since
    I'm using 20 connections against the batch?

    The ordering will not be exact, as nothing guarantees that the target
    host accepts the articles in the same order in which they began to be fed.
    Articles will roughly be in order, though.


    I do have a question though, I see some entries are being made to /usr/local/news/spool/innfeed/batch.output. Innfeed's manpage makes a short reference to this file in that it is where entries go that could not be processed for some reason, but it does not explain what innfeed does with this .output file. Does it ever get reprocessed automatically?

    Yes, they are reprocessed automatically. This is parameterized with the backlog-* keys in innfeed.conf.
    Many thanks for this question! I agree the innfeed(8) manual page should
    say they are processed. The current wording gives the impression that
    they are not. I'll add a sentence about that.

    FYI, every backlog-rotate-period seconds, something like this happens:

      if [ ! -f PEER.input ]; then
        if [ -f PEER ]; then
          mv PEER PEER.input
        elif [ -f PEER.output ]; then
          mv PEER.output PEER
        fi
      fi

    --
    Julien ÉLIE

    « Cross the river in a crowd, and the crocodile will not eat you. »
    (Malagasy proverb)

  • From Julien ÉLIE@21:1/5 to All on Sun Jul 9 09:45:10 2023
    Fixing my previous message:

    I do have a question though, I see some entries are being made to
    /usr/local/news/spool/innfeed/batch.output. Innfeed's manpage makes a
    short reference to this file in that it is where entries go that could
    not be processed for some reason, but it does not explain what innfeed
    does with this .output file. Does it ever get reprocessed automatically?

    Yes, they are reprocessed automatically.  This is parameterized with the backlog* keys in innfeed.conf.
    Many thanks for this question! I agree the innfeed(8) manual page should
    say they are processed.  We are under the impression they are not, with
    the current wording.  I'll add a sentence about that.

    FYI, every backlog-rotate-period seconds, something like that happens:

      if [ ! -f PEER.input ]; then
        if [ -f PEER ]; then
          mv PEER PEER.input
        elif [ -f PEER.output ]; then
          mv PEER.output PEER
        fi
      fi

    The second move should have read:

      mv PEER.output PEER.input
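
    With that correction applied, the rotation can be written as a small runnable function. This is only a shell rendering of the logic described in this message, not innfeed's actual code; rotate_backlog and its two arguments are names made up for the example.

```shell
#!/bin/sh
# Hypothetical shell rendering of the rotation step.
# DIR is the backlog directory, PEER the peer name.
rotate_backlog() {
    dir=$1
    peer=$2
    if [ ! -f "$dir/$peer.input" ]; then
        if [ -f "$dir/$peer" ]; then
            # A hand-prepared batch takes priority.
            mv "$dir/$peer" "$dir/$peer.input"
        elif [ -f "$dir/$peer.output" ]; then
            # Otherwise, previously deferred entries are read again.
            mv "$dir/$peer.output" "$dir/$peer.input"
        fi
    fi
}

# Example: rotate_backlog /usr/local/news/spool/innfeed peername
```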


    I've re-checked what innfeed does and updated the documentation.
    Is it now clear enough, or should something still be clarified?


    [innfeed]
    The configuration file is described in innfeed.conf(5). The -c option
    can be used to specify a different file, and -b to specify a different
    backlog directory. The backlog-* keys in the configuration file
    parameterize the behaviour of backlogging. For each peer (say, "foo"),
    innfeed manages up to 4 files in the backlog directory:

    • A foo.lock file, which prevents other instances of innfeed from
    interfering with this one.

    • A foo.input file which has old article information innfeed is reading
    for re-processing.

    • A foo.output file where innfeed is writing information on articles
    that could not be processed (normally due to a slow or blocked peer).
    Every backlog-rotate-period seconds, innfeed checks whether it is not
    empty, and, if no foo file exists and foo.input is empty, will then
    rename foo.output to foo.input and start reading from it.

    • A foo file that is never created by innfeed, but if innfeed notices
    it when checking every backlog-newfile-period seconds, it will rename
    it to foo.input at the next opportunity (every backlog-rotate-period
    seconds if foo.input is empty) and will start reading from it. This
    lets you create a batch file and put it in a place where innfeed will
    find it.
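
    Dropping a hand-prepared batch where innfeed will find it could look like the sketch below. The helper name stage_batch is invented; the only things taken from the text above are that the file must be named after the peer and live in the backlog directory. Copying under a temporary name first and renaming afterwards is a precaution added here so that a half-written file is never visible under the peer's name.

```shell
#!/bin/sh
# Hypothetical helper: install BATCH as the hand-prepared backlog file
# for PEER in the backlog directory DIR, refusing to overwrite an
# existing one.
stage_batch() {
    batch=$1
    dir=$2
    peer=$3
    if [ -e "$dir/$peer" ]; then
        echo "stage_batch: $dir/$peer already exists, not overwriting" >&2
        return 1
    fi
    staging="$dir/.$peer.tmp.$$"
    cp "$batch" "$staging" && mv "$staging" "$dir/$peer"
}

# Example: stage_batch innfeed-batch /usr/local/news/spool/innfeed peername
```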


    [innfeed.conf]
    backlog-highwater
    This key requires a positive integer value and defaults to 5. It
    specifies how many articles should be kept on the backlog file
    queue before starting to write new entries to disk.

    backlog-ckpt-period
    This key requires a positive integer value and defaults to 30. It
    specifies how many seconds elapse between checkpoints and rewrites
    of the input backlog file. Too small a number will mean frequent
    disk accesses; too large a number will mean after a crash, innfeed
    will re-offer more already-processed articles than necessary.

    backlog-newfile-period
    This key requires a positive integer value and defaults to 600. It
    specifies how many seconds elapse before each check for externally
    generated backlog files that are to be picked up and processed.

    backlog-rotate-period
    This key requires a positive integer value and defaults to 60. It
    specifies how many seconds elapse before innfeed moves an
    externally generated backlog file to the input backlog file (if
    backlog-newfile-period seconds have elapsed) or in the absence of
    such a file, moves the output backlog file to the input backlog
    file. No moves occur if the input backlog file is not empty.



    --
    Julien ÉLIE

    « You know, ideas are in the air. All it takes is for someone to talk
    to you about them from too close, and you catch them! » (Raymond
    Devos)
