• cleanfeed help

    From Nigel Reed@21:1/5 to All on Mon Mar 14 21:08:07 2022
    hi all,

    When I installed cleanfeed, I pretty much kept the defaults. While
    poking around I decided to check the logfiles and there seems to be a
    lot of logged posts that seem like they should be legitimate.

    These for example:

    From foo@bar Thu Jan 1 00:00:01 1970
    INFO: EMP (md5)
    Content-Transfer-Encoding: 8bit
    Content-Type: text/plain; charset=ISO-8859-1
    Date: Sun, 07 Nov 2021 01:26:06 +0100
    From: Spijkeltje <rHBoJ0wNfNZAhDJubZ7mkbiQINQK0rXP-pCqT9Uo-pWuZdbsFP6LckFeQCHRojkgUV.hDJIkMWcbizrqeKjffaAd8yymWo7jTSKz9lOOhmXhh1HjimuWs-sDeqos9Jd9nmF4@spot.net>
    Injection-Date: Sun, 07 Nov 2021 01:26:06 +0100
    Injection-Info: reseller; mail-complaints-to="abuse@abavia.com"
    Lines: 3
    Message-ID: <82N4rKZ4Jk8KhaDYQhu08.0.VZ9YmLTp2GoHR2HYQ.BYsA@spot.net>
    Newsgroups: free.usenet
    Organization: www.abavia.com
    Path: weretis.net!feeder6.news.weretis.net!feeder8.news.weretis.net!news.mixmin.net!feed.abavia.com!abe002.abavia.com!abp003.abavia.com!reseller!not-for-mail
    References: <82N4rKZ4Jk8KhaDYQhu08@spot.net>
    Subject: Re: Bandari - Music For Relaxation - Vol. 09
    X-Newsreader: Spotnet 1.9.0.6
    X-No-Archive: Yes
    Xref: feeder6.news.weretis.net free.usenet:2926744
    Thanks.^M
    ^M
    .^M


    From foo@bar Thu Jan 1 00:00:01 1970
    INFO: EMP (md5)
    Content-Transfer-Encoding: 8bit
    Content-Type: text/plain; charset=ISO-8859-1
    Date: Sun, 07 Nov 2021 01:26:50 +0100
    From: Spijkeltje <rHBoJ0wNfNZAhDJubZ7mkbiQINQK0rXP-pCqT9Uo-pWuZdbsFP6LckFeQCHRojkgUV.jzapDQD-pZPbjojU321Cr-pJT7tyH2YQceiPjdQIb6eHKuKcKkjCQBXVHzmC9RkPg9@spot.net>
    Injection-Date: Sun, 07 Nov 2021 01:26:50 +0100
    Injection-Info: reseller; mail-complaints-to="abuse@abavia.com"
    Lines: 3
    Message-ID: <FXoIjoFB0pwx0ppYQ7iwe.0.B3mhtLf0A4wSR2HYQ.icoy@spot.net>
    Newsgroups: free.usenet
    Organization: www.abavia.com
    Path: weretis.net!feeder8.news.weretis.net!news.mixmin.net!feed.abavia.com!abe002.abavia.com!abp003.abavia.com!reseller!not-for-mail
    References: <FXoIjoFB0pwx0ppYQ7iwe@spot.net>
    Subject: Re: Blasmusik aus Tirol - Musikkapellen aus Nord-, Ost- und Südtirol
    X-Newsreader: Spotnet 1.9.0.6
    X-No-Archive: Yes
    Xref: feeder8.news.weretis.net free.usenet:535364
    Thanks.^M
    ^M
    .^M



    I guess the md5 check doesn't take into account that the poster is
    actually responding to two different posts. How can I avoid blocking
    things like this, other than by disabling the md5 check?


    Another thing: I'm seeing a lot of rejects due to "439 Subject (for)"
    and I cannot for the life of me figure out what this is or why they're
    blocked, other than their having "for" in the subject, which I imagine
    is a pretty common occurrence.


    --
    End Of The Line BBS - Plano, TX
    telnet endofthelinebbs.com 23

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From 😉 Good Guy 😉@21:1/5 to All on Wed Mar 16 18:00:00 2022
    On 15/03/2022 02:08, Nigel Reed wrote:

    > Another thing, I'm seeing a lot of rejects due to "439 Subject (for)"
    > and I cannot for the life of me figure what this is or why they're
    > blocked,

    Just disable "CleanFeed" and see if the problem disappears!!.

    There is no point in having any filters on a newsgroup because
    they break the smooth flow of discussions. I have seen many
    neo-nazis servers and mafia supported Mussolini servers blocking
    almost everything they dislike including "links" to images that
    solve people's windows problems. They have effectively destroyed
    the newsgroups and the posts you see on MixMin are not available
    on their servers and people wonder what happened.

    Stop Putin
    Ukraine Under Attack

    --
    "Similar to Windows 11 Home edition, Windows 11 Pro edition now
    requires internet connectivity during the initial device setup
    (OOBE) only. If you choose to setup device for personal use, MSA
    will be required for setup as well. You can expect Microsoft
    Account to be required in subsequent WIP flights."

    "Now this is not the end. It is not even the beginning of the end. But
    it is, perhaps, the end of the beginning "

  • From Julien ÉLIE@21:1/5 to All on Fri Mar 18 19:03:30 2022
    Hi Nigel,

    > I guess the md5 check doesn't take into account that the poster is
    > actually responding to two different posts. How can I avoid blocking
    > things like this, other than by disabling the md5 check?
    >
    > References: <FXoIjoFB0pwx0ppYQ7iwe@spot.net>
    > [...]
    > Thanks.^M

    The MD5 check should not have been performed on the 2 examples you gave.
    In your Cleanfeed script, what's the value of:

    md5_skips_followups => 1, # avoid MD5 check on articles with References?

    The default (1) is to *not* perform MD5 checks in these cases.


    I've checked how this parameter is used, and the code seems correct to
    me (it should really not have rejected these articles).
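
    To illustrate what the setting changes, here is a minimal Python sketch of a body-hash check that can skip follow-ups. It is not Cleanfeed's code (Cleanfeed is written in Perl and its EMP logic is more involved); the class name `BodyHashFilter`, the `max_copies` threshold, and the returned reason string are assumptions made for the example. Only the option name `md5_skips_followups` comes from the thread.

```python
import hashlib
from collections import Counter


class BodyHashFilter:
    """Toy body-hash filter (hypothetical; not Cleanfeed's implementation)."""

    def __init__(self, max_copies=3, md5_skips_followups=True):
        self.max_copies = max_copies  # hypothetical threshold
        self.md5_skips_followups = md5_skips_followups
        self.seen = Counter()  # MD5 digest -> number of bodies seen

    def check(self, headers, body):
        """Return a rejection reason, or None to accept the article."""
        # With the option on, articles carrying References are exempt.
        if self.md5_skips_followups and headers.get("References"):
            return None
        digest = hashlib.md5(body.encode("utf-8")).hexdigest()
        self.seen[digest] += 1
        if self.seen[digest] > self.max_copies:
            return "EMP (md5)"
        return None


# Two short "Thanks." replies to two different articles.
first = {"References": "<82N4rKZ4Jk8KhaDYQhu08@spot.net>"}
second = {"References": "<FXoIjoFB0pwx0ppYQ7iwe@spot.net>"}

strict = BodyHashFilter(max_copies=1, md5_skips_followups=False)
strict_results = [strict.check(first, "Thanks.\r\n"),
                  strict.check(second, "Thanks.\r\n")]

lenient = BodyHashFilter(max_copies=1, md5_skips_followups=True)
lenient_results = [lenient.check(first, "Thanks.\r\n"),
                   lenient.check(second, "Thanks.\r\n")]
```

    With the option off, the second identical body trips the hash counter even though it answers a different article; with it on, both replies pass because the check never runs for them.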



    > Another thing: I'm seeing a lot of rejects due to "439 Subject (for)"
    > and I cannot for the life of me figure out what this is or why they're
    > blocked, other than their having "for" in the subject, which I imagine
    > is a pretty common occurrence.

    That's pretty strange.
    What do you have in the "bad_subject" file? (This file is located in
    the $config_dir directory set at the beginning of the Cleanfeed script.)

    The default only contains "simpbiz.software", which will reject all
    articles containing that string in their Subject header field.


    https://raw.githubusercontent.com/crooks/cleanfeed/master/samples/bad_subject

    Seems like "for" appears uncommented in that file. You should
    investigate its contents.

    --
    Julien ÉLIE

    « Les propositions mathématiques sont reçues comme vraies parce que
    personne n'a intérêt qu'elles soient fausses. » (Montesquieu)

  • From Nigel Reed@21:1/5 to iulius@nom-de-mon-site.com.invalid on Mon Mar 21 01:55:25 2022
    On Fri, 18 Mar 2022 19:03:30 +0100
    Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> wrote:

    > Hi Nigel,
    >
    > md5_skips_followups => 1, # avoid MD5 check on articles with
    > References?
    >
    > The default (1) is to *not* perform MD5 checks in these cases.

    OK. That is set to 0 here in my cleanfeed.local file. I don't recall
    ever changing it but I'll go ahead and change it to 1.




    > I've checked how this parameter is used, and the code seems correct
    > to me (it should really not have rejected these articles).

    >> Another thing: I'm seeing a lot of rejects due to "439 Subject
    >> (for)" and I cannot for the life of me figure out what this is or
    >> why they're blocked, other than their having "for" in the subject,
    >> which I imagine is a pretty common occurrence.

    > That's pretty strange.
    > What do you have in the "bad_subject" file? (this file is located in
    > the $config_dir directory set at the beginning of the Cleanfeed
    > script)


    Other than comments

    simpbiz.software
    Buy ketamine for depression


    > Seems like "for" appears uncommented in that file. You should
    > investigate its contents.

    Ah, that is where the "for" is. I'm obviously using it incorrectly. I
    thought PCREs would use a single space as a space but I guess I need to
    use period (any character) or I guess \s instead.





    --
    End Of The Line BBS - Plano, TX
    telnet endofthelinebbs.com 23

  • From Julien ÉLIE@21:1/5 to All on Mon Mar 21 20:01:08 2022
    Hi Nigel,

    >> md5_skips_followups => 1, # avoid MD5 check on articles with
    >> References?
    >>
    >> The default (1) is to *not* perform MD5 checks in these cases.

    > OK. That is set to 0 here in my cleanfeed.local file. I don't recall
    > ever changing it but I'll go ahead and change it to 1.

    In fact, the cleanfeed.local sample sets it to 0:
    https://github.com/crooks/cleanfeed/blob/master/cleanfeed.local.sample

    Normally, you don't need to change the defaults (and therefore do not
    need a cleanfeed.local at all).
    Make sure all the other changes in that file fit your needs (notably
    the special filters for Google Groups posts, and the fact that all
    cancels are blocked, even those using Cancel-Lock).

    I've opened an issue to change the sample file (or at least comment out
    the lines).
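
    One way to review what a local file changes is to compare it with the defaults setting by setting. The Python sketch below only illustrates that idea: the helper `changed_settings` and the key `example_option` are hypothetical, and the real settings live in Perl hashes inside the Cleanfeed script and cleanfeed.local. Only `md5_skips_followups` and its two values come from this thread.

```python
def changed_settings(defaults, local):
    """Return {name: (default_value, local_value)} for every overridden setting."""
    return {
        name: (defaults.get(name), value)
        for name, value in local.items()
        if defaults.get(name) != value
    }


# md5_skips_followups is from the thread; example_option is hypothetical.
defaults = {"md5_skips_followups": 1, "example_option": 1}
local = {"md5_skips_followups": 0, "example_option": 1}

overrides = changed_settings(defaults, local)
for name, (default_value, local_value) in overrides.items():
    print(f"{name}: default={default_value} local={local_value}")
```

    Only the settings whose local value differs from the default are reported, which keeps the review focused on what the sample file actually changes.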


    >> What do you have in the "bad_subject" file? (this file is located in
    >> the $config_dir directory set at the beginning of the Cleanfeed
    >> script)

    > Other than comments
    >
    > simpbiz.software
    > Buy ketamine for depression

    >> Seems like "for" appears uncommented in that file. You should
    >> investigate its contents.

    > Ah, that is where the "for" is. I'm obviously using it incorrectly. I
    > thought PCREs would use a single space as a space but I guess I need to
    > use period (any character) or I guess \s instead.

    I agree the documentation is not clear. Each line of the file is
    expected to be a Perl regexp. "Buy ketamine for depression" is normally
    one, but when parsing the file in read_file(), Cleanfeed splits each
    line on white space, so the whole regexp becomes:

    (simpbiz.software|Buy|ketamine|for|depression)

    and therefore matches "for".

    Though not tested, I believe using "Buy\sketamine\sfor\sdepression" will
    work.
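
    The effect described above can be reproduced with a small Python simulation. This is a sketch of the behaviour as explained in this post, not Cleanfeed's actual read_file() (which is Perl); the function name `build_pattern`, the handling of `#` comment lines, and the sample subject are assumptions for illustration.

```python
import re


def build_pattern(lines):
    """Split each non-comment line on white space and OR all the pieces together."""
    tokens = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        tokens.extend(line.split())  # a literal space ends a token here
    return re.compile("(" + "|".join(tokens) + ")")


# The file as Nigel had it: the space-separated phrase falls apart into words.
broken = build_pattern(["simpbiz.software", "Buy ketamine for depression"])

# The suggested fix: \s keeps the phrase in one piece, since it contains no literal space.
fixed = build_pattern(["simpbiz.software", r"Buy\sketamine\sfor\sdepression"])

subject = "Looking for advice on INN"  # hypothetical, harmless subject
broken_hit = broken.search(subject)
fixed_hit = fixed.search(subject)
spam_hit = fixed.search("Buy ketamine for depression")
```

    Here `broken` flags the harmless subject because the bare word "for" became an alternative of its own, while `fixed` only matches the full phrase.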

    --
    Julien ÉLIE

    « Tous les chemins mènent à rame… » (Mouléfix)
