• Getting spamassassin and clamav as inn filters

    From The Doctor@21:1/5 to All on Mon Oct 9 14:57:15 2023
    Any recipes how?
    --
    Member - Liberal International This is doctor@nk.ca Ici doctor@nk.ca
    Yahweh, King & country!Never Satan President Republic!Beware AntiChrist rising! Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b An oil stain on the carpet is not removed by picking up the litter. -unknown Beware https://mindspring.com

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Gea-Suan Lin@21:1/5 to The Doctor on Thu Oct 12 04:26:16 2023
    On 2023-10-09, The Doctor <doctor@doctor.nl2k.ab.ca> wrote:
    Any recipes how?

    Yes, I just implemented a simple hack in `cleanfeed.local`. I have
    tried it, but it is not very useful: a lot of spam still gets into
    comp.lang.c and other groups.

    For now, the most efficient way to avoid Google Groups spam is simply
    to drop everything that comes from Google Groups.

    ```
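    use Mail::SpamAssassin;

    # Create the SpamAssassin engine once, when the filter is loaded.
    my $sa_agent = Mail::SpamAssassin->new();

    sub local_filter_last {
        # Only scan articles that passed through Google Groups.
        return unless $hdr{Path} =~ /google-groups\.googlegroups\.com/;

        # Copy the headers and drop the __BODY__ and __LINES__ pseudo-headers.
        my %myhdr = %hdr;
        delete $myhdr{__BODY__};
        delete $myhdr{__LINES__};

        # Rebuild a complete article: headers, blank line, body.
        my $header_str = join "\n", map { "$_: $myhdr{$_}" } keys %myhdr;
        my $article_str = "$header_str\n\n$hdr{__BODY__}";

        my $mail = $sa_agent->parse($article_str);
        my $status = $sa_agent->check($mail);
        my $is_spam = $status->is_spam();

        # Release the per-message objects before returning, spam or not.
        $status->finish();
        $mail->finish();

        return reject("Reject Google Groups posting to $hdr{Newsgroups} by SpamAssassin") if $is_spam;

        return;
    }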
    ```
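
    The blunter alternative mentioned above, dropping everything that
    arrived via Google Groups, could be sketched roughly as follows. This
    is only a sketch: `came_from_google_groups` is a hypothetical helper
    name, and the commented usage assumes, as the filter above does, that
    cleanfeed exposes the Path header in `$hdr{Path}` and provides
    `reject()`.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not part of cleanfeed): true when the Path header
    # shows that the article passed through Google Groups.
    sub came_from_google_groups {
        my ($path) = @_;
        return 0 unless defined $path;
        return $path =~ /google-groups\.googlegroups\.com/ ? 1 : 0;
    }

    # Assumed usage inside cleanfeed.local, in place of the SpamAssassin check:
    #
    #   sub local_filter_last {
    #       return reject('Google Groups posting')
    #           if came_from_google_groups($hdr{Path});
    #       return;
    #   }
    ```

    The trade-off is that legitimate Google Groups posts are dropped along
    with the spam, which is why the SpamAssassin variant scores each
    article instead.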

  • From Ray Banana@21:1/5 to All on Thu Oct 12 05:44:28 2023
    * Gea-Suan Lin wrote:
    On 2023-10-09, The Doctor <doctor@doctor.nl2k.ab.ca> wrote:
    Any recipes how?

    Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
    tried, but not so useful. Still many spam into comp.lang.c and other
    groups.
    [...]

    OK, now you need a ~/.spamassassin directory for your news user, with a
    user_prefs file in it. Once that is in place, you can start adding rules
    for Usenet spam. You will also need to feed several hundred spam and ham
    articles to sa-learn --spam or sa-learn --ham, running it as the news
    user. After that, SpamAssassin will gradually improve.
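
    The training step could be scripted along these lines. This is a
    minimal sketch, not a recipe from the thread: `sa_learn_command` is a
    hypothetical helper, and the directories in the commented usage are
    made-up examples of where sorted articles might be kept. It only relies
    on sa-learn accepting `--spam` or `--ham` followed by a directory of
    messages.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: build the sa-learn command line for one directory
    # of saved articles. $class must be 'spam' or 'ham'.
    sub sa_learn_command {
        my ($class, $dir) = @_;
        die "class must be 'spam' or 'ham'\n"
            unless $class eq 'spam' || $class eq 'ham';
        return ('sa-learn', "--$class", $dir);
    }

    # Assumed usage, run as the news user so that the Bayes database in
    # ~news/.spamassassin is the one being trained (paths are examples only):
    #
    #   system(sa_learn_command('spam', "$ENV{HOME}/training/spam")) == 0
    #       or warn "sa-learn --spam failed\n";
    #   system(sa_learn_command('ham', "$ENV{HOME}/training/ham")) == 0
    #       or warn "sa-learn --ham failed\n";
    ```

    Passing the command to system() as a list avoids going through a
    shell, so directory names with spaces need no quoting.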

    --
    Пу́тін — хуйло́
    http://www.eternal-september.org

  • From Julien ÉLIE@21:1/5 to All on Thu Oct 12 08:58:19 2023
    Hi Gea-Suan Lin,

    Any recipes how?

    Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
    tried, but not so useful. Still many spam into comp.lang.c and other
    groups.

    FWIW, there's a doc in French to set up a "spamchk" funnel to
    SpamAssassin in the newsfeeds file:

    https://web.archive.org/web/20230901182332/https://git.alphanet.ch/gitweb/?p=inn-install;a=blob_plain;f=README.html;hb=HEAD#filtrer-le-spam-avec-spamassassin

    --
    Julien ÉLIE

    « Medicus curat, natura sanat. »

  • From Gea-Suan Lin@21:1/5 to Ray Banana on Thu Oct 12 09:00:09 2023
    Thanks for the information.

    I added a setting to ~/.spamassassin/user_prefs so that MIME parts are
    recognized:

    #
    bayes_token_sources all

    Then I manually selected 200+ ham and 200+ spam articles from
    comp.lang.c, 50+ spam articles from comp.lang.python, and 200+ spam
    articles from sci.crypt. Afterwards I fed all of them to sa-learn.

    The result looks pretty good so far. Almost all new spam posted to
    comp.lang.c has been blocked by SpamAssassin.

    I put my trained files here, so you can simply reuse them:

    https://newsfeed.hasname.com/files/usenet-spamassassin-20231012.tar.gz

    Ray Banana <rayban@raybanana.net> wrote:
    * Gea-Suan Lin wrote:
    On 2023-10-09, The Doctor <doctor@doctor.nl2k.ab.ca> wrote:
    Any recipes how?

    Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
    tried, but not so useful. Still many spam into comp.lang.c and other
    groups.
    [...]

    OK, now you need a ~/.spamassassin directory for your news user and a user_prefs
    file in that directory. After that you can start adding rules for Usenet spam.
    You will also need to feed several hundreds of spam and ham articles to sa-learn --spam
    or sa-learn --ham as the news user. After that, SpamAssassin will gradually improve.


    --
    Resistance is futile.
    https://blog.gslin.org/ & <gslin@gslin.org>

  • From yamo'@21:1/5 to All on Sun Jan 28 10:05:49 2024
    Hi Julien,


    Julien ÉLIE wrote on 12/10/2023 08:58:
    Hi Gea-Suan Lin,

    Any recipes how?

    Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
    tried, but not so useful. Still many spam into comp.lang.c and other
    groups.

    FWIW, there's a doc in French to set up a "spamchk" funnel to
    SpamAssassin in the newsfeeds file:

    https://web.archive.org/web/20230901182332/https://git.alphanet.ch/gitweb/?p=inn-install;a=blob_plain;f=README.html;hb=HEAD#filtrer-le-spam-avec-spamassassin


    The spamchk funnel is slower than calling SpamAssassin in
    cleanfeed.local. After some tests, I've adopted Gea-Suan Lin's
    technique, which can be found here:
    <http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cug7sh8%24pcc%241%40colo-sc-1.gslin.com%3E>


    I will update the French documentation:
    <https://git.mcos.nc/INN/inn_install>

    --
    Stéphane
    GOOGLE GROUPS USERS, you will soon lose access to Usenet. <https://support.google.com/groups/answer/11036538>
    Free replacement servers: <http://usenet-fr.yakakwatik.org>
    Software: <http://usenet-fr.yakakwatik.org/lecteurs-de-news.html>

  • From Ray Banana@21:1/5 to All on Sun Jan 28 11:22:51 2024
    Thus spake yamo' <yamo@beurdin.invalid>

    [...]
    After some tests, I've adopted the technique from Gea-Suan Lin, it could
    be found here :
    <http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cug7sh8%24pcc%241%40colo-sc-1.gslin.com%3E>

    For performance reasons, especially if you receive a full-text feed, I
    would recommend using spamd instead of starting SpamAssassin for every
    article:

    use Mail::SpamAssassin::Client;

    # Rebuild the article from %hdr, leaving out the pseudo-headers.
    my %myhdr = %hdr;
    delete $myhdr{__BODY__};
    delete $myhdr{__LINES__};
    my $header_str = join "\n", map { "$_: $myhdr{$_}" } keys %myhdr;
    my $article_str = "$header_str\n\n$hdr{__BODY__}";

    # Hand the article to a running spamd instead of scanning it in-process.
    my $spamtest = Mail::SpamAssassin::Client->new({
        port     => /spamd port/,   # placeholder: fill in your spamd port
        host     => /spamd host/,   # placeholder: fill in your spamd host
        username => 'news',         # use ~news/.spamassassin/user_prefs
    });

    my $result = $spamtest->process($article_str);
    my $score = $result->{score};

    INN::syslog('notice', $hdr{'Message-ID'} . " Score: $score, isspam: " . $result->{isspam});
    if ($result->{isspam} eq 'True') {
        [...] # local processing, nocemize etc.
        return 'SPAM';
    } else {
        [...] # local processing
    }
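
    One thing the snippet above leaves open is what happens when spamd
    gives no usable answer. A possible way to handle that is sketched
    below. `spamd_verdict` is a hypothetical helper, and the sketch
    assumes, as the code above does, that process() returns a hash
    reference with `isspam` and `score` keys; that it returns something
    other than a hash reference when spamd cannot be reached is an
    assumption to check against your version of
    Mail::SpamAssassin::Client.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: reduce the value returned by process() to
    # ($is_spam, $score). Anything that is not a hash reference counts as
    # "no verdict", so the article is accepted (fail open).
    sub spamd_verdict {
        my ($result) = @_;
        return (0, undef) unless ref($result) eq 'HASH';
        my $is_spam = (defined $result->{isspam} && $result->{isspam} eq 'True') ? 1 : 0;
        return ($is_spam, $result->{score});
    }

    # Assumed usage in the filter, after the process() call:
    #
    #   my ($is_spam, $score) = spamd_verdict($result);
    #   return 'SPAM' if $is_spam;
    ```

    Failing open keeps the feed flowing while spamd is down, at the cost
    of letting spam through during that time.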



    --
    Пу́тін — хуйло́
    https://www.eternal-september.org

  • From yamo'@21:1/5 to All on Sun Jan 28 19:58:20 2024
    Hi Ray,

    Ray Banana wrote on 28/01/2024 11:22:
    Thus spake yamo' <yamo@beurdin.invalid>

    [...]
    After some tests, I've adopted the technique from Gea-Suan Lin, it could
    be found here :
    <http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cug7sh8%24pcc%241%40colo-sc-1.gslin.com%3E>

    For performance reasons, especially if you receive a full text feed, I
    would recommend to use spamd instead of starting spamassassin for every article:


    Thanks!

    It works but I have to test a little more.


    --
    Stéphane
    GOOGLE GROUPS USERS, you will soon lose access to Usenet. <https://support.google.com/groups/answer/11036538>
    Free replacement servers: <http://usenet-fr.yakakwatik.org>
    Software: <http://usenet-fr.yakakwatik.org/lecteurs-de-news.html>
