• inn: both Perl and Python filtering at the same time

    From Adam W.@21:1/5 to All on Mon Oct 23 23:41:42 2023
    Hi.

    I have cleanfeed running, and I'm generally happy with it.

    I'd like to add another rule to reject some articles (fed via newsfeeds,
    not posted by posters), but I don't know Perl well enough to write it. I
    know Python though.

    Is it possible to run both Perl (cleanfeed) and Python (my small script,
    yet to be written) filters together? What's the rule then? Both filters
    have to accept the article for it to be accepted by the server (so if at
    least one rejects it, it will be rejected)?

    I'd expect so, but I want to be sure before I start messing around...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Julien ÉLIE@21:1/5 to All on Tue Oct 24 12:31:01 2023
    Hi Adam,

    Is it possible to run both Perl (cleanfeed) and Python (my small script,
    yet to be written) filters together? What's the rule then? Both filters
    have to accept the article for it to be accepted by the server (so if at least one rejects it, it will be rejected)?

    Yes, you can run both filters. They both have to accept the article.
    The Python script (filter_innd.py) is run first, then the Perl one
    (filter_innd.pl, which corresponds to cleanfeed). Run:

    % ctlinnd python y
    ctlinnd: Python filter already enabled
    % ctlinnd perl y
    ctlinnd: Perl filter already enabled

    to ensure they are enabled.
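
A minimal sketch of what such a Python filter might look like, assuming the hook interface described in INN's hook-python documentation (a filter object with filter_art, filter_messageid and filter_mode methods, registered through set_filter_hook from the INN module). The class name, the blocked-word rule and the _text helper are hypothetical. Header values may arrive as bytes-like objects rather than str depending on the Python version, so the helper handles both.

```python
def _text(value):
    """Return a header value as str, whatever type innd handed over."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    # bytes, bytearray and memoryview all convert through bytes()
    return bytes(value).decode("utf-8", "replace")


class SmallFilter:
    # Hypothetical rule: reject anything with these words in the Subject.
    BLOCKED_WORDS = ("casino",)

    def filter_before_reload(self):
        pass

    def filter_close(self):
        pass

    def filter_mode(self, oldmode, newmode, reason):
        pass

    def filter_messageid(self, msgid):
        # An empty string means "no objection".
        return ""

    def filter_art(self, art):
        subject = _text(art.get("Subject")).lower()
        for word in self.BLOCKED_WORDS:
            if word in subject:
                # A non-empty string rejects the article with this reason.
                return "Rejected by local Python rule"
        # Accepting here only passes the article on; the Perl filter
        # (cleanfeed) still gets to reject it afterwards.
        return ""


try:
    # Only importable when the script runs inside innd.
    from INN import set_filter_hook
except ImportError:
    set_filter_hook = None

if set_filter_hook is not None:
    set_filter_hook(SmallFilter())
```

Keeping the rule in a plain method like this makes it possible to exercise filter_art with an ordinary dictionary outside of innd before enabling it on a live feed.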

    --
    Julien ÉLIE

    « C'est la goutte qui fait déborder l'amphore ! » (Assurancetourix)

  • From Adam W.@21:1/5 to iulius@nom-de-mon-site.com.invalid on Tue Oct 24 11:27:24 2023
    Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> wrote:

    Yes you can run both filters. They both have to accept the article.
    The Python script (filter_innd.py) is run, then the Perl one
    (filter_innd.pl, which corresponds to cleanfeed) .

    One more thing. Is it possible to redirect an article using a Python
    filter? I'd want to edit some headers:

    - move Newsgroups to X-Original-Newsgroups
    - create new Newsgroups (to direct post to another group)
    - maybe modify Subject (to add original group(s) name to it)

    And then accept the article.

    The goal is to have a local hierarchy where all these spams are posted, so
    they can be reviewed (by me and by my users, if they want to).

    If it's not possible, then at least I'd like to save the article to a file before rejecting it (and post-process and repost it later, in the same way
    I did with NoCeM-cancelled posts). Is it possible from within a filter?

    Thanks!

  • From Adam W.@21:1/5 to iulius@nom-de-mon-site.com.invalid on Tue Oct 24 11:11:05 2023
    Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> wrote:

    Yes you can run both filters. They both have to accept the article.
    The Python script (filter_innd.py) is run, then the Perl one
    (filter_innd.pl, which corresponds to cleanfeed) .

    Thanks Julien!

  • From Ray Banana@21:1/5 to All on Tue Oct 24 15:00:54 2023
    Thus spake gof-cut-this-news@cut-this-chmurka.net.invalid (Adam W.)



    - move Newsgroups to X-Original-Newsgroups
    - create new Newsgroups (to direct post to another group)
    - maybe modify Subject (to add original group(s) name to it)

    You should not modify an article in filter_innd.pl.
    Every time you modify a header in the filter, an innocent kitten dies
    somewhere ;-)

    And then accept the article.
    The goal is to have a local hierarchy where all these spams are posted, so they can be reviewed (by me and by my users, if they want to).

    If it's not possible, then at least I'd like to save the article to a file before rejecting it (and post-process and repost it later, in the same way
    I did with NoCeM-cancelled posts). Is it possible from within a filter?

    In general, don't do too much processing inside the filter, as it may bring
    the INN daemon to a screeching halt, when the spam flood gets really
    heavy.

    Been there, done that, got little sleep.
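
One way to keep the extra work bounded during a flood is to cap how many articles the filter saves per time window and to simply reject the rest without touching the disk. This is a hedged sketch of that idea, not something proposed in the thread: the SaveBudget class, its parameters and the injectable clock are all hypothetical.

```python
import time


class SaveBudget:
    """Allow at most max_per_window expensive operations per window."""

    def __init__(self, max_per_window, window_seconds=1.0, clock=time.monotonic):
        self._max = max_per_window
        self._window = window_seconds
        self._clock = clock
        self._window_start = clock()
        self._used = 0

    def allow(self):
        now = self._clock()
        # Start a fresh window once the current one has elapsed.
        if now - self._window_start >= self._window:
            self._window_start = now
            self._used = 0
        if self._used < self._max:
            self._used += 1
            return True
        return False
```

Inside the filter, the article would be written to disk only when allow() returns True; the rejection itself stays cheap either way.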

    --
    Пу́тін — хуйло́
    http://www.eternal-september.org

  • From Russ Allbery@21:1/5 to Adam W. on Tue Oct 24 08:33:42 2023
    gof-cut-this-news@cut-this-chmurka.net.invalid (Adam W.) writes:

    One more thing. Is it possible to redirect an article using a Python
    filter? I'd want to edit some headers:

    - move Newsgroups to X-Original-Newsgroups
    - create new Newsgroups (to direct post to another group)
    - maybe modify Subject (to add original group(s) name to it)

    And then accept the article.

    The goal is to have a local hierarchy where all these spams are posted, so they can be reviewed (by me and by my users, if they want to).

    INN already has some built-in support for filing articles into the special
    newsgroup junk regardless of what newsgroup it was actually posted to.
    Currently, filters don't have any way of telling innd to file the filtered
    article into junk, but that might be a useful feature to add.

    That doesn't get you everything else you want, though, and it would put
    them all in a single group, so it may not solve your problem (in which
    case it might just be a feature without a use case).

    We won't ever want to support modifying the headers of the article itself,
    since the risk is too high that we'll accidentally propagate the modified
    article back out on Usenet and make a mess.

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Julien ÉLIE@21:1/5 to As Ray Banana on Tue Oct 24 17:18:33 2023
    Hi Adam,

    One more thing. Is it possible to redirect an article using a Python
    filter? I'd want to edit some headers:

    - move Newsgroups to X-Original-Newsgroups
    - create new Newsgroups (to direct post to another group)
    - maybe modify Subject (to add original group(s) name to it)

    And then accept the article.

    As Ray Banana said, that's not possible.


    If it's not possible, then at least I'd like to save the article to a file before rejecting it (and post-process and repost it later, in the same way
    I did with NoCeM-cancelled posts). Is it possible from within a filter?

    Yes. Just before rejecting the article, open a file and write (or
    append) the article to it.

    https://www.eyrie.org/~eagle/software/inn/docs/hook-python.html

    Take all the header fields which are not empty from the art dictionary,
    and then write art['__BODY__'] as the body.

    Also, as said, it may eat CPU cycles and slow innd down. Worth trying
    though, depending on how many articles you receive per second and how
    fast your system is.
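
The saving step described above could be sketched roughly as follows. It assumes, as the text says, that art is a dictionary of header fields plus a '__BODY__' entry, and that absent fields are None; the function names, the treatment of other double-underscore keys as non-headers, and the choice of "\n" between header lines are assumptions. Line-ending conversion and any separator between appended articles are left out.

```python
def _to_bytes(value):
    """Convert a str or bytes-like value to bytes."""
    if isinstance(value, str):
        return value.encode("utf-8", "surrogateescape")
    return bytes(value)  # bytes, bytearray, memoryview


def serialize_article(art):
    """Build header block + blank line + body from the art dictionary."""
    header_lines = []
    for name, value in art.items():
        # Skip special entries such as __BODY__ and fields that are unset.
        if name.startswith("__") or value is None:
            continue
        data = _to_bytes(value)
        if not data:
            continue
        header_lines.append(name.encode("ascii") + b": " + data)
    body = art.get("__BODY__")
    body_bytes = _to_bytes(body) if body is not None else b""
    return b"\n".join(header_lines) + b"\n\n" + body_bytes


def save_rejected(art, path):
    """Append the serialized article to a file (hypothetical helper)."""
    with open(path, "ab") as spool:
        spool.write(serialize_article(art))
```

A filter would call save_rejected(art, some_path) right before returning its rejection string.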

    --
    Julien ÉLIE

    « Que l'on serve notre boisson nationale, le calva, dans des crânes de
    vaincus ! » (Astérix)

  • From Adam W.@21:1/5 to Ray Banana on Thu Oct 26 02:04:03 2023
    Ray Banana <rayban@raybanana.net> wrote:

    You should not modify an article in filter_innd.pl
    Every time you modify a header in the filter, an innocent kitten dies somewhere ;-)

    Oh no, save the kittens! :)

    Thanks for your response (and all other responses). I wrote the filter and
    I'm running it on the full feed. It saves rejected articles to files that
    are later processed and end up being posted to chmurka.spam.*.thai (* is
    the affected hierarchy). These groups are available on news.chmurka.net
    without authentication (but chmurka.spam.* is stored in a 1 GiB CNFS, so
    the retention might be short).
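
The post-processing step that runs outside the filter could look something like the sketch below. It is not Adam's actual script: the function names, the rule that the first listed group's top-level hierarchy picks the target group, and the Subject prefix are assumptions based on the header changes listed earlier in the thread. It uses Python's standard email package on a saved article and does not deal with anything else a repost may need.

```python
from email import message_from_bytes


def target_group(newsgroups):
    """Map the original Newsgroups value to chmurka.spam.<hierarchy>.thai."""
    first_group = newsgroups.split(",")[0].strip()
    hierarchy = first_group.split(".")[0]
    return f"chmurka.spam.{hierarchy}.thai"


def redirect_saved_article(raw):
    """Rewrite headers of a saved article so it can be posted locally."""
    msg = message_from_bytes(raw)
    original = str(msg["Newsgroups"] or "")
    subject = str(msg["Subject"] or "")

    # Move Newsgroups to X-Original-Newsgroups and point at the local group.
    del msg["Newsgroups"]
    msg["X-Original-Newsgroups"] = original
    msg["Newsgroups"] = target_group(original)

    # Prefix the Subject with the original group list.
    del msg["Subject"]
    msg["Subject"] = f"[{original}] {subject}"
    return msg.as_bytes()
```

Because this runs on files after the fact, the header rewriting never happens inside innd, which fits the advice given earlier in the thread about not modifying articles in the filter.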

    Here's the script (but I might change it later if something doesn't work
    as it should). Suggestions are welcome:

    http://www.chmurka.net/r/usenet/filter_innd.py.txt

    Let's see how it works... if it does, I might as well start posting
    NoCeMs with these articles.

    In general, don't do too much processing inside the filter, as it may bring the INN daemon to a screeching halt, when the spam flood gets really
    heavy.

    Creating a file and writing to it counts as "too much", I guess... but
    let's see.
