• [PATCH]: SLRN: include Injection headers in new score.

    From Kaz Kylheku@21:1/5 to All on Mon Aug 28 19:57:35 2023
    This patch affects editing of the score file. Usenet suffers from a lot
    of Google Groups spam, and it is often identified well through the
    Injection*: headers. These headers are not propagated to the score
    file, so I've had to manually copy and paste, which is annoying.

    This change makes slrn scan through the list of additional headers and
    pick out anything that starts with "Injection" or contains
    "Posting-Host". (Note that I'm not seeing the NNTP-Posting-Host header
    show up in the list, but if that is fixed, it will be included.)
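
For readers who want the selection rule in isolation, here is a minimal C sketch of the test the patch applies to each header name. The function name `is_score_worthy_header` is hypothetical and does not exist in slrn; only the `strncmp`/`strstr` conditions mirror the patch. Both comparisons are case-sensitive, as they are in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of slrn: returns 1 if a header name
 * passes the same test the patch uses, 0 otherwise. */
int is_score_worthy_header(const char *name)
{
    if (name == NULL)
        return 0;

    /* Prefix match on the first 6 characters: covers Injection-Info,
     * Injection-Date and any other name beginning with "Inject". */
    if (strncmp(name, "Inject", 6) == 0)
        return 1;

    /* Substring match anywhere in the name: covers NNTP-Posting-Host
     * and similar variants. */
    return strstr(name, "Posting-Host") != NULL;
}
```

Note that the prefix checked is "Inject" rather than the full word "Injection", so the test is slightly broader than the description suggests.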

    I sent the patch upstream to J.E.D.

    From c84453f2def8f78721d4f31166c1e75568a2fd90 Mon Sep 17 00:00:00 2001
    From: Kaz Kylheku <kaz@kylheku.com>
    Date: Mon, 28 Aug 2023 00:01:05 -0700
    Subject: [PATCH 2/2] Include Injection headers when editing scores.

    Google Groups posts contain headers that are useful in
    filtering, identifying the account and IP address.
    SLRN should propagate these to the score file to make
    it easier to make use of them.

    * src/editscore.c (slrn_edit_score): Walk the additional
    headers in h->add_hdrs. If the name of any header starts
    with "Inject" or contains "Posting-Host", then include it.
    ---
    src/editscore.c | 43 +++++++++++++++++++++++++++++++++++++++++++
    1 file changed, 43 insertions(+)

    diff --git a/src/editscore.c b/src/editscore.c
    index 5c7d2df..ba1d022 100644
    --- a/src/editscore.c
    +++ b/src/editscore.c
    @@ -186,6 +186,8 @@ int slrn_edit_score (Slrn_Header_Type *h, char *newsgroup)

    if (Slrn_Prefix_Arg_Ptr == NULL)
    {
    + Slrn_Header_Line_Type *ah_iter;
    +
    char *line;
    int comment;
    linenum = 1;
    @@ -292,6 +294,47 @@ int slrn_edit_score (Slrn_Header_Type *h, char *newsgroup)
    fprintf (fp, "%%\tNewsgroup: %s\n", q);
    }

    +
    + for (ah_iter = h->add_hdrs; ah_iter != NULL; ah_iter = ah_iter->next)
    + {
    + char *value = ah_iter->value;
    + char *name = ah_iter->name;
    +
    + if (name == NULL || value == NULL)
    + continue;
    +
    + if (strncmp(name, "Inject", 6) != 0 &&
    + strstr(name, "Posting-Host") == 0)
    + continue;
    +
    + line = slrn_safe_strmalloc (value);
    + remove_linebreaks (line);
    + if ((NULL == (q = SLregexp_qu
  • From Matthew Ernisse@21:1/5 to Kaz Kylheku on Thu Aug 31 21:51:17 2023
    On Mon, 28 Aug 2023 19:57:35 -0000 (UTC), Kaz Kylheku wrote:
    This patch affects editing of the score file. Usenet suffers from a lot
    of Google Groups spam, and it is often identified well through the
    Injection*: headers. These headers are not propagated to the score
    file, so I've had to manually copy and paste, which is annoying.

    Is there a reason you don't just score based on Message-ID? Google
    Groups messages are fairly easily identifiable that way. I have the
    below in my News/Scores file to mark all Google Group originated
    articles.

    [*]
    %% Demote all googlegroup origin articles.
    Score: -20
    Message-ID: googlegroups

    I mention this because I don't think Injection or Posting-Host are
    generally included in the overview which means slrn will have to fetch
    more information to apply scoring -- potentially needlessly.

    ( eternal-september lists the following for overview headers )
    ---
    LIST OVERVIEW.FMT
    215 Order of fields in overview database
    Subject:
    From:
    Date:
    Message-ID:
    References:
    Bytes:
    Lines:
    Xref:full
    ---
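
The distinction drawn above (a rule on an overview field can be evaluated from overview data, while a rule on any other header needs extra fetching) can be sketched as a simple lookup. This is an illustration only: the function names are hypothetical, the field list is copied from the eternal-september listing above, and nothing here is claimed to be how slrn itself makes the decision.

```c
#include <ctype.h>
#include <stddef.h>

/* Field names taken from the LIST OVERVIEW.FMT output quoted above. */
static const char *const overview_fields[] = {
    "Subject", "From", "Date", "Message-ID",
    "References", "Bytes", "Lines", "Xref"
};

/* Hypothetical helper: case-insensitive equality of two header names. */
int names_equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}

/* Hypothetical helper: 1 if the header is listed in the overview,
 * 0 if a rule on it would need the header fetched separately. */
int header_in_overview(const char *name)
{
    size_t i;

    if (name == NULL)
        return 0;
    for (i = 0; i < sizeof overview_fields / sizeof overview_fields[0]; i++)
        if (names_equal_nocase(name, overview_fields[i]))
            return 1;
    return 0;
}
```

With this list, `header_in_overview("Message-ID")` is true while `header_in_overview("Injection-Info")` is false, which is the cost difference being pointed out.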

    --
    "The avalanche has started, it is too late for the pebbles to vote."
    --Kosh

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Matthew Ernisse on Thu Aug 31 23:52:16 2023
    On 2023-08-31, Matthew Ernisse <matt@going-flying.com> wrote:
    On Mon, 28 Aug 2023 19:57:35 -0000 (UTC), Kaz Kylheku wrote:
    This patch affects editing of the score file. Usenet suffers from a lot
    of Google Groups spam, and it is often identified well through the
    Injection*: headers. These headers are not propagated to the score
    file, so I've had to manually copy and paste, which is annoying.

    Is there a reason you don't just score based on Message-ID? Google
    Groups messages are fairly easily identifiable that way.

    Well, yes, if I wanted to nuke everything coming from Google Groups,
    I know which iron to pull out from the golf bag.

    There are sometimes normal people coming from GG whose only
    fault is that they don't know about using a proper newsreader
    connected to an NNTP server.

    I have the
    below in my News/Scores file to mark all Google Group originated
    articles.

    [*]
    %% Demote all googlegroup origin articles.
    Score: -20
    Message-ID: googlegroups

    I never use a score other than -9999, using .slrn-scores purely
    as a kill file.

    I can't think of a situation in which I would want multiple
    conditions to be met in order for a post to disappear.

    Every single rule I put in the file is a clear "deal breaker"
    considered in isolation.

    I mention this because I don't think Injection or Posting-Host are
    generally included in the overview which means slrn will have to fetch
    more information to apply scoring -- potentially needlessly.

    Correct. SLRN fetches extra headers to apply my rules. For the volume,
    it's nothing. Probably less than what a Google Groups user fetches
    via their browser just to load a page.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

  • From Kaz Kylheku@21:1/5 to Kaz Kylheku on Thu Sep 7 07:01:26 2023
    On 2023-08-28, Kaz Kylheku <864-117-4973@kylheku.com> wrote:
    From c84453f2def8f78721d4f31166c1e75568a2fd90 Mon Sep 17 00:00:00 2001
    From: Kaz Kylheku <kaz@kylheku.com>
    Date: Mon, 28 Aug 2023 00:01:05 -0700
    Subject: [PATCH 2/2] Include Injection headers when editing scores.

    This has an issue, requiring a follow-up patch:

    From 51444fba3b157cebcf2296833a52be26e2e5b745 Mon Sep 17 00:00:00 2001
    From: Kaz Kylheku <kaz@kylheku.com>
    Date: Wed, 6 Sep 2023 20:32:04 -0700
    Subject: [PATCH 3/3] bug: editscore: extra headers react to 'f'

    I copy-pasted some code in coming up with the feature
    whereby additional headers are included in a score file
    edit. If the user selects to create the score by F)rom,
    these headers are wrongly being uncommented.
    I think that extra headers should always be commented.

    * src/editscore.c (slrn_edit_score): In the loop that
    checks the list of additional headers, every header
    that we select is now added to the file commented
    with the % character, unconditionally. This simplifies
    the code; the copy-pasted (ch == 'f') and (ch != 'f')
    are gone, as is the regex compilation check, and
    setting of re_error.
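
As a rough illustration of the simplified behavior described above, the following C sketch formats one extra header as an always-commented score line. It is not the code from the patch: `format_commented_header` is a hypothetical name, the fixed 512-byte scratch buffer is a simplification, and replacing line breaks with spaces is an assumption about what slrn's `remove_linebreaks` does. Only the `"%\tName: value"` line shape comes from the `fprintf` format in the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of slrn. Writes "%\tName: value\n"
 * into buf, with any line breaks in value replaced by spaces.
 * Returns what snprintf returns, or -1 if name or value is NULL. */
int format_commented_header(char *buf, size_t size,
                            const char *name, const char *value)
{
    char flat[512];
    size_t i, n;

    assert(buf != NULL && size > 0);
    if (name == NULL || value == NULL)
        return -1;

    n = strlen(value);
    if (n >= sizeof flat)
        n = sizeof flat - 1;   /* this sketch truncates very long values */

    /* Assumed equivalent of remove_linebreaks: flatten CR and LF. */
    for (i = 0; i < n; i++)
        flat[i] = (value[i] == '\n' || value[i] == '\r') ? ' ' : value[i];
    flat[n] = '\0';

    /* The leading '%' is unconditional: no regex check, no 'f' test. */
    return snprintf(buf, size, "%%\t%s: %s\n", name, flat);
}
```

Because the comment character no longer depends on `ch` or on a regex compile result, the user has to uncomment any of these lines by hand to turn it into an active rule.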
    ---
    src/editscore.c | 28 +++++-----------------------
    1 file changed, 5 insertions(+), 23 deletions(-)

    diff --git a/src/editscore.c b/src/editscore.c
    index ba1d022..f0d5218 100644
    --- a/src/editscore.c
    +++ b/src/editscore.c
    @@ -309,30 +309,12 @@ int slrn_edit_score (Slrn_Header_Type *h, char *newsgroup)

    line = slrn_safe_strmalloc (value);
    remove_linebreaks (line);
    - if ((NULL == (q = SLregexp_quote_string (line, qregexp, sizeof (qregexp)))) ||
    - (ch != 'f') ||
    -#if SLANG_VERSION < 20000
    - (0 != SLang_regexp_compile (&re))
    -#else
    - (NULL == (re = SLregexp_compile (qregexp, SLREGEXP_CASELESS)))
    -#endif
    - )
    - {
    - re_error |= (ch == 'f');
    - comment = 1;
    - }
    - else
    - comment = 0;
    - fprintf (fp, "%c\t%s: %s\n",
    - (comment ? '%' : ' '), ah_iter->name, (q ? q : line));
    + /* Extra headers are always commented, so we don't do the regex
    + * compilation check or react to selection letter in ch.
    + */