• No colon-space in header field check in INN

    From Jesse Rehmer@21:1/5 to All on Wed Feb 1 06:19:34 2023
    I've been on auto-pilot for a bit, but started checking logs/reports today and noticed an increasing number of articles rejected due to the following:

    439 No colon-space in "User-Agent:" header field

    When reviewing the headers of those messages, there is a User-Agent: header,
    but it's blank and does not contain a space.

    It is probably against an RFC to accept these messages, but I'm not concerned about the User-Agent header. I went looking for a way to relax the check, but it is strict for all headers in innd/art.c:

    /* Find first colon */
    if ((colon = memchr(header, ':', size)) == NULL || !ISWHITE(colon[1])) {
        if ((p = memchr(header, '\r', size)) != NULL)
            *p = '\0';
        snprintf(cp->Error, sizeof(cp->Error),
                 "%d No colon-space in \"%s\" header field",
                 ihave ? NNTP_FAIL_IHAVE_REJECT : NNTP_FAIL_TAKETHIS_REJECT,
                 MaxLength(header, header));
        if (p != NULL)
            *p = '\r';
        return;
    }

    I know there are potential implications for accepting messages with empty headers, but do we need to treat all headers so strictly?
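
    For readers who want to exercise the rule outside innd, here is a minimal
    standalone sketch of the same test. It is not INN code: has_colon_space is
    a hypothetical helper, and it assumes ISWHITE() accepts only a space or a
    tab.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of INN: returns true when the first colon
 * in the header line is immediately followed by a space or a tab
 * (assumed here to be what ISWHITE() matches). */
static bool
has_colon_space(const char *line, size_t size)
{
    const char *colon;

    assert(line != NULL);
    colon = memchr(line, ':', size);
    if (colon == NULL)
        return false;
    /* Never read past the end of the buffer. */
    if ((size_t) (colon - line) + 1 >= size)
        return false;
    return colon[1] == ' ' || colon[1] == '\t';
}
```

    With this helper, "User-Agent: slrn" passes, while a bare "User-Agent:"
    followed directly by CRLF fails, which is the case reported in the logs.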

    Regards,

    Jesse Rehmer

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Russ Allbery@21:1/5 to Jesse Rehmer on Wed Feb 1 08:00:45 2023
    Jesse Rehmer <jesse.rehmer@blueworldhosting.com> writes:

    > It is probably against an RFC to accept these messages, but I'm not
    > concerned about the User-Agent header. I went looking for a way to relax
    > the check, but it is strict for all headers in innd/art.c:
    >
    >     /* Find first colon */
    >     if ((colon = memchr(header, ':', size)) == NULL || !ISWHITE(colon[1])) {
    >         if ((p = memchr(header, '\r', size)) != NULL)
    >             *p = '\0';
    >         snprintf(cp->Error, sizeof(cp->Error),
    >                  "%d No colon-space in \"%s\" header field",
    >                  ihave ? NNTP_FAIL_IHAVE_REJECT : NNTP_FAIL_TAKETHIS_REJECT,
    >                  MaxLength(header, header));
    >         if (p != NULL)
    >             *p = '\r';
    >         return;
    >     }
    >
    > I know there are potential implications for accepting messages with
    > empty headers, but do we need to treat all headers so strictly?

    We do need to be pretty strict, since otherwise you can get into really
    nasty situations with ambiguous parses where two servers may disagree
    about something fundamental like the message ID of the article.

    I think if someone wanted to send a patch that added the ability to accept
    articles with headers that (a) weren't part of the protocol (so not
    Message-ID, Path, etc.), (b) specifically ended in a colon and a newline
    and no other variation, and (c) did not have a continuation, and it was
    configurable with the syntaxchecks setting and was off by default, that
    would be worth considering. I wouldn't want to relax the checks any
    farther than that. (It may be a bit difficult to wedge that into innd's
    code, though, since the above error happens at a fairly low level of the
    parse.)
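
    The three conditions above could be expressed as a single predicate. The
    following is only a sketch under stated assumptions, not a patch against
    innd: empty_field_acceptable, protocol_fields and the relaxed flag are
    hypothetical names, the list of protected fields is illustrative, and
    "newline" is assumed to mean CRLF on the wire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Illustrative, incomplete list of fields that servers parse.
 * A real implementation would need the full set. */
static const char *const protocol_fields[] = {
    "Message-ID", "Path", "Date", "Control", "Distribution", "Cancel-Lock"
};

/* Hypothetical predicate: may this empty header field be tolerated?
 * line/size: the header line including its CRLF.
 * next: start of the following line, or NULL if there is none.
 * relaxed: the opt-in setting; false (strict) would be the default. */
static bool
empty_field_acceptable(const char *line, size_t size, const char *next,
                       bool relaxed)
{
    const char *colon;
    size_t namelen, i;

    assert(line != NULL);
    if (!relaxed)
        return false;
    colon = memchr(line, ':', size);
    if (colon == NULL || colon == line)
        return false;
    namelen = (size_t) (colon - line);
    /* (b) exactly a colon followed by CRLF, and no other variation. */
    if (size != namelen + 3 || colon[1] != '\r' || colon[2] != '\n')
        return false;
    /* (a) the field must not be one the protocol depends on. */
    for (i = 0; i < sizeof(protocol_fields) / sizeof(protocol_fields[0]); i++)
        if (strlen(protocol_fields[i]) == namelen
            && strncasecmp(line, protocol_fields[i], namelen) == 0)
            return false;
    /* (c) no continuation: the next line must not start with whitespace. */
    if (next != NULL && (next[0] == ' ' || next[0] == '\t'))
        return false;
    return true;
}
```

    Keeping the strict path as the default means a server that never sets the
    flag behaves exactly as before.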

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From Nigel Reed@21:1/5 to Russ Allbery on Wed Feb 1 12:03:50 2023
    On Wed, 01 Feb 2023 08:00:45 -0800
    Russ Allbery <eagle@eyrie.org> wrote:

    > We do need to be pretty strict, since otherwise you can get into
    > really nasty situations with ambiguous parses where two servers may
    > disagree about something fundamental like the message ID of the
    > article.
    >
    > I think if someone wanted to send a patch that added the ability to
    > accept articles with headers that (a) weren't part of the protocol
    > (so not Message-ID, Path, etc.), (b) specifically ended in a colon
    > and a newline and no other variation, and (c) did not have a
    > continuation, and it was configurable with the syntaxchecks setting
    > and was off by default, that would be worth considering. I wouldn't
    > want to relax the checks any farther than that. (It may be a bit
    > difficult to wedge that into innd's code, though, since the above
    > error happens at a fairly low level of the parse.)

    I wouldn't be happy about relaxing the checks or having a server feed
    in that does. They're there for a reason. Most legitimate users are
    going to follow the RFC. Spammers and script kiddies are less likely to
    adhere to the requirements, which is where we get them.

    The proper course of action is to identify if the messages are
    legitimate and then refer the poster to their software's author to
    correct the issue.

    Once you start making exceptions for one, then you're going to be
    making exceptions for others and, as you say, this could lead to a
    nasty mess. Let's please not fix innd which isn't broken to fix a
    client that is.




    --
    End Of The Line BBS - Plano, TX
    telnet endofthelinebbs.com 23

  • From Jesse Rehmer@21:1/5 to All on Wed Feb 1 21:47:20 2023
    On Feb 1, 2023 at 12:03:50 PM CST, "Nigel Reed" <sysop@endofthelinebbs.com> wrote:

    > I wouldn't be happy about relaxing the checks or having a server feed
    > in that does. They're there for a reason. Most legitimate users are
    > going to follow the RFC. Spammers and script kiddies are less likely to
    > adhere to the requirements, which is where we get them.
    >
    > The proper course of action is to identify if the messages are
    > legitimate and then refer the poster to their software's author to
    > correct the issue.
    >
    > Once you start making exceptions for one, then you're going to be
    > making exceptions for others and, as you say, this could lead to a
    > nasty mess. Let's please not fix innd which isn't broken to fix a
    > client that is.

    They are legitimately posted articles in my eyes, but maybe not others.
    Switching to Diablo may be best to accommodate my unique needs. INN with
    cleanfeed or pyclean does not scale well with my traffic, so this is
    another push to the dark side.

  • From Julien ÉLIE@21:1/5 to All on Fri Feb 3 21:03:14 2023
    Hi Jesse,

    > I've been on auto-pilot for a bit, but started checking logs/reports today and
    > noticed an increasing number of articles rejected due to the following:
    >
    > 439 No colon-space in "User-Agent:" header field
    >
    > When reviewing the headers of those messages, there is a User-Agent: header,
    > but it's blank and does not contain a space.
    >
    > It is probably against an RFC to accept these messages

    Indeed, they shouldn't have been accepted by injecting agents.
    Some posting agents send empty header fields, which are usually stripped
    off by injecting agents before injecting the article.



    > but I'm not concerned
    > about the User-Agent header. I went looking for a way to relax the check,
    > but it is strict for all headers in innd/art.c:
    >
    >     /* Find first colon */
    >     if ((colon = memchr(header, ':', size)) == NULL || !ISWHITE(colon[1])) {
    >         if ((p = memchr(header, '\r', size)) != NULL)
    >             *p = '\0';
    >         snprintf(cp->Error, sizeof(cp->Error),
    >                  "%d No colon-space in \"%s\" header field",
    >                  ihave ? NNTP_FAIL_IHAVE_REJECT : NNTP_FAIL_TAKETHIS_REJECT,
    >                  MaxLength(header, header));
    >         if (p != NULL)
    >             *p = '\r';
    >         return;
    >     }

    If you're looking for a quick-and-dirty hack for that very header field,
    I think the following check would do the trick (not tested):

    @@ -644,10 +644,12 @@ ARTcheckheader(CHANNEL *cp, int size)
         if ((colon = memchr(header, ':', size)) == NULL || !ISWHITE(colon[1])) {
             if ((p = memchr(header, '\r', size)) != NULL)
                 *p = '\0';
    +        if (strcasecmp(header, "User-Agent:") != 0 || colon[1] != '\0') {
             snprintf(cp->Error, sizeof(cp->Error),
                      "%d No colon-space in \"%s\" header field",
                      ihave ? NNTP_FAIL_IHAVE_REJECT : NNTP_FAIL_TAKETHIS_REJECT,
                      MaxLength(header, header));
    +        }
             if (p != NULL)
                 *p = '\r';
             return;
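
    The added condition relies on the CR having just been overwritten with a
    NUL, so that header is temporarily a C string ending right after the colon
    when the body is empty. Note that, as posted, the function still returns
    early for such a field; only the error message is skipped. A minimal
    standalone sketch of that comparison, where is_empty_user_agent is a
    hypothetical helper and not part of INN:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: true when the line is exactly "User-Agent:"
 * (any letter case) followed immediately by CR. The buffer is modified
 * temporarily and restored before returning, as in the quoted code. */
static bool
is_empty_user_agent(char *header, size_t size)
{
    char *p;
    bool match;

    assert(header != NULL);
    p = memchr(header, '\r', size);
    if (p == NULL)
        return false; /* no CR: refuse rather than read an unterminated buffer */
    *p = '\0';        /* temporarily end the string at the CR */
    match = (strcasecmp(header, "User-Agent:") == 0);
    *p = '\r';        /* restore the original byte */
    return match;
}
```

    Because the whole string is compared against "User-Agent:", a match
    already implies that nothing follows the colon.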

    --
    Julien ÉLIE

    "That wasn't my question."
    "Maybe it wasn't your question, yes, but it's my answer!"
    (Georges Marchais replying to Alain Duhamel)

  • From Jesse Rehmer@21:1/5 to iulius@nom-de-mon-site.com.invalid on Fri Feb 3 22:23:52 2023
    On Feb 3, 2023 at 2:03:14 PM CST, "Julien ÉLIE" <iulius@nom-de-mon-site.com.invalid> wrote:

    > Hi Jesse,
    >
    > If you're looking for a quick-and-dirty hack for that very header field,
    > I think the following check would do the trick (not tested):
    >
    > @@ -644,10 +644,12 @@ ARTcheckheader(CHANNEL *cp, int size)
    >      if ((colon = memchr(header, ':', size)) == NULL || !ISWHITE(colon[1])) {
    >          if ((p = memchr(header, '\r', size)) != NULL)
    >              *p = '\0';
    > +        if (strcasecmp(header, "User-Agent:") != 0 || colon[1] != '\0') {
    >          snprintf(cp->Error, sizeof(cp->Error),
    >                   "%d No colon-space in \"%s\" header field",
    >                   ihave ? NNTP_FAIL_IHAVE_REJECT : NNTP_FAIL_TAKETHIS_REJECT,
    >                   MaxLength(header, header));
    > +        }
    >          if (p != NULL)
    >              *p = '\r';
    >          return;

    Thank you, Julien. After examining a larger sampling of these articles, they appear to be noise and I'm not concerned about dropping them.

    I will keep this code stashed away though, it may come in handy in the future as lazy developers write more client software that isn't compliant.

  • From Grant Taylor@21:1/5 to Nigel Reed on Wed Feb 1 11:56:03 2023
    On 2/1/23 11:03 AM, Nigel Reed wrote:
    > I wouldn't be happy about relaxing the checks or having a server feed
    > in that does. They're there for a reason. Most legitimate users are
    > going to follow the RFC. Spammers and script kiddies are less likely
    > to adhere to the requirements, which is where we get them.
    >
    > The proper course of action is to identify if the messages are
    > legitimate and then refer the poster to their software's author to
    > correct the issue.
    >
    > Once you start making exceptions for one, then you're going to be
    > making exceptions for others and, as you say, this could lead to a
    > nasty mess. Let's please not fix innd which isn't broken to fix a
    > client that is.

    +10 to everything that Nigel said.



    --
    Grant. . . .
    unix || die

  • From Grant Taylor@21:1/5 to Russ Allbery on Wed Feb 1 10:43:24 2023
    On 2/1/23 9:00 AM, Russ Allbery wrote:
    > We do need to be pretty strict, since otherwise you can get into really
    > nasty situations with ambiguous parses where two servers may disagree
    > about something fundamental like the message ID of the article.

    Agreed.

    > I think if someone wanted to send a patch that added the ability to
    > accept articles with headers that (a) weren't part of the protocol
    > (so not Message-ID, Path, etc.), (b) specifically ended in a colon and
    > a newline and no other variation, and (c) did not have a continuation,
    > and it was configurable with the syntaxchecks setting and was off by
    > default, that would be worth considering.

    I think my biggest hang up is the idea that the header is defined
    (present) but doesn't have a value. -- My personal opinion is that if
    a header is there, then there should be /something/ in it.

    I don't know if a single space following the colon is as important to
    me. I'd be willing to accept -- what SMTP RFCs define as -- CFWS
    sequences between the delimiting colon and header contents.

    > I wouldn't want to relax the checks any farther than that. (It may
    > be a bit difficult to wedge that into innd's code, though, since the
    > above error happens at a fairly low level of the parse.)

    Other than possibly allowing CFWS, I think that the headers should all
    be well formed.




    --
    Grant. . . .
    unix || die

  • From D Finnigan@21:1/5 to Jesse Rehmer on Tue Feb 7 15:10:41 2023
    Jesse Rehmer <jesse.rehmer@blueworldhosting.com> wrote:
    > I've been on auto-pilot for a bit, but started checking logs/reports today and
    > noticed an increasing number of articles rejected due to the following:
    >
    > 439 No colon-space in "User-Agent:" header field
    >
    > When reviewing the headers of those messages, there is a User-Agent: header,
    > but it's blank and does not contain a space.
    >
    > It is probably against an RFC to accept these messages,

    The colon is merely a separator between field name and field value. The
    space afterward is only for human readability and is not mandated by the
    RFCs. The whitespace following a colon is part of the field value and is typically trimmed out.

    Check the RFCs; people assume the space following a colon is mandatory, but it's not.

  • From Julian Bradfield@21:1/5 to D Finnigan on Tue Feb 7 16:05:40 2023
    On 2023-02-07, D Finnigan <dog_cow@macgui.com> wrote:
    > Jesse Rehmer <jesse.rehmer@blueworldhosting.com> wrote:
    >> I've been on auto-pilot for a bit, but started checking logs/reports today and
    >> noticed an increasing number of articles rejected due to the following:
    >>
    >> 439 No colon-space in "User-Agent:" header field
    >>
    >> When reviewing the headers of those messages, there is a User-Agent: header,
    >> but it's blank and does not contain a space.
    >>
    >> It is probably against an RFC to accept these messages,
    >
    > The colon is merely a separator between field name and field value. The
    > space afterward is only for human readability and is not mandated by the
    > RFCs. The whitespace following a colon is part of the field value and is
    > typically trimmed out.
    >
    > Check the RFCs; people assume the space following a colon is mandatory,
    > but it's not.

    I think you should check the RFCs before telling people off for not
    doing so. Here is what the current RFC, RFC 5536, says (sec. 2.2):

       o  All agents MUST generate header fields so that at least one space
          immediately follows the ':' separating the header field name and
          the header field body (for compatibility with deployed software,
          including NNTP [RFC3977] servers). News agents MAY accept header
          fields that do not contain the required space.
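
    On the generating side, the rule quoted above is easy to satisfy by always
    emitting the colon and a space together. A minimal sketch, assuming a
    hypothetical helper format_header_field (not taken from any news
    software); refusing an empty or whitespace-only body is a choice made for
    this sketch, since such fields are what triggered the rejections discussed
    in this thread:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "Name: body\r\n" into buf.
 * Returns the number of characters written (excluding the NUL), or -1
 * if the body is empty or whitespace-only, or if buf is too small. */
static int
format_header_field(char *buf, size_t buflen, const char *name,
                    const char *body)
{
    int n;

    assert(buf != NULL && name != NULL && body != NULL);
    /* Sketch-specific choice: do not generate a field without a body. */
    if (body[strspn(body, " \t")] == '\0')
        return -1;
    /* The ": " literal guarantees the space right after the colon. */
    n = snprintf(buf, buflen, "%s: %s\r\n", name, body);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```

    A posting agent built this way would simply omit User-Agent when it has
    nothing to put in it, instead of sending a bare "User-Agent:" line.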

  • From Russ Allbery@21:1/5 to D Finnigan on Tue Feb 7 08:57:47 2023
    D Finnigan <dog_cow@macgui.com> writes:

    > Ah, nicely done. I'd only looked at RFC 5322. Didn't know the NNTP one
    > differed.

    In general, the Usenet article format RFC is compatible with email but
    more restrictive, in some cases substantially more so. The email RFCs
    still allow for a bunch of legacy syntax that historically was never
    permitted by Usenet software and therefore is not allowed by the Usenet
    article format RFC (like comments in the middle of email addresses, or no
    space after the colon in headers).

    --
    Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

    Please post questions rather than mailing me directly.
    <https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

  • From D Finnigan@21:1/5 to Julian Bradfield on Tue Feb 7 16:18:37 2023
    Julian Bradfield <jcb@inf.ed.ac.uk> wrote:
    > I think you should check the RFCs before telling people off for not
    > doing so. Here is what the current RFC, RFC 5536, says (sec. 2.2):


    Ah, nicely done. I'd only looked at RFC 5322. Didn't know the NNTP one differed.

  • From Julien ÉLIE@21:1/5 to All on Wed Feb 8 13:36:29 2023
    Hi Jesse,

    > They are legitimately posted articles in my eyes, but maybe not
    > others.
    > [...]
    >
    > After examining a larger sampling of these articles, they
    > appear to be noise and I'm not concerned about dropping them.

    OK, thanks for the confirmation!
    It seems to be a spam-filtering feature :)

    I therefore won't take the time to integrate a proper patch allowing empty
    header fields everywhere except in the set of header fields that are parsed
    by servers and clients (as proposed by Russ: not Path, Message-ID, Date,
    Cancel-Lock, Distribution, Control, etc.).


    > I will keep this code stashed away though, it may come in handy in the
    > future as lazy developers write more client software that isn't compliant.

    Do not hesitate to tell us if you happen to find legitimate articles
    rejected because of that check.

    Also, does it concern many posters or always the same ones? Maybe they
    are all using the same newsreader, which should be fixed directly.
    If that's the case, they should be made aware of that bug in their
    messages (by responding to them in the newsgroup or by mail if they
    provide an address).

    --
    Julien ÉLIE

    "Where do you think you are?"
    "Where I am, I know." (Astérix)
