• MIME header question

    From Adam H. Kerman@21:1/5 to All on Mon Jan 17 19:21:24 2022
    I asked in another newsgroup.

    Eduardo, in your interpretation of the RFCs is declaring 7 bit on
    Content Transfer Encoding in conflict with declaring UTF-8 as the
    character set?

    Logically it seems to me that the two headers should be set jointly and
    not UTF-8 without the use of non-ASCII characters if transfer encoding
    is marked as 7 bit.

    pine/alpine have always parsed for the lowest denomination character set
    despite the user's settings. If there are no non-ASCII characters, then
    the character set marking is US-ASCII and transfer encoding 7 bit.
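
    The behavior described above can be sketched roughly as follows. This is
    only an illustration, not Alpine's actual code: the function name
    pick_charset, the candidate list (us-ascii, iso-8859-1, utf-8), and the
    choice of "8bit" for anything beyond ASCII are all assumptions made for
    the example.

```python
def pick_charset(text: str) -> tuple[str, str]:
    """Return (charset, content_transfer_encoding) for an outgoing body.

    Hypothetical helper: tries the narrowest charset first and stops at
    the first one that can represent every character in the text.
    """
    for charset in ("us-ascii", "iso-8859-1", "utf-8"):
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # charset too narrow, try the next one
        # Assumption: only pure ASCII is labeled 7bit; a real client
        # might use quoted-printable or base64 instead of 8bit.
        cte = "7bit" if charset == "us-ascii" else "8bit"
        return charset, cte
    raise ValueError("text cannot be encoded (for example, lone surrogates)")


print(pick_charset("plain text"))
print(pick_charset("café"))
```

    With this sketch, a message is only labeled UTF-8 when it contains a
    character that neither US-ASCII nor ISO-8859-1 can represent.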

    I don't know of another client that performs that parsing.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Eduardo Chappa@21:1/5 to Adam H. Kerman on Mon Jan 17 13:15:18 2022
    On Mon, 17 Jan 2022, Adam H. Kerman wrote:

    Eduardo, in your interpretation of the RFCs is declaring 7 bit on
    Content Transfer Encoding in conflict with declaring UTF-8 as the
    character set?

    Dear Adam,

    I do not think there is a conflict here. Let me say it in a different way.
    The Content-Transfer-Encoding here just tells you how to process the data.
    It could have other values, such as base64, or quoted-printable, so the
    value tells you what to do with the data. In the case of 7 bit just
    interpret that 7 bit in the charset, in this case utf-8, which actually
    means US-ASCII. In other words

    7bit intersected with utf-8 = us-ascii,

    so you could write us-ascii for the charset in this case, or utf-8. It
    seems more like a question of style, not of correctness.
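
    The "7bit intersected with utf-8 = us-ascii" statement can be checked
    with a small sketch. The helper name is_seven_bit is made up for this
    example; the point is only that UTF-8 octets below 0x80 are exactly the
    US-ASCII octets, so either charset label decodes them the same way.

```python
def is_seven_bit(octets: bytes) -> bool:
    """True if no octet has the eighth bit set (hypothetical helper)."""
    return all(b < 0x80 for b in octets)


ascii_only = "Plain ASCII body".encode("utf-8")
accented = "naïve".encode("utf-8")

# A UTF-8 body without non-ASCII characters is byte-for-byte US-ASCII.
assert is_seven_bit(ascii_only)
assert ascii_only.decode("utf-8") == ascii_only.decode("us-ascii")

# One non-ASCII character is enough to need the eighth bit, so such a
# body could not honestly be labeled 7bit without further encoding.
assert not is_seven_bit(accented)
```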

    Having said that, I prefer to use us-ascii in this case because more
    clients are likely to understand us-ascii instead of utf-8. Alpine did not
    get utf-8 handling until very late, while many other clients understood
    utf-8, so it was better for pine users to receive a message 7bit in
    us-ascii than 7-bit in utf-8, because Pine could not handle the latter.

    I doubt that there are Pine users still out there (although I can always
    be proven wrong) but it is better to be conservative here in my opinion.

    --
    Eduardo
    https://tinyurl.com/yc377wlh (web)
    http://repo.or.cz/alpine.git (Git)

  • From Adam H. Kerman@21:1/5 to Eduardo Chappa on Mon Jan 17 21:49:50 2022
    Eduardo Chappa <chappa@washington.edu> wrote:
    On Mon, 17 Jan 2022, Adam H. Kerman wrote:

    Eduardo, in your interpretation of the RFCs is declaring 7 bit on
    Content Transfer Encoding in conflict with declaring UTF-8 as the
    character set?

    I do not think there is a conflict here. Let me say it in a different way.
    The Content-Transfer-Encoding here just tells you how to process the data.
    It could have other values, such as base64, or quoted-printable, so the
    value tells you what to do with the data. In the case of 7 bit just
    interpret that 7 bit in the charset, in this case utf-8, which actually
    means US-ASCII. In other words

    7bit intersected with utf-8 = us-ascii,

    so you could write us-ascii for the charset in this case, or utf-8. It
    seems more like a question of style, not of correctness.

    Thanks. This is why I asked you. I thought 7 bit was about the
    communication channel and not the capabilities of the client and display
    on the other end.

    If the display interprets MIME headers, does that mean the same 7-bit
    character is displayed ignoring the eighth bit or two characters are
    displayed in a UTF-8 double byte character? All this time, when my
    terminal emulation translation didn't match what was received (I have to
    change it manually), I thought I was changing the assumed character set,
    not the transfer encoding toggle.

    Having said that, I prefer to use us-ascii in this case because more
    clients are likely to understand us-ascii instead of utf-8. Alpine did not
    get utf-8 handling until very late, while many other clients understood
    utf-8, so it was better for pine users to receive a message 7bit in
    us-ascii than 7-bit in utf-8, because Pine could not handle the latter.

    I doubt that there are Pine users still out there (although I can always
    be proven wrong) but it is better to be conservative here in my opinion.

    I certainly agree with you.

  • From Eduardo Chappa@21:1/5 to Adam H. Kerman on Mon Jan 17 16:29:21 2022
    On Mon, 17 Jan 2022, Adam H. Kerman wrote:

    Thanks. This is why I asked you. I thought 7 bit was about the
    communication channel and not the capabilities of the client and display
    on the other end.

    If the display interprets MIME headers, does that mean the same 7-bit
    character is displayed ignoring the eighth bit or two characters are
    displayed in a UTF-8 double byte character? All this time, when my
    terminal emulation translation didn't match what was received (I have to
    change it manually), I thought I was changing the assumed character set,
    not the transfer encoding toggle.

    Dear Adam,

    I never used the word display to refer to how the message actually
    displays on the screen. The headers tell the client what to do internally.
    For example, if the content-transfer-encoding were base64, then this tells
    the client to decode the encoded blob. Same with 7bit. It just tells the
    client to interpret the 7-bit data it finds in the given charset. This
    will become a character on screen later on.
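
    The two internal steps described above (undo the transfer encoding, then
    interpret the octets in the declared charset) might look like this with
    Python's standard email package. The function name decode_body and the
    two sample messages are made up for the example, and the fallback to
    us-ascii when no charset is declared is an assumption of this sketch.

```python
import base64
from email import message_from_bytes


def decode_body(raw: bytes) -> str:
    """Decode a single-part text message in two steps (illustrative)."""
    msg = message_from_bytes(raw)
    # Step 1: undo the Content-Transfer-Encoding. For base64 this decodes
    # the blob; for 7bit the octets are returned unchanged.
    octets = msg.get_payload(decode=True)
    # Step 2: interpret the octets in the declared charset.
    charset = msg.get_content_charset() or "us-ascii"
    return octets.decode(charset)


RAW_7BIT = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 7bit\r\n"
    b"\r\n"
    b"hello\r\n"
)
RAW_B64 = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + base64.b64encode("héllo".encode("utf-8")) + b"\r\n"
)

print(repr(decode_body(RAW_7BIT)))
print(repr(decode_body(RAW_B64)))
```

    Neither step says anything about the terminal; what reaches the screen
    depends on how the client later converts the decoded text for display.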

    I have to acknowledge that I do not understand completely what you are
    saying. There is no "transfer encoding toggle" in Alpine, nor is there an
    "assumed character set", so I am not exactly sure what you are referring
    to, but if I understand you correctly, you are asking what happens to
    multibyte characters. Unless you make changes to the default configuration
    in Alpine, Alpine will send to the terminal utf-8 codes, which the
    terminal will display if it is utf-8 capable. Do you have Alpine and your
    terminal configured differently?

    --
    Eduardo
    https://tinyurl.com/yc377wlh (web)
    http://repo.or.cz/alpine.git (Git)

  • From John Levine@21:1/5 to chappa@washington.edu on Mon Jan 17 23:36:00 2022
    It appears that Eduardo Chappa <chappa@washington.edu> said:
    On Mon, 17 Jan 2022, Adam H. Kerman wrote:

    Eduardo, in your interpretation of the RFCs is declaring 7 bit on
    Content Transfer Encoding in conflict with declaring UTF-8 as the
    character set?

    I'm not Eduardo, but it's clearly not valid. RFC 2045 says

    An encoding type of 7BIT requires that the body
    is already in a 7bit mail-ready representation.

    Needless to say, UTF-8 is not 7bit mail-ready. I can believe that
    some mail programs have tried to make sense of this, but it's utterly
    ad-hoc and whatever they do with it is wrong. Maybe stuff declared to
    be UTF-8 is in fact just ASCII in a particular message, but I wouldn't
    count on it.
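
    For reference, RFC 2045 defines 7bit data as lines of at most 998 octets
    separated by CRLF, with no octets above 127, no NULs, and CR and LF
    appearing only as part of CRLF. A minimal check along those lines is
    sketched below; the function name is_7bit_body is made up for the
    example, and it inspects only the octets of one body, not the charset
    label attached to it.

```python
def is_7bit_body(body: bytes) -> bool:
    """Check a body against the RFC 2045 description of 7bit data."""
    # No NUL octets and nothing with the eighth bit set.
    if any(b == 0 or b > 127 for b in body):
        return False
    for line in body.split(b"\r\n"):
        # At most 998 octets between CRLF separators.
        if len(line) > 998:
            return False
        # After splitting on CRLF, any leftover CR or LF is a bare one.
        if b"\r" in line or b"\n" in line:
            return False
    return True


print(is_7bit_body(b"hello\r\n"))
print(is_7bit_body("héllo\r\n".encode("utf-8")))
```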

    I doubt that there are Pine users still out there (although I can always
    be proven wrong) but it is better to be conservative here in my opinion.

    Probably not, although there are plenty of us Alpine users.

    R's,
    John


    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

  • From Adam H. Kerman@21:1/5 to Eduardo Chappa on Tue Jan 18 03:52:58 2022
    Eduardo Chappa <chappa@washington.edu> wrote:
    On Mon, 17 Jan 2022, Adam H. Kerman wrote:

    Thanks. This is why I asked you. I thought 7 bit was about the
    communication channel and not the capabilities of the client and display
    on the other end.

    If the display interprets MIME headers, does that mean the same 7-bit
    character is displayed ignoring the eighth bit or two characters are
    displayed in a UTF-8 double byte character? All this time, when my
    terminal emulation translation didn't match what was received (I have to
    change it manually), I thought I was changing the assumed character set,
    not the transfer encoding toggle.

    I never used the word display to refer to how the message actually
    displays on the screen. The headers tell the client what to do internally.
    For example, if the content-transfer-encoding were base64, then this tells
    the client to decode the encoded blob. Same with 7bit. It just tells the
    client to interpret the 7-bit data it finds in the given charset. This
    will become a character on screen later on.

    I have to acknowledge that I do not understand completely what you are
    saying. There is no "transfer encoding toggle" in Alpine,

    Sorry to be unclear. I just meant that the standard allows a choice of
    encoding schemes, as you've been discussing.

    nor is there an "assumed character set",

    The user can name a character set in .pinerc. Isn't that for the composer
    as well as the display? If there are no non-ASCII characters, the MIME
    header declares ASCII no matter how the user set this feature.

    I liked the fact that alpine declares a lowest denomination character
    set.

    so I am not exactly sure what you are referring
    to, but if I understand you correctly, you are asking what happens to
    multibyte characters. Unless you make changes to the default configuration
    in Alpine, Alpine will send to the terminal utf-8 codes, which the
    terminal will display if it is utf-8 capable. Do you have Alpine and your
    terminal configured differently?

    I usually have to change the translation between ISO-8859-1 and UTF-8
    depending on what Usenet article I'm looking at. alpine isn't my
    newsreader. Also, in followup, I liked to get rid of the nonprinting
    characters; translation mismatch can make them visible. I post in ASCII
    whenever possible.
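
    The kind of translation mismatch mentioned above can be reproduced in a
    few lines. The helper name misread is made up for this sketch; it
    encodes text in one charset and decodes the octets as if they were in
    another, which is roughly what happens when the terminal translation
    does not match the article.

```python
def misread(text: str, sent_as: str, shown_as: str) -> str:
    """Encode in one charset, then decode as another (illustrative)."""
    return text.encode(sent_as).decode(shown_as, errors="replace")


# UTF-8 read as ISO-8859-1: each non-ASCII character turns into two.
print(misread("é", "utf-8", "iso-8859-1"))

# ISO-8859-1 read as UTF-8: the lone octet is not valid UTF-8, so it is
# replaced with U+FFFD.
print(misread("é", "iso-8859-1", "utf-8"))

# Pure ASCII survives either way, which is one argument for posting in
# ASCII whenever possible.
print(misread("plain", "utf-8", "iso-8859-1"))
```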
