• question:
    of all NBSP ZWNJ, why?

    From Eli the Bearded@21:1/5 to All on Thu Mar 25 23:39:00 2021
    I read HTML email that comes to me with the help of lynx. I've noticed
    some messages have one or more <DIV>s filled with alternating
    non-breaking white space / zero-width non-joiner. What's the point?

    Example from a message today, still quoted-printable encoded, and
    including the <P> before and after the <DIV> (and tracking pixel mucked
    with):

    <p style=3D"max-height: 0; font-size: 0; l=
    ine-height: 0; margin: 0; overflow: hidden;">Assembly instructions, example=
    code + more!</p><div style=3D"display: none; width: 0px; height: 0px; max-= height: 0px; overflow: hidden;">&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= 
=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C&nbsp;=E2=80= =8C&nbsp;=E2=80=8C&nbsp;=E2=80=8C</div><p style=3D"max-height: 0; font-size=
    : 0; line-height: 0; margin: 0; overflow: hidden;"><img alt=3D"" border=3D"=
    0" src=3D"https://ned.soundestlink.com/transactional/track/abcdef0123456789= 123b4186?signature=3Dabcdef0123456789e3d61b0a713e17907cf67babcdef0123456789= 57737ef591" width=3D"1" height=3D"1" /></p>

    Note that =E2=80=8C is the octet-by-octet quoted-printable version of
    the U+200C codepoint UTF-8 encoded.
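
    A minimal Python sketch of that decoding, assuming only the standard
    library. The sample bytes are copied from the example above, including
    one soft line break ("=" at end of line) of the kind seen in it:

```python
import quopri
import unicodedata

# One "&nbsp;" plus one QP-encoded character, split by a soft line break
# ("=" followed by a newline), as in the quoted message.
qp_fragment = b"&nbsp;=E2=80=\n=8C"

# Undo quoted-printable, then interpret the resulting octets as UTF-8.
decoded = quopri.decodestring(qp_fragment).decode("utf-8")

last = decoded[-1]
print(f"U+{ord(last):04X}", unicodedata.name(last))
# prints: U+200C ZERO WIDTH NON-JOINER
```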

    (Lynx does not honor "display: none" and these weird blocks of
    whitespace show up in various post-lynx operations, like quoted plain text replies.)
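
    One possible cleanup for such post-lynx text, sketched in Python. This
    assumes the filler survives as literal U+00A0 and U+200C characters;
    what lynx actually emits depends on its character set settings, and
    strip_filler is a hypothetical helper name, not an existing tool:

```python
import re

# Runs of two or more NBSP (U+00A0) / ZWNJ (U+200C) characters, in any order.
FILLER_RUN = re.compile("[\u00a0\u200c]{2,}")


def strip_filler(text: str) -> str:
    """Remove NBSP/ZWNJ filler runs; a lone NBSP or ZWNJ is left alone."""
    return FILLER_RUN.sub("", text)


# Made-up sample: preview text followed by five NBSP/ZWNJ pairs.
sample = "Assembly instructions, example code + more!" + "\u00a0\u200c" * 5 + "\nHello"
print(repr(strip_filler(sample)))
```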

    ZWNJ is intended to suppress ligature output when it might otherwise be
    used by a naive typesetter. I'm unclear what effect is expected when
    joining whitespace to whitespace.

    Elijah
    ------
    using &zwnj; would be shorter than the QP version
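
    A quick check of that size claim, as a Python sketch using only the
    standard library; both spellings resolve to the same character:

```python
import html
import quopri

entity = "&zwnj;"        # HTML named character reference
qp_bytes = b"=E2=80=8C"  # quoted-printable form of the UTF-8 octets

from_entity = html.unescape(entity)
from_qp = quopri.decodestring(qp_bytes).decode("utf-8")

print(from_entity == from_qp)      # prints: True
print(len(entity), len(qp_bytes))  # prints: 6 9
```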

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David E. Ross@21:1/5 to Eli the Bearded on Thu Mar 25 23:36:42 2021
    On 3/25/2021 4:39 PM, Eli the Bearded wrote:
    I read HTML email that comes to me with the help of lynx. I've noticed
    some messages have one or more <DIV>s filled with alternating
    non-breaking white space / zero-width non-joiner. What's the point?

    Unfortunately, E-mail applications often generate very poor HTML. My
    most recent analysis was done almost two years ago. At that time, I
    found that HTML-formatted messages contain an average of 7.3 HTML syntax
    errors per KB of file size. I conducted similar analyses in 2008, 2010,
    2012, and 2015 and found little improvement. Actually, my 2019 analysis
    showed a greater number of errors than the 2015 analysis. In my
    analyses, I did not address how the HTML errors created by a particular
    E-mail application affect a different E-mail application.

    My analyses also addressed bloat, which is the increase in the size of
    an HTML-formatted message to convey the same textual content as a
    plain-text message. Bloat can result from such things as (1) what you
    are seeing, (2) 2-part messages that contain both plain-text and
    HTML-formatted parts, and (3) unnecessary but non-erroneous HTML markup.
    Bloat factor is a measure of bloat, computed by dividing the size of
    the HTML-formatted message by the size of the plain-text message that has
    the same content. In 2019, I found the average bloat factor for
    HTML-formatted messages was 16.0 times the size of the equivalent
    plain-text content. This was a larger bloat factor than in any of the
    four prior analyses. That is, newer E-mail applications are producing
    worse bloat than older applications.
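
    The bloat factor as defined above can be sketched in Python. How sizes
    were measured in the analysis (headers, transfer encoding) is not
    stated here, so this hypothetical bloat_factor helper simply compares
    byte lengths, and the sample strings are made up:

```python
def bloat_factor(html_message: bytes, plain_message: bytes) -> float:
    """Size of the HTML-formatted message divided by the size of the
    plain-text message carrying the same content."""
    if not plain_message:
        raise ValueError("plain-text message must not be empty")
    return len(html_message) / len(plain_message)


# Made-up sample: the same sentence as plain text and wrapped in markup
# plus hidden NBSP/ZWNJ filler like the block discussed in this thread.
plain = b"Assembly instructions, example code + more!"
html_version = (
    b'<p style="margin: 0;">' + plain + b"</p>"
    + b'<div style="display: none;">' + b"&nbsp;\xe2\x80\x8c" * 100 + b"</div>"
)

print(f"bloat factor: {bloat_factor(html_version, plain):.1f}")
```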

    A more detailed report of my 2019 analysis is at <http://www.rossde.com/internet/ASCIIvsHTML.html>.

    --

    David E. Ross
    <http://www.rossde.com/>.

    The only reason we have so many laws is that not enough people will do
    the right thing. (© 1997 by David Ross)

  • From Phillip Helbig (undress to reply@21:1/5 to *@eli.users.panix.com on Fri Mar 26 09:21:34 2021
    In article <eli$2103251932@qaz.wtf>, Eli the Bearded
    <*@eli.users.panix.com> writes:

    I read HTML email that comes to me with the help of lynx. I've noticed
    some messages have one or more <DIV>s filled with alternating
    non-breaking white space / zero-width non-joiner. What's the point?

    Short answer: the people sending the message don't understand the
    difference between content and presentation.

    using &zwnj; would be shorter than the QP version

    Yes, but brevity is the soul of wit, not of modern email.

  • From Jukka K. Korpela@21:1/5 to Eli the Bearded on Fri Mar 26 14:02:55 2021
    Eli the Bearded wrote:

    I read HTML email that comes to me with the help of lynx. I've noticed
    some messages have one or more <DIV>s filled with alternating
    non-breaking white space / zero-width non-joiner. What's the point?

    Without the styling, my guess would be: creation of an element with no
    visible content but some forced width, with the assumption that without
    intervening ZWNJ, a rendering engine might treat a sequence of NBSP as
    collapsible white space (an odd assumption, but some renderers do odd
    things).

    With the styling that makes the element non-rendered, zero width, zero
    height, I guess that guess was wrong. Or maybe they intentionally want
    non-CSS rendering to differ from CSS-enabled rendering.

  • From Lewis@21:1/5 to David E. Ross on Fri Mar 26 12:21:30 2021
    In message <s3jvdr$1ueq$1@gioia.aioe.org> David E. Ross <not_me@not_there.invalid> wrote:
    That is, newer E-mail applications are producing worse bloat than
    older applications.

    IME most HTML email is not created within an email application at all,
    it is created elsewhere and then simply spammed out through email.

    It's a shame that email lists allow HTML messages at all, as so many are
    so badly formatted and email clients tend to lack the tools to deal with
    bad HTML.

    --
    A golem of Margo? A Margolem.

  • From Eli the Bearded@21:1/5 to helbig@asclothestro.multivax.de on Fri Mar 26 18:33:15 2021
    In comp.infosystems.www.authoring.html,
    Phillip Helbig (undress to reply) <helbig@asclothestro.multivax.de> wrote:

    The rare nested comments format of From: line...

    Eli the Bearded <*@eli.users.panix.com> writes:
    I read HTML email that comes to me with the help of lynx. I've noticed
    some messages have one or more <DIV>s filled with alternating
    non-breaking white space / zero-width non-joiner. What's the point?
    Short answer: the people sending the message don't understand the
    difference between content and presentation.

    Yeah, well, if they understood (or cared) I expect they'd have a
    text/plain version, too, so I could ignore the text/html one.

    using &zwnj; would be shorter than the QP version
    Yes, but brevity is the soul of wit, not of modern email.

    It is amusing, in a sad way, that both Twitter and Mastodon send me
    email notifications that are each more than 40 times (and close to 50
    times) larger than the maximum message length in their services.

    Thanks everyone for the speculation about what's going on. It sounds
    like it is not some well-known trick but just naive blundering.

    Elijah
    ------
    or cargo-cult blundering
