• Stripping html using mutt

    From bob prohaska@21:1/5 to All on Sun Oct 10 01:44:22 2021
    I use mutt via ssh and neither need nor want MIME enhancements,
    just the text. Can mutt display the text portion of the message
    alone? If the text is of interest, I can always go back for the
    formatting and MIME enhancements. It's common these days to get
    a few words of meaningful message buried in kilobytes of HTML.

    Thanks for reading, and any suggestions.

    bob prohaska

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Roger Bell_West@21:1/5 to bob prohaska on Sun Oct 10 03:18:39 2021
    On 2021-10-10, bob prohaska wrote:
    I use mutt via ssh and neither need nor want MIME enhancements,
    just the text. Can mutt display the text portion of the message
    alone? If the text is of interest, I can always go back for the
    formatting and MIME enhancements. It's common these days to get
    a few words of meaningful message buried in kilobytes of HTML.

    This sounds like what mutt does already: display the plain text, let
    you know the other parts are there. If you want the useful content out
    of an HTML message,

    auto_view text/html

    will use your mailcap (and a text-mode web browser such as elinks) to
    display HTML inline as though it were useful text. Searching for
    auto_view in the manual should be helpful.

  • From Eric Pozharski@21:1/5 to All on Tue Oct 12 07:39:28 2021
    with <20211010041304.459075189967738@firedrake.org> Roger Bell_West
    wrote:
    On 2021-10-10, bob prohaska wrote:

    I use mutt via ssh and neither need nor want MIME enhancements, just
    the text. Can mutt display the text portion of the message alone? If
    the text is of interest, I can always go back for the formatting and
    MIME enhancements. It's common these days to get a few words of
    meaningful message buried in kilobytes of HTML.

    This sounds like what mutt does already: display the plain text, let
    you know the other parts are there. If you want the useful content out
    of an HTML message,

    auto_view text/html

    will use your mailcap (and a text-mode web browser such as elinks) to
    display HTML inline as though it were useful text. Searching for
    auto_view in the manual should be helpful.

    Also 'alternative_order' might be needed (unfortunately, this setting
    is somewhat vaguely documented, and I haven't bothered to find out what
    the defaults are). Or read the whole story in the manual; search for
    "MIME Multipart/Alternative".

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom

  • From bob prohaska@21:1/5 to Eric Pozharski on Wed Oct 13 01:01:27 2021
    Eric Pozharski <whynot@pozharski.name> wrote:
    with <20211010041304.459075189967738@firedrake.org> Roger Bell_West
    wrote:
    On 2021-10-10, bob prohaska wrote:

    I use mutt via ssh and neither need nor want MIME enhancements, just
    the text. Can mutt display the text portion of the message alone? If
    the text is of interest, I can always go back for the formatting and
    MIME enhancements. It's common these days to get a few words of
    meaningful message buried in kilobytes of HTML.

    This sounds like what mutt does already: display the plain text, let
    you know the other parts are there. If you want the useful content out
    of an HTML message,

    auto_view text/html

    will use your mailcap (and a text-mode web browser such as elinks) to
    display HTML inline as though it were useful text. Searching for
    auto_view in the manual should be helpful.

    Also 'alternative_order' might be needed (unfortunately, this setting
    is somewhat vaguely documented, and I haven't bothered to find out what
    the defaults are). Or read the whole story in the manual; search for
    "MIME Multipart/Alternative".


    I'm now reduced to reading the mutt manual 8-)

    I was hopeful there might be a switch in mutt that strips markup.
    Invoking a proper html interpreter is more than I think I need.

    Thanks for replying,

    bob prohaska

  • From bob prohaska@21:1/5 to All on Wed Oct 13 15:59:17 2021
    A bit of searching found these instructions for invoking lynx automatically:

    https://blog.deadlypenguin.com/2009/04/21/mutt-and-lynx/

    It seems to work, but acts automatically. The whole (and possibly futile)
    point of my enterprise is to avoid involuntary invocation of additional software while viewing untrusted email.

    Is there some way to at least give myself a choice? I tried deleting
    auto_view from the .muttrc line, but that triggered an error message.
    Is there a command that prompts for permission?

    Automatically stripping html would be ideal, followed by an option to
    invoke an html viewer.

    Thanks for reading!

    bob prohaska

  • From Rich@21:1/5 to bob prohaska on Wed Oct 13 16:53:06 2021
    bob prohaska <bp@www.zefox.net> wrote:
    A bit of searching found these instructions for invoking lynx automatically:

    https://blog.deadlypenguin.com/2009/04/21/mutt-and-lynx/

    It seems to work, but acts automatically. The whole (and possibly futile) point of my enterprise is to avoid involuntary invocation of additional software while viewing untrusted email.

    Mutt contains neither an HTML viewer nor an HTML stripper. Mutt's
    internal viewer simply shows the message part you've chosen to view
    as text data. If that message part is an HTML part, then you'd be
    viewing the raw HTML (HTML being one particular kind of "text" data).

    Is there some way to at least give myself a choice?

    You can set auto view to the text/plain content part, which will give
    you the plain text part of the message. Then you can use the "v"
    command to see the mime parts, and selectively view individual parts as desired.

    I tried deleting auto_view from the .muttrc line, but that triggered
    an error message. Is there a command that prompts for permission?

    I do not know of one built in to Mutt, but you could set up a bash
    script that is invoked by mutt, that asks permission (look up the
    "read" command and the -p option thereto) and then either invokes the
    HTML-to-text conversion program (lynx et al.) or does not invoke it,
    based upon your answer to the prompt.
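
    A minimal sketch of such a wrapper, written for plain sh rather than
    bash: printf followed by read stands in for "read -p", which not every
    /bin/sh supports. The names ask_yes_no, view_html and HTML_VIEWER are
    made up for illustration, and "lynx -force_html" is only one possible
    viewer command.

```shell
#!/bin/sh
# Ask before handing an HTML part to an external viewer.
# ask_yes_no, view_html and HTML_VIEWER are illustrative names, not mutt features.

# Print the question on stderr and read one line from stdin.
# Returns 0 only when the answer starts with y or Y.
ask_yes_no() {
    printf '%s [y/N] ' "$1" >&2
    read -r answer || return 1
    case $answer in
        [Yy]*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Run the viewer on the file only after an explicit "yes".
view_html() {
    if ask_yes_no "Open $1 in an HTML viewer?"; then
        # Left unquoted on purpose so a command with options splits into words.
        ${HTML_VIEWER:-lynx -force_html} "$1"
    else
        echo "Skipped: $1" >&2
    fi
}

# When called as a script with a file argument (as a mailcap entry would), act on it.
if [ "$#" -gt 0 ]; then
    view_html "$1"
fi
```

    Saved under a path of your choosing (say ~/bin/ask-html, a made-up
    name), it could be referenced from a mailcap line such as
    "text/html; ~/bin/ask-html %s; needsterminal". An entry without
    copiousoutput is not eligible for auto_view, so the prompt would only
    appear when the part is opened from the attachment menu.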

    Automatically stripping html would be ideal, followed by an option to
    invoke an html viewer.

    Note that "automatically stripping" would itself involve "involuntary invocation of additional software while viewing untrusted email",
    violating your wish not to do so.

  • From Eric Pozharski@21:1/5 to bob prohaska on Wed Oct 13 10:41:47 2021
    with <sk5b57$4l3$1@dont-email.me> bob prohaska wrote:
    Eric Pozharski <whynot@pozharski.name> wrote:
    with <20211010041304.459075189967738@firedrake.org> Roger Bell_West
    wrote:
    On 2021-10-10, bob prohaska wrote:

    I use mutt via ssh and neither need nor want MIME enhancements, just
    the text. Can mutt display the text portion of the message alone?
    *SKIP*
    This sounds like what mutt does already: display the plain text, let
    you know the other parts are there. If you want the useful content out
    of an HTML message,

    auto_view text/html

    will use your mailcap (and a text-mode web browser such as elinks) to
    display HTML inline as though it were useful text. Searching for
    auto_view in the manual should be helpful.

    Also 'alternative_order' might be needed
    *SKIP*
    Or, read whole story in the manual, search for "MIME
    Multipart/Alternative".
    I'm now reduced to reading the mutt manual 8-)

    Absolutely not. What you should read is mimeview(1), mailcap(5), and
    (darn, I've totally missed it ten years ago) mailcap.order(5).

    I was hopeful there might be a switch in mutt that strips markup.

    If mutt ever grew such a switch, that would be a good sign to start
    looking for alternatives.

    Invoking a proper html interpreter is more than I think I need.

    Web designers would disagree that lynx is a proper HTML interpreter, I guess.

    *CUT*

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom

  • From Eike Rathke@21:1/5 to All on Wed Oct 13 23:25:20 2021
    * bob prohaska, 2021-10-10 01:44 UTC:
    I use mutt via ssh and neither need nor want MIME enhancements,
    just the text. Can mutt display the text portion of the message
    alone?

    Yes it can. Note though that for mixed multipart messages often the
    text/plain part does not match the text/html part, especially in mails
    from shitty shops and "enterprise grade" mail systems. So it may be
    desirable to be able to choose which.

    In your muttrc have

    # use mailcap entry for defined types
    unset implicit_autoview
    unauto_view *
    auto_view text/html
    alternative_order text/plain text text/html

    and in ~/.mailcap have

    text/html; /usr/bin/elinks -localhost 1 -no-connect 1 -force-html -dump %s; copiousoutput; description=HTML Text; nametemplate=%s.html

    (all on one line).

    Install the elinks package. The muttrc alternative_order determines
    which part is preferably displayed. The mailcap entry produces a textual
    view of the text/html part if there is one present and that then is
    displayed by mutt. In the index view or while viewing a message you can
    still press 'v' and from the multiparts select either the text/plain or
    text/html part to view.

    Eike

    --
    OpenPGP/GnuPG encrypted mail preferred in all private communication.
    GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A
    Use LibreOffice! https://www.libreoffice.org/

  • From Eike Rathke@21:1/5 to All on Wed Oct 13 23:27:37 2021
    * Eike Rathke, 2021-10-13 23:25 UTC:
    and in ~/.mailcap have
    text/html; /usr/bin/elinks ...
    Install the elinks package.

    Oh and btw, using elinks here because it has decent HTML table
    handling, which lynx does not have at all.

    Eike

    --
    OpenPGP/GnuPG encrypted mail preferred in all private communication.
    GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A
    Use LibreOffice! https://www.libreoffice.org/

  • From bob prohaska@21:1/5 to Rich on Thu Oct 14 00:55:07 2021
    Rich <rich@example.invalid> wrote:

    Note that "automatically stripping" would itself involve "involuntary invocation of additional software while viewing untrusted email",
    violating your wish not to do so.

    Hoist by my own petard 8-)

    Can lynx be invoked from the view menu after selecting the subpart?

    The idea would be to view everything as plain text, then back up and
    apply lynx to the selected sub-part if it seems worthwhile.

    I can start lynx from the view menu, but it is oblivious to the
    selected subpart.

    Thanks for your patience!

    bob prohaska

  • From Eric Pozharski@21:1/5 to bob prohaska on Thu Oct 14 10:51:03 2021
    with <sk7v5a$pjf$1@dont-email.me> bob prohaska wrote:
    Rich <rich@example.invalid> wrote:

    *SKIP*
    Can lynx be invoked from the view menu after selecting the subpart?

    Yes.

    The idea would be to view everything as plain text, then back up and
    apply lynx to the selected sub-part if it seems worthwhile.

    Yes, see 'alternative_order'.

    I can start lynx from the view menu, but it is oblivious to the
    selected subpart.

    This description isn't clear, however it (still) suggests your mailcap
    setup isn't in desired state. I just found out The Mailcap Mechanism is
    fscked up with (unknown yet) additions on part of (unknown yet)
    distribution. That poses Teh Question: can we stop bitching around and
    start diagnosing?
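
    One possible first step in diagnosing a mailcap setup is to list which
    text/html entries a file contains and in what order, since mutt takes
    the first matching entry and, for auto_view, needs one flagged
    copiousoutput. The sketch below assumes one entry per line (no
    backslash continuations) and ignores wildcard types such as text/*;
    list_html_entries is a made-up name and ~/.mailcap an assumed default.

```shell
#!/bin/sh
# List every text/html entry of a mailcap file with its line number,
# and say whether it carries the copiousoutput flag.
# list_html_entries is an illustrative name; ~/.mailcap is the assumed default.
list_html_entries() {
    awk '
        tolower($0) ~ /^[ \t]*text\/html[ \t]*;/ {
            flag = (tolower($0) ~ /copiousoutput/) ? "copiousoutput" : "interactive"
            printf "%d: [%s] %s\n", NR, flag, $0
        }
    ' "${1:-$HOME/.mailcap}"
}
```

    Running list_html_entries with no argument inspects ~/.mailcap; passing
    a path inspects another file, for instance a system-wide mailcap where
    one exists. Mutt's mailcap_path setting decides which files are
    actually consulted.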

    *CUT*

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom

  • From bob prohaska@21:1/5 to Eike Rathke on Fri Oct 15 01:21:18 2021
    Eike Rathke <erack+nutznetz.p@posteo.de> wrote:
    * bob prohaska, 2021-10-10 01:44 UTC:
    I use mutt via ssh and neither need nor want MIME enhancements,
    just the text. Can mutt display the text portion of the message
    alone?

    Yes it can. Note though that for mixed multipart messages often the text/plain part does not match the text/html part, especially in mails
    from shitty shops and "enterprise grade" mail systems. So it may be
    desirable to be able to choose which.

    In your muttrc have

    # use mailcap entry for defined types
    unset implicit_autoview
    unauto_view *
    auto_view text/html
    alternative_order text/plain text text/html

    and in ~/.mailcap have

    text/html; /usr/bin/elinks -localhost 1 -no-connect 1 -force-html -dump %s; copiousoutput; description=HTML Text; nametemplate=%s.html

    (all on one line).


    This combination seems to work nicely. If I just select the whole
    message and hit return, mutt displays the plain text. If I use v
    to list the attachments, select text/html and hit return, the
    browser fires up and shows me the formatted text. That's a bit
    nicer than I was originally looking for.

    Install the elinks package. The muttrc alternative_order determines
    which part is preferably displayed. The mailcap entry produces a textual
    view of the text/html part if there is one present and that then is
    displayed by mutt. In the index view or while viewing a message you can
    still press 'v' and from the multiparts select either the text/plain or text/html part to view.


    elinks is turning out to be a problem. It built and installed without
    complaint, but doesn't run correctly. This is on a Raspberry Pi 2B
    running FreeBSD 12.2. The ports tree is stale; I'll update it and try
    again later. For now lynx is good enough.
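
    When a renderer misbehaves, it can help to run it on a tiny sample file
    outside of mutt, to separate a broken build from a mailcap problem. A
    minimal sketch, assuming the renderer takes the file name as its last
    argument; smoke_test is a made-up name.

```shell
#!/bin/sh
# Run an HTML-to-text command on a tiny sample file, outside of mutt.
# smoke_test is an illustrative name; the renderer command is passed as arguments.
smoke_test() {
    dir=$(mktemp -d "${TMPDIR:-/tmp}/htmltest.XXXXXX") || return 1
    printf '<html><body><p>Hello from <b>HTML</b></p></body></html>\n' > "$dir/sample.html"
    # Append the sample file to whatever command was given and run it.
    "$@" "$dir/sample.html"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

    For example, "smoke_test lynx -dump -force_html" or
    "smoke_test elinks -localhost 1 -no-connect 1 -force-html -dump" should
    print a line of readable text and exit with status 0 if the renderer
    works.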

    Thank you very much!

    bob prohaska

  • From Ant@21:1/5 to bob prohaska on Fri Oct 15 05:45:12 2021
    bob prohaska <bp@www.zefox.net> wrote:
    Eike Rathke <erack+nutznetz.p@posteo.de> wrote:
    * bob prohaska, 2021-10-10 01:44 UTC:
    I use mutt via ssh and neither need nor want MIME enhancements,
    just the text. Can mutt display the text portion of the message
    alone?

    Yes it can. Note though that for mixed multipart messages often the text/plain part does not match the text/html part, especially in mails
    from shitty shops and "enterprise grade" mail systems. So it may be desirable to be able to choose which.

    In your muttrc have

    # use mailcap entry for defined types
    unset implicit_autoview
    unauto_view *
    auto_view text/html
    alternative_order text/plain text text/html

    and in ~/.mailcap have

    text/html; /usr/bin/elinks -localhost 1 -no-connect 1 -force-html -dump %s; copiousoutput; description=HTML Text; nametemplate=%s.html

    (all on one line).


    This combination seems to work nicely. If I just select the whole
    message and hit return, mutt displays the plain text. If I use v
    to list the attachments, select text/html and hit return, the
    browser fires up and shows me the formatted text. That's a bit
    nicer than I was originally looking for.

    Install the elinks package. The muttrc alternative_order determines
    which part is preferably displayed. The mailcap entry produces a textual view of the text/html part if there is one present and that then is displayed by mutt. In the index view or while viewing a message you can still press 'v' and from the multiparts select either the text/plain or text/html part to view.


    elinks is turning out to be a problem. It built and installed without complaint, but doesn't run correctly. This is on a Raspberry Pi2B running FreeBSD 12.2. The ports tree is stale, I'll update it and try again later. For now lynx is good enough.

    Bob, try Links. eLinks is based on it. :)
    --
    Doyers! :D So many brokenesses, oldnesses, leaks, illnesses, videos, spams, issues, software updates, games, sins, tiredness, busyness, etc. Dang colony life! D:
    Note: A fixed width font (Courier, Monospace, etc.) is required to see this signature correctly.
       /\___/\   Ant(Dude) @ http://aqfl.net & http://antfarm.home.dhs.org.
      / /\ /\ \            Please nuke ANT if replying by e-mail.
     | |o   o| |
        \ _ /
         ( )

  • From bob prohaska@21:1/5 to Ant on Sat Oct 16 02:01:38 2021
    Ant <ant@zimage.comant> wrote:
    For now lynx is good enough.

    Bob, try Links. eLinks is based on it. :)

    It's in the FreeBSD ports collection, so that should be easy.

    A browser is really too capable for my purposes. Browsers, AIUI,
    can spawn subordinate programs on the user's behalf, which I'd
    like to avoid.

    There is a port called html2text, which I know nothing about.
    If true to its name, that might come closer to scraping off
    the tags so I can see what the email tries to do, without it
    being able to actually make good on the goal.

    This thread has taught me the essentials, which turn out to be
    rather arcane. Now I have to decide just how paranoid to be
    about unsolicited email.

    Thanks to all who've educated me!

    bob prohaska

  • From Jorgen Grahn@21:1/5 to bob prohaska on Sat Oct 16 20:13:39 2021
    On Sat, 2021-10-16, bob prohaska wrote:
    Ant <ant@zimage.comant> wrote:
    For now lynx is good enough.

    Bob, try Links. eLinks is based on it. :)

    It's in the FreeBSD ports collection, so that should be easy.

    A browser is really too capable for my purposes. Browsers, AIUI,
    can spawn subordinate programs on the user's behalf, which I'd
    like to avoid.

    Well, you need a secure browser which doesn't e.g. let mails "phone
    home". I don't know which of the popular text-mode browsers (lynx,
    links, elinks, w3m; any others?) do that well.

    There is a port called html2text, which I know nothing about.
    If true to its name, that might come closer to scraping off
    the tags so I can see what the email tries to do, without it
    being able to actually make good on the goal.

    Most HTML mails would be quite unreadable if you just stripped off the
    tags. But I see what you mean: a program which just takes an HTML file
    and renders it as text is less likely to let the mail /do/ anything,
    compared to a browser, even a browser in "dump" mode.
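
    To illustrate why bare tag stripping reads badly, here is a deliberately
    naive one-line filter. It is a sketch only, not how html2text works:
    strip_tags is a made-up name, and the filter ignores tags that span
    lines, leaves entities such as &amp; untouched, and keeps whatever sits
    between <style> or <script> tags.

```shell
#!/bin/sh
# Naive tag stripper: delete anything that looks like <...> within a single line.
# strip_tags is an illustrative name; this is not how html2text works internally.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}
```

    Feeding it the line "<style>p { color: red }</style><p>Hi</p>" yields
    "p { color: red }Hi": the tags are gone, but the CSS is now mixed into
    the text.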

    Personally I let mutt call w3m to render HTML mail, and hope it
    protects my privacy. I don't look at the text version of the mail
    (i.e. the other half of the multipart/alternative) since it's
    usually useless. Then I curse w3m because it doesn't show the
    links in the mail, and so I end up using mutt's view-text command
    to search the HTML (and pages of useless CSS) for that link I
    want. The whole thing is less than ideal, but if the sender
    cannot bother to communicate well, perhaps it wasn't so important
    that I read their mails after all.

    This thread has taught me the essentials, which turn out to be
    rather arcane. Now I have to decide just how paranoid to be
    about unsolicited email.

    Thanks to all who've educated me!

    bob prohaska

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .

  • From Eike Rathke@21:1/5 to All on Sat Oct 16 23:30:40 2021
    * Jorgen Grahn, 2021-10-16 20:13 UTC:
    Well, you need a secure browser which doesn't e.g. let mails "phone
    home". I don't know which of the popular text-mode browsers (lynx,
    links, elinks, w3m; any others?) do that well.

    The elinks command line I posted (should work for links as well)
    prevents exactly that with the options -localhost 1 -no-connect 1.

    Eike

    --
    OpenPGP/GnuPG encrypted mail preferred in all private communication.
    GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A
    Use LibreOffice! https://www.libreoffice.org/
