• Erratic behaviour of "less -r" if run from the GUI.

    From Ottavio Caruso@21:1/5 to All on Thu Oct 20 14:46:27 2022
    Hi,

    I hope I can find the right language to make myself clear.

    I have accidentally discovered that I can read a pdf file from the
    command line just by using "less -r":

    $ less -r file.pdf

    works beautifully and even maintains some formatting.

    My shell is bash and my X terminal is mate-terminal: https://manpages.debian.org/bullseye/mate-terminal/mate-terminal.1.en.html

    If I launch from the terminal:
    $ mate-terminal --full-screen -x /usr/bin/less -r Downloads/08\ Karzoff.pdf

    This will open a new fullscreen window, rendering the pdf as intended. https://i.imgur.com/R2opBmj.png

    If I launch Caja (GUI file manager), right-click on the pdf > "open
    with other application" > "use a custom command" > mate-terminal
    --full-screen -x /usr/bin/less -r > Open

    this will display garbage, that is the pdf source code: https://i.imgur.com/eAynhee.png

    Any clue?




    --
    Ottavio Caruso

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Ottavio Caruso on Thu Oct 20 15:25:54 2022
    On Thu, 20 Oct 2022 14:46:27 +0000, Ottavio Caruso wrote:

    Hi,

    I hope I can find the right language to make myself clear.

    I have accidentally discovered that I can read a pdf file from the
    command line just by using "less -r":

    $ less -r file.pdf

    works beautifully and even maintains some formatting.

    My shell is bash and my X terminal is mate-terminal: https://manpages.debian.org/bullseye/mate-terminal/mate-terminal.1.en.html

    If I launch from the terminal:
    $ mate-terminal --full-screen -x /usr/bin/less -r Downloads/08\ Karzoff.pdf

    This will open a new fullscreen window, rendering the pdf as intended. https://i.imgur.com/R2opBmj.png

    If I launch Caja (GUI file manager), right-click on the pdf > "open
    with other application" > "use a custom command > mate-terminal
    --full-screen -x /usr/bin/less -r > Open

    this will display garbage, that is the pdf source code: https://i.imgur.com/eAynhee.png

    Any clue?

    Yes. Look to the behaviour of your mate-terminal, when launched from
    the desktop vs launched from Caja. The behaviour you see when launched
    from Caja is the expected behaviour: PDFs contain PDF language
    instructions, binary PDF instruction data (sometimes), optional binary
    display data, and optional display text. I would not expect less(1),
    when run in a terminal, to properly format the optional display text
    of a PDF and /not/ display (or attempt to display) any of the binary
    data or PDF instructions.

    The results you get when launching from Caja are what I expect to see
    from an attempt to less(1) a PDF file.

    FWIW, why aren't you using pdftotext(1), which is expressly designed
    to extract and format the display text of a PDF file?
    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Ottavio Caruso@21:1/5 to Lew Pitcher on Thu Oct 20 15:37:51 2022
    On 20/10/2022 15:25, Lew Pitcher wrote:
    I would not expect less(1), when run in a
    terminal, to properly format the optional display text of a PDF and
    /not/ display (or attempt to display) any of the binary data or PDF instructions.

    But it does, as evidenced by the screenshot in my OP.


    The results you get when launching from Caja is what I expect to see
    from an attempt to less(1) a PDF file.

    Again, see screenshot 1.


    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as
    "less -r", some control characters are still left over.

    Look to the behaviour of your mate-terminal, when launched from
    the desktop vs launched from Caja.

    I expect mate-terminal to behave the same way if launched from the
    terminal or from a file manager.

    My feeling is that the "-r" argument doesn't get processed if launched
    from the file manager.

    --
    Ottavio Caruso

  • From Keith Thompson@21:1/5 to Ottavio Caruso on Thu Oct 20 08:52:32 2022
    Ottavio Caruso <ottavio2006-usenet2012@yahoo.com> writes:
    I hope I can find the right language to make myself clear.

    I have accidentally discovered that I can read a pdf file from the
    command line just by using "less -r":

    $ less -r file.pdf

    works beautifully and even maintains some formatting.

    That's certainly not my experience.

    You probably have the $LESSOPEN environment variable set to invoke
    something like pdftotext on the input before displaying it.

    `man less` and search for "INPUT PREPROCESSOR".
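
For readers unfamiliar with the input preprocessor: when LESSOPEN starts with "|", less replaces %s with the file name, runs the resulting command, and displays that command's standard output in place of the file. A minimal sketch for running the same command by hand, to see what less would be shown. The function name preview_lessopen is hypothetical, and the other prefix forms described in less(1) are not handled:

```shell
# preview_lessopen FILE: run the LESSOPEN input pipe by hand on FILE.
# Hypothetical helper; only the plain "|command %s" form is handled.
preview_lessopen() (
    spec=${LESSOPEN-}
    case $spec in
        '|'*) spec=${spec#'|'} ;;          # drop the leading pipe marker
        *) echo "LESSOPEN is not an input pipe: '$spec'" >&2; exit 1 ;;
    esac
    case $spec in
        *%s*) ;;
        *) echo "LESSOPEN has no %s placeholder" >&2; exit 1 ;;
    esac
    before=${spec%%'%s'*}                  # text before the first %s
    after=${spec#*'%s'}                    # text after the first %s
    # Substitute "$1" for %s and pass the file name as a positional
    # parameter, so names containing spaces need no extra quoting.
    sh -c "${before}\"\$1\"${after}" sh "$1"
)
```

If this prints readable text for a PDF in the interactive shell, the formatting is coming from the preprocessor and not from less itself.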

    My shell is bash and my X terminal is mate-terminal: https://manpages.debian.org/bullseye/mate-terminal/mate-terminal.1.en.html

    If I launch from the terminal:
    $ mate-terminal --full-screen -x /usr/bin/less -r Downloads/08\ Karzoff.pdf

    This will open a new fullscreen window, rendering the pdf as intended. https://i.imgur.com/R2opBmj.png

    If I launch Caja (GUI file manager), right-click on the pdf > "open
    with other application" > "use a custom command > mate-terminal
    --full-screen -x /usr/bin/less -r > Open

    this will display garbage, that is the pdf source code: https://i.imgur.com/eAynhee.png

    Any clue?

    You probably have $LESSOPEN set in your interactive shell but not in
    your file manager. In your GUI command, try changing "/usr/bin/less -r"
    to "env LESSOPEN=... /usr/bin/less -r".
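
A small check along these lines can confirm whether the two environments really differ. It is only a sketch; report_lessopen is a hypothetical name, and it distinguishes an unset variable from one that is set but empty:

```shell
# report_lessopen: print the current value of LESSOPEN, or say it is unset.
# Hypothetical helper for comparing two environments.
report_lessopen() {
    if [ "${LESSOPEN+set}" = set ]; then
        printf 'LESSOPEN=%s\n' "$LESSOPEN"
    else
        printf 'LESSOPEN is unset\n'
    fi
}
```

Running it once in the interactive shell and once in a terminal opened through the file manager's custom command shows whether the variable survives the second route.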

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for Philips
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Lew Pitcher@21:1/5 to Ottavio Caruso on Thu Oct 20 16:04:43 2022
    On Thu, 20 Oct 2022 15:37:51 +0000, Ottavio Caruso wrote:

    On 20/10/2022 15:25, Lew Pitcher wrote:
    I would not expect less(1), when run in a
    terminal, to properly format the optional display text of a PDF and
    /not/ display (or attempt to display) any of the binary data or PDF
    instructions.

    But it does, as evidenced by the screenshot in my OP.


    The results you get when launching from Caja is what I expect to see
    from an attempt to less(1) a PDF file.

    Again, see screenshot 1.

    Try your less(1) command in xterm, and see what happens.
    For me, I get a warning that "... may be a binary file. See it anyway?"
    and then, if I answer Y, I get the contents of such a file, binary and
    PDF instructions included. This in xterm, konsole, and xfce-term, on a
    PDF file I know was derived from text, and which pdftotext(1) properly
    displays as text.

    So, as far as /your/ setup is concerned, your "Launch mate-terminal and
    less from Caja" results agree with my results run from three different
    terminal programs, and your "Launch mate-terminal and less from the
    desktop" results do not.

    ISTM that the difference you see is in how mate-terminal behaves when
    launched from the Mate desktop versus how it behaves when launched by
    Caja. And the Caja behaviour is the "correct" (as in expected) behaviour.



    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as "less -r", some control characters are still left over.

    Huh? I beg to differ.
    But that's a separate conversation.


    Look to the behaviour of your mate-terminal, when launched from
    the desktop vs launched from Caja.

    I expect mate-terminal to behave the same way if launched from the
    terminal or from a file manager.

    My feeling is that the "-r" argument doesn't get processed if launched
    from the file manager.

    My feeling is that, if the -r argument isn't processed, you should see
    garbage on the terminal, and if it /is/ processed, you should see
    /different/ garbage on the terminal. Given that, your "launch from
    desktop" behaves as if something is filtering the binary data out /before/ less(1) sees it.

    HTH
    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Lew Pitcher@21:1/5 to Keith Thompson on Thu Oct 20 16:10:40 2022
    On Thu, 20 Oct 2022 08:52:32 -0700, Keith Thompson wrote:

    Ottavio Caruso <ottavio2006-usenet2012@yahoo.com> writes:
    I hope I can find the right language to make myself clear.

    I have accidentally discovered that I can read a pdf file from the
    command line just by using "less -r":

    $ less -r file.pdf

    works beautifully and even maintains some formatting.

    That's certainly not my experience.

    You probably have the $LESSOPEN environment variable set to invoke
    something like pdftotext on the input before displaying it.

    `man less` and search for "INPUT PREPROCESSOR".

    My shell is bash and my X terminal is mate-terminal:
    https://manpages.debian.org/bullseye/mate-terminal/mate-terminal.1.en.html

    If I launch from the terminal:
    $ mate-terminal --full-screen -x /usr/bin/less -r Downloads/08\ Karzoff.pdf

    This will open a new fullscreen window, rendering the pdf as intended.
    https://i.imgur.com/R2opBmj.png

    If I launch Caja (GUI file manager), right-click on the pdf > "open
    with other application" > "use a custom command > mate-terminal
    --full-screen -x /usr/bin/less -r > Open

    this will display garbage, that is the pdf source code:
    https://i.imgur.com/eAynhee.png

    Any clue?

    You probably have $LESSOPEN set in your interactive shell but not in
    your file manager. In your GUI command, try changing "/usr/bin/less -r"
    to "env LESSOPEN=... /usr/bin/less -r".

    I don't use less(1) often, and didn't know about the LESSOPEN envar.
    But, it jibes with my conclusion (stated in a followup to the OP) that
    something is being set differently between the Mate desktop and the Caja
    launcher. Likely, the desktop sets LESSOPEN to a filter that excludes
    the non-essentials of a PDF (like, say, pdftotext(1)) but the Caja
    launcher does not.

    That's my guess, anyway.
    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Lew Pitcher@21:1/5 to Bit Twister on Thu Oct 20 17:06:05 2022
    On Thu, 20 Oct 2022 11:57:46 -0500, Bit Twister wrote:

    On Thu, 20 Oct 2022 15:37:51 +0000, Ottavio Caruso wrote:
    On 20/10/2022 15:25, Lew Pitcher wrote:


    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as
    "less -r", some control characters are still left over.

    Hmmm, I use it in several scripts to parse data from bank and IRS pdfs. Example
    pdftotext -layout $pdf_fn $TMPDIR/xx

    And, in the context of the OP's problem as he currently sees it,
    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
        TARGET=$(tempfile)
        if pdftotext -layout "$1" "$TARGET"
        then
            cat "$TARGET"
        fi
    fi

    Example use:
    LESSOPEN='|lesspdfpipe.sh %s' mate-terminal \
    --full-screen \
    -x /usr/bin/less -r some_random.pdf

    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Lew Pitcher@21:1/5 to Lew Pitcher on Thu Oct 20 17:08:09 2022
    On Thu, 20 Oct 2022 17:06:05 +0000, Lew Pitcher wrote:

    On Thu, 20 Oct 2022 11:57:46 -0500, Bit Twister wrote:

    On Thu, 20 Oct 2022 15:37:51 +0000, Ottavio Caruso wrote:
    On 20/10/2022 15:25, Lew Pitcher wrote:


    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as
    "less -r", some control characters are still left over.

    Hmmm, I use it in several scripts to parse data from bank and IRS pdfs. Example
    pdftotext -layout $pdf_fn $TMPDIR/xx

    And, in the context of the OP's problem as he currently sees it,
    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
    TARGET=$(tempfile)
    if pdftotext -layout "$1" "$TARGET"
    then
    cat "$TARGET"
    fi
    fi

    Or, better yet

    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
        pdftotext -layout "$1" -
    fi



    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Ottavio Caruso@21:1/5 to Keith Thompson on Thu Oct 20 16:35:07 2022
    On 20/10/2022 15:52, Keith Thompson wrote:
    Ottavio Caruso <ottavio2006-usenet2012@yahoo.com> writes:
    I hope I can find the right language to make myself clear.

    I have accidentally discovered that I can read a pdf file from the
    command line just by using "less -r":

    $ less -r file.pdf

    works beautifully and even maintains some formatting.

    That's certainly not my experience.

    You probably have the $LESSOPEN environment variable set to invoke
    something like pdftotext on the input before displaying it.

    `man less` and search for "INPUT PREPROCESSOR".

    Thanks. We're getting closer.

    In terminal:
    $ echo $LESSOPEN
    | /usr/bin/lesspipe %s

    So, this is the command I gave from Caja:

    env LESSOPEN="| /usr/bin/lesspipe %s" mate-terminal --full-screen -x /usr/bin/less -r

    Result:

    LESSOPEN ignored: must contain exactly one %s

    Then I tried (without the %s)

    env LESSOPEN="| /usr/bin/lesspipe" mate-terminal --full-screen -x /usr/bin/less -r

    Same as above.
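
The error message means the value less received did not contain exactly one %s, even though the first command as typed did. One possible explanation, offered here as an assumption and not confirmed anywhere in the thread, is that the file manager interprets % sequences in a custom command before running it, so the %s never reaches less intact. A sketch for checking what actually arrived, using a hypothetical helper named count_placeholders:

```shell
# count_placeholders VALUE: print how many times the literal "%s" occurs
# in VALUE. Hypothetical helper; less accepts LESSOPEN only when this is 1.
count_placeholders() (
    rest=$1
    n=0
    while :; do
        case $rest in
            *%s*) rest=${rest#*'%s'}; n=$((n + 1)) ;;   # consume through the first %s
            *) break ;;
        esac
    done
    echo "$n"
)
```

Calling count_placeholders "$LESSOPEN" inside the terminal that Caja opened should print 1 for a usable value. If it prints 0, moving the LESSOPEN assignment into a small wrapper script, and naming only that script in the custom command, would avoid passing any % character through the launcher.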



    You probably have $LESSOPEN set in your interactive shell but not in
    your file manager. In your GUI command, try changing "/usr/bin/less -r"
    to "env LESSOPEN=... /usr/bin/less -r".



    --
    Ottavio Caruso

  • From Lew Pitcher@21:1/5 to Lew Pitcher on Thu Oct 20 17:12:14 2022
    On Thu, 20 Oct 2022 17:11:10 +0000, Lew Pitcher wrote:

    On Thu, 20 Oct 2022 17:08:09 +0000, Lew Pitcher wrote:

    On Thu, 20 Oct 2022 17:06:05 +0000, Lew Pitcher wrote:

    On Thu, 20 Oct 2022 11:57:46 -0500, Bit Twister wrote:

    On Thu, 20 Oct 2022 15:37:51 +0000, Ottavio Caruso wrote:
    On 20/10/2022 15:25, Lew Pitcher wrote:


    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as
    "less -r", some control characters are still left over.

    Hmmm, I use it in several scripts to parse data from bank and IRS pdfs. Example
    pdftotext -layout $pdf_fn $TMPDIR/xx

    And, in the context of the OP's problem as he currently sees it,
    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
    TARGET=$(tempfile)
    if pdftotext -layout "$1" "$TARGET"
    then
    cat "$TARGET"
    fi
    fi

    Or, better yet

    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
    pdftotext -layout "$1" -
    fi

    Damn. Where did the pipe go? I thought I typed it.

    Or, minimally
    LESSOPEN='|pdftotext -layout %s -' \
    mate-terminal \
    --full-screen \
    -x /usr/bin/less -r some_random.pdf
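
The one-liner above runs pdftotext on every file less opens, not only on PDFs. According to less(1), when an input pipe writes nothing to standard output, less falls back to showing the original file, so a preprocessor can simply stay silent for anything it does not handle. A sketch of that idea follows; the function name pdf_input_pipe and the PDF_TO_TEXT override are hypothetical and exist only to make the example easy to try:

```shell
# pdf_input_pipe FILE: input-pipe sketch for LESSOPEN.
# Converts *.pdf files with pdftotext and writes nothing for other files,
# which makes less display those files unchanged.
# PDF_TO_TEXT is a hypothetical override for the converter command.
pdf_input_pipe() {
    case $1 in
        *.pdf|*.PDF)
            "${PDF_TO_TEXT:-pdftotext}" -layout "$1" - ;;
        *)
            return 0 ;;    # no output: less uses the original file
    esac
}
```

Saved in a script whose last line is pdf_input_pipe "$1", it could be used the same way as the lesspdfpipe.sh example, with LESSOPEN set to "|" followed by the script's path and %s.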




    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Lew Pitcher@21:1/5 to Lew Pitcher on Thu Oct 20 17:11:10 2022
    On Thu, 20 Oct 2022 17:08:09 +0000, Lew Pitcher wrote:

    On Thu, 20 Oct 2022 17:06:05 +0000, Lew Pitcher wrote:

    On Thu, 20 Oct 2022 11:57:46 -0500, Bit Twister wrote:

    On Thu, 20 Oct 2022 15:37:51 +0000, Ottavio Caruso wrote:
    On 20/10/2022 15:25, Lew Pitcher wrote:


    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as
    "less -r", some control characters are still left over.

    Hmmm, I use it in several scripts to parse data from bank and IRS pdfs. Example
    pdftotext -layout $pdf_fn $TMPDIR/xx

    And, in the context of the OP's problem as he currently sees it,
    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
    TARGET=$(tempfile)
    if pdftotext -layout "$1" "$TARGET"
    then
    cat "$TARGET"
    fi
    fi

    Or, better yet

    #!/bin/bash
    # Usage: lesspdfpipe.sh <pdf_file_name>
    if [ "$1" ]
    then
    pdftotext -layout "$1" -
    fi

    Or, minimally
    LESSOPEN='pdftotext -layout %s -' \
    mate-terminal \
    --full-screen \
    -x /usr/bin/less -r some_random.pdf


    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Bit Twister@21:1/5 to Ottavio Caruso on Thu Oct 20 11:57:46 2022
    On Thu, 20 Oct 2022 15:37:51 +0000, Ottavio Caruso wrote:
    On 20/10/2022 15:25, Lew Pitcher wrote:


    FWIW, why aren't you using pdftotext(1), which expressly designed
    to extract and format the display text of a PDF file?

    Because pdftotext is not very scriptable and doesn't work as smoothly as "less -r", some control characters are still left over.

    Hmmm, I use it in several scripts to parse data from bank and IRS pdfs.
    Example:
    pdftotext -layout "$pdf_fn" "$TMPDIR/xx"

  • From Keith Thompson@21:1/5 to Keith Thompson on Thu Oct 20 10:23:10 2022
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Ottavio Caruso <ottavio2006-usenet2012@yahoo.com> writes:
    I hope I can find the right language to make myself clear.

    I have accidentally discovered that I can read a pdf file from the
    command line just by using "less -r":

    $ less -r file.pdf

    works beautifully and even maintains some formatting.

    That's certainly not my experience.

    You probably have the $LESSOPEN environment variable set to invoke
    something like pdftotext on the input before displaying it.

    `man less` and search for "INPUT PREPROCESSOR".

    My shell is bash and my X terminal is mate-terminal:
    https://manpages.debian.org/bullseye/mate-terminal/mate-terminal.1.en.html

    If I launch from the terminal:
    $ mate-terminal --full-screen -x /usr/bin/less -r Downloads/08\ Karzoff.pdf

    This will open a new fullscreen window, rendering the pdf as intended.
    https://i.imgur.com/R2opBmj.png

    If I launch Caja (GUI file manager), right-click on the pdf > "open
    with other application" > "use a custom command > mate-terminal
    --full-screen -x /usr/bin/less -r > Open

    this will display garbage, that is the pdf source code:
    https://i.imgur.com/eAynhee.png

    Any clue?

    You probably have $LESSOPEN set in your interactive shell but not in
    your file manager. In your GUI command, try changing "/usr/bin/less -r"
    to "env LESSOPEN=... /usr/bin/less -r".

    Incidentally, I don't think the "-r" option is relevant here. That
    option causes less to display raw control characters. The output of
    pdftotext shouldn't contain any control characters.
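
One exception worth checking: by default pdftotext separates pages with form feed characters (its -nopgbrk option suppresses them), which may be the leftover control characters mentioned earlier in the thread. A rough sketch for counting what is actually in a stream, with a hypothetical helper named count_control_chars:

```shell
# count_control_chars: read standard input and print the number of control
# bytes it contains, ignoring tab and newline. Hypothetical helper.
count_control_chars() {
    # Delete everything except C0 controls (minus tab and newline) and DEL,
    # then count the bytes that remain.
    LC_ALL=C tr -cd '\000-\010\013-\037\177' | wc -c | tr -d ' '
}
```

For example, piping the output of pdftotext -layout file.pdf - into count_control_chars would show whether any such bytes reach less at all.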

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for Philips
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Spiros Bousbouras@21:1/5 to Lew Pitcher on Fri Oct 21 09:50:11 2022
    On Thu, 20 Oct 2022 16:10:40 -0000 (UTC)
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    I don't use less(1) often, and didn't know about the LESSOPEN envar.

    Out of curiosity, do you read man pages using something other than
    less, or do you not use man pages often?

  • From Lew Pitcher@21:1/5 to Spiros Bousbouras on Fri Oct 21 14:38:11 2022
    On Fri, 21 Oct 2022 09:50:11 +0000, Spiros Bousbouras wrote:

    On Thu, 20 Oct 2022 16:10:40 -0000 (UTC)
    Lew Pitcher <lew.pitcher@digitalfreehold.ca> wrote:
    I don't use less(1) often, and didn't know about the LESSOPEN envar.

    Out of curiosity , do you read man pages using something other than
    less or you don't use man pages often ?

    Up until now, I hadn't even considered what sort of pager man(1) uses.
    Thanks to your query, I now know that I /do/ use less(1) often, but
    not directly.

    Thanks for the hint :-)
    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Eli the Bearded@21:1/5 to spibou@gmail.com on Fri Oct 21 22:31:57 2022
    In comp.unix.shell, Spiros Bousbouras <spibou@gmail.com> wrote:
    Out of curiosity , do you read man pages using something other than
    less or you don't use man pages often ?

    I used to be a big fan of pg, but the hassle of installing it now has me
    using whatever default the system comes with.

    https://en.m.wikipedia.org/wiki/Pg_(Unix)

    Elijah
    ------
    also used to sometimes read man pages with lpr

  • From Janis Papanagnou@21:1/5 to Eli the Bearded on Sat Oct 22 01:29:52 2022
    On 22.10.2022 00:31, Eli the Bearded wrote:

    I used to be a big fan of pg, but the hassle of installing it now has me using whatever default the system comes with.

    What?! 8-/

    I have it installed on my (sort of) "legacy" Linux that I mostly use;
    I think it was already there (no extra actions necessary). But now,
    after your post, I see that on another (newer) system it's not there.
    Has it - as I interpret your post - been removed from the distros? -
    Darn! - Why?

    Janis
