• [gentoo-user] script help - removing newlines

    From Adam Carter@21:1/5 to All on Wed Dec 8 07:20:02 2021
    I have text files that sometimes look like this:

    property "something"

    comment "whatever"



    but sometimes there are newline characters in the comment field;

    property "something"

    comment "something

    something else

    a third thing"



    I want to replace any newlines between 'comment "' and the next '"' with
    spaces so the whole comment is on a single line. How can it be done?


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael Orlitzky@21:1/5 to Adam Carter on Wed Dec 8 14:00:02 2021
    On 2021-12-08 17:15:43, Adam Carter wrote:

    but sometimes there are newline characters in the comment field;

    property "something"

    comment "something

    something else

    a third thing"

    I want to replace any newlines between 'comment "' and the next '"' with spaces so the whole comment is on a single line. How can it be done?

    It depends on how complicated the format of your file can be. For the
    one example shown, you could write a Python script that looks for

    comment "

    at the beginning of a line, and then scans forward one character at a
    time, looking for the ending quotation mark, but deleting any newlines
    it finds along the way. Things like this work great until they meet
    the real world:

    1. Are you sure there's always exactly one space between the word
    'comment' and the quotation mark?

    2. Can there be space before the word 'comment'?

    3. Are you sure nobody is using single quotes instead of double quotes?
    How about the fancy non-ascii quotes that you get sometimes when
    copy/pasting from a webpage or a Word document?

    4. What happens if a comment contains double-quotes?

    etc. If you're the one creating the data or if you're sure that the
    format will be exactly what you say it is, then you can get away with
    a simple script. (Otherwise, the answer is basically "write a parser.")
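
A minimal sketch of the simple script described above, assuming the format is exactly as shown: the marker `comment "` sits at the very start of a line with one space, only plain ASCII double quotes are used, and a comment never contains a double quote. The function name `join_comment_lines` is made up for illustration. It replaces newlines with spaces (what the original question asked for) rather than deleting them.

```python
def join_comment_lines(text: str) -> str:
    """Replace newlines inside comment "..." fields with spaces.

    Assumes the marker is exactly 'comment "' at the start of a line
    and that the comment ends at the next double quote.
    """
    marker = 'comment "'
    out = []
    i = 0
    in_comment = False
    while i < len(text):
        at_line_start = i == 0 or text[i - 1] == "\n"
        if not in_comment and at_line_start and text.startswith(marker, i):
            # Copy the marker verbatim and start scanning the comment body.
            out.append(marker)
            i += len(marker)
            in_comment = True
            continue
        ch = text[i]
        if in_comment:
            if ch == '"':
                in_comment = False  # closing quote ends the comment
            elif ch == "\n":
                ch = " "  # newline inside a comment becomes a space
        out.append(ch)
        i += 1
    return "".join(out)
```

None of the four questions in the list is handled here: leading whitespace, a different number of spaces, other quote characters, or embedded quotes would all defeat it.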

  • From Tavis Ormandy@21:1/5 to Adam Carter on Wed Dec 8 23:20:01 2021
    On 2021-12-08, Adam Carter wrote:

    I want to replace any newlines between 'comment "' and the next '"' with spaces so the whole comment is on a single line. How can it be done?


    Hmm, maybe:

    $ awk '/^comment "[^"]*$/ { ORS=" " } /"$/ { ORS="\n" } { print }' yourfile.txt

    That means if a line starts with 'comment "' but doesn't end with ",
    change the ORS (output record separator) to a space. If it does end with
    a ", change it to a newline.

    Tavis.

    --
    _o) $ lynx lock.cmpxchg8b.com
    /\\ _o) _o) $ finger taviso@sdf.org
    _\_V _( ) _( ) @taviso

  • From Adam Carter@21:1/5 to All on Thu Dec 9 06:10:02 2021

    Hmm, maybe:

    $ awk '/^comment "[^"]*$/ { ORS=" " } /"$/ { ORS="\n" } { print }' yourfile.txt


    Yes that works (after piping the file thru tr -d '\r' first). Thanks!
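
For comparison, a rough Python sketch of the same line-based logic as the awk one-liner, with the carriage-return cleanup folded in. The `tr -d '\r'` step was presumably needed because a line ending in `"\r` does not match `/"$/`. The names `OPEN_COMMENT` and `join_lines` are made up for illustration, and the sketch only strips trailing `\r` characters, whereas `tr -d '\r'` deletes every one in the file.

```python
import re

# A line that opens a comment and has no closing quote on the same line.
OPEN_COMMENT = re.compile(r'^comment "[^"]*$')


def join_lines(lines):
    """Join multi-line comment fields, mimicking awk's ORS switching."""
    sep = "\n"  # plays the role of awk's ORS
    out = []
    for raw in lines:
        line = raw.rstrip("\r\n")  # drop the line ending, including any CR
        if OPEN_COMMENT.match(line):
            sep = " "
        if line.endswith('"'):
            sep = "\n"
        out.append(line + sep)
    return "".join(out)
```

It accepts any iterable of lines, so an open file object can be passed in directly. Like the awk version, it treats any line ending in a double quote as the end of the comment.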

