• Text processing on VMS

    From David Meyer@21:1/5 to All on Mon Oct 14 00:04:25 2024
    I've got a text file from which I want to select the lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)

    --
    David Meyer
    Takarazuka, Japan
    papa@sdf.org

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Chris Townley@21:1/5 to David Meyer on Sun Oct 13 16:20:20 2024
    On 13/10/2024 16:04, David Meyer wrote:
    I've got a text file with data that I want to select lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)


    I have often done it this way:

    Use SEARCH to write the matching records to a second file.
    Read through that file in a DCL loop, using f$locate to find the string
    and f$extract to pull out what I want, then write it to a third file.
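
    For readers who want to see the same select / locate / extract flow
    outside DCL, here is a minimal sketch in Python (which the thread notes is
    available on VMS). The marker string, field width, and sample records are
    made up for illustration. `str.find` stands in for F$LOCATE, with one
    difference: it returns -1 when the string is absent, where F$LOCATE
    returns the length of the string being searched.

```python
def extract_after(lines, marker, width):
    """Yield `width` characters following `marker` from each line that contains it."""
    for line in lines:
        pos = line.find(marker)          # roughly F$LOCATE, but -1 means "not found"
        if pos < 0:
            continue                     # the SEARCH step: skip non-matching records
        start = pos + len(marker)
        yield line[start:start + width]  # roughly F$EXTRACT(start, width, line)


# Hypothetical sample records, not taken from the original post.
sample = [
    "NODE=ALPHA1 STATE=UP",
    "comment line",
    "NODE=BETA22 STATE=DOWN",
]
result = list(extract_after(sample, "NODE=", 6))
print(result)
```

    In a real procedure the list would be replaced by the open input file and
    the results written to the output file rather than collected in memory.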

    --
    Chris

  • From Craig A. Berry@21:1/5 to David Meyer on Sun Oct 13 12:48:10 2024
    On 10/13/24 10:04 AM, David Meyer wrote:
    I've got a text file with data that I want to select lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)

    Perl is available. It's part of the base install on OpenVMS x86. For
    Alpha and Itanium you can get an installer at:

    https://vmssoftware.com/products/perl/

    Python is also available. While what you want can be done with DCL or
    TPU, that's generally more pain for less gain.

  • From Arne Vajhøj@21:1/5 to David Meyer on Sun Oct 13 14:39:48 2024
    On 10/13/2024 11:04 AM, David Meyer wrote:
    I've got a text file with data that I want to select lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)

    Both Perl and gawk are available for VMS.

    VSI distributes Perl. Kits for Alpha and Itanium are at
    https://vmssoftware.com/products/perl/ and on x86-64 I believe it comes
    with VMS.

    Gawk you can get from the net: https://vms.process.com/scripts/fileserv/fileserv.com?GAWK

    You can also use some other scripting language: Python, Groovy, etc.

    (I like Groovy)

    A traditional VMS language (Cobol, Fortran, Basic, Pascal) with built-in
    string functionality or STR$ calls will likely be much more code.

    Arne

  • From Dave Froble@21:1/5 to All on Sun Oct 13 14:57:38 2024
    On 10/13/2024 2:39 PM, Arne Vajhøj wrote:
    On 10/13/2024 11:04 AM, David Meyer wrote:
    I've got a text file with data that I want to select lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)

    Both Perl and gawk are available for VMS.

    VSI distribute Perl - Alpha and Itanium here https://vmssoftware.com/products/perl/ - x86-64 I believe comes with VMS

    Gawk you can get from the net - https://vms.process.com/scripts/fileserv/fileserv.com?GAWK

    You can also use some other script language: Python, Groovy etc.

    (I like Groovy)

    A traditional VMS language (Cobol,Fortran,Basic,Pascal) and builtin
    string functionality or STR$ calls will likely be much more code.

    Arne


    Using SEARCH and then a simple Basic program is not that much work.

    For example:

    SEARCH File1.txt "some text" /output=File2.txt

    1     On Error Goto 90

    10    Open "File2.txt" For Input as File 1%
          Open "File3.txt" For Output as File 2%

    20    Linput #1%, Z$
          Print #2%, Mid(Z$,?,?)
          Goto 20

    90    GoTo 99 If ERR=11
          On Error GoTo 0

    99    End

    Simple
    Not having to know whatever is your favorite utility
    I seriously doubt there would be many fewer characters

    No, I didn't try it ...
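
    For comparison, a rough Python rendering of the same two-step idea (filter
    first, then take a column range). It assumes MID-style arguments, meaning a
    1-based start position followed by a length. The start of 13 and length
    of 3 are placeholders, since the example above leaves them as `?`, and the
    sample lines are invented.

```python
def mid(text, start, length):
    """Mimic a BASIC-style MID: 1-based start position, then a character count."""
    return text[start - 1:start - 1 + length]


def search_and_extract(lines, needle, start, length):
    # Step 1 mirrors SEARCH, which ignores case unless /EXACT is given.
    selected = [line for line in lines if needle.lower() in line.lower()]
    # Step 2 mirrors the read/print loop: one fixed column range per line.
    return [mid(line, start, length) for line in selected]


# Hypothetical input standing in for File1.txt.
lines = ["A1some text 001", "nothing", "B2Some Text 002"]
fields = search_and_extract(lines, "some text", 13, 3)
print(fields)
```

    The only real trap when moving between the two is the off-by-one: BASIC
    counts columns from 1, Python from 0.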

    --
    David Froble Tel: 724-529-0450
    Dave Froble Enterprises, Inc. E-Mail: davef@tsoft-inc.com
    DFE Ultralights, Inc.
    170 Grimplin Road
    Vanderbilt, PA 15486

  • From Stephen Hoffman@21:1/5 to David Meyer on Sun Oct 13 15:38:51 2024
    On 2024-10-13 15:04:25 +0000, David Meyer said:

    I've got a text file with data that I want to select lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)

    One-shot task?

    Haul it over to an existing and working Unix, and "awk to it" there.
    It'll be easier, generally. Easier, particularly if the text file
    contains UTF-8, though this file is from OpenVMS so probably not.

    For production use?

    Other folks have listed various options to perform this task on OpenVMS.

    Another option that seemingly wasn't mentioned is installing and using
    GNV on OpenVMS. Obligatory vim and emacs reference.

    The least-accretive and easiest-to-hand-off-to-others option on OpenVMS
    is approximately a DCL LOOP: / READ / WRITE / GOTO LOOP procedure.
    Which is a slog if unfamiliar with DCL, but is entirely doable.

    If this file has some sort of internal organization or syntax, use of a
    LIB$TABLE_PARSE grammar can be an option.
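
    As a much simpler cousin of the grammar idea (this is not LIB$TABLE_PARSE,
    just a sketch), a file with fixed columns can be described by a small
    table of field positions and sliced accordingly. The field names and
    column numbers below are hypothetical, written in Python purely for
    illustration.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    start: int   # 0-based column where the field begins
    width: int   # number of characters in the field


# Hypothetical layout; real column positions depend on the file in question.
LAYOUT = [Field("code", 0, 4), Field("qty", 5, 3), Field("unit", 9, 2)]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of stripped field values."""
    return {f.name: line[f.start:f.start + f.width].strip() for f in layout}


record = parse_record("AB12 007 kg")
print(record)
```

    Keeping the layout in one table means a change in the file format touches
    one place rather than every extraction statement.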



    --
    Pure Personal Opinion | HoffmanLabs LLC

  • From Arne Vajhøj@21:1/5 to Dave Froble on Sun Oct 13 15:26:03 2024
    On 10/13/2024 2:57 PM, Dave Froble wrote:
    On 10/13/2024 2:39 PM, Arne Vajhøj wrote:
    On 10/13/2024 11:04 AM, David Meyer wrote:
    I've got a text file with data that I want to select lines matching
    certain character strings, then extract string values from the selected
    lines by character position. On Unix, I would use awk or Perl. Does VMS
    have a similar tool, should I use my favorite programming language and
    call the STR$ RTL, can I write a TPU script to do this, or should I
    transfer the file to a Unix box and use awk or Perl? ;)

    Both Perl and gawk are available for VMS.

    VSI distribute Perl - Alpha and Itanium here
    https://vmssoftware.com/products/perl/ - x86-64 I believe comes with VMS

    Gawk you can get from the net -
    https://vms.process.com/scripts/fileserv/fileserv.com?GAWK

    You can also use some other script language: Python, Groovy etc.

    (I like Groovy)

    A traditional VMS language (Cobol,Fortran,Basic,Pascal) and builtin
    string functionality or STR$ calls will likely be much more code.

    Using SEARCH and then a simple Basic program is not that much work.

    For example:

    SEARCH File1.txt "some text" /output=File2.txt

    1    On Error Goto 90

    10    Open "file2" For Input as File 1%
        Open "File2" For Output as File 2%

    20    Linput #1%, Z$
        Print #2%, Mid(Z$,?,?)
        Goto 20

    90    GoTo 99 If ERR=11
        On Error GoTo 0

    99    End

    Simple
    Not having to know whatever is your favorite utility
    I seriously doubt there would be many fewer characters

    No, I didn't try it ...

    I have confidence in your VMS Basic skills.

    :-)

    A compound solution of SEARCH and a program is an option.

    But in a suitable scripting language it should be a one-statement
    problem (although in most cases splitting that one statement over
    multiple lines is a good thing for readability).

    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter(line -> line.contains("java"))
         .map(line -> line[2..12])
         .forEach(System.out::println)

    This outputs positions 2..12 (positions are 0-based!) from all lines of
    login.com that contain "java".
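
    Since Python was also mentioned as an option on VMS, here is how the same
    filter-then-slice pipeline might look there. This is a sketch only: the
    function name is made up, and the demo writes a throwaway file in place of
    a real login.com.

```python
import os
import tempfile


def select_columns(path, needle, first, last):
    """Yield characters first..last (0-based, inclusive) of lines containing needle."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if needle in line:
                yield line[first:last + 1]   # +1 because Python slices exclude the end


# Demo on a temporary file standing in for login.com (contents are invented).
with tempfile.NamedTemporaryFile("w", suffix=".com", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("$ define java_home disk:[java]\n$ show time\n")
    demo_path = tmp.name

picked = list(select_columns(demo_path, "java", 2, 12))
os.remove(demo_path)
print(picked)
```

    The generator reads one line at a time, so the whole file never has to fit
    in memory.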

    Arne

  • From Craig A. Berry@21:1/5 to All on Sun Oct 13 17:24:14 2024
    On 10/13/24 2:26 PM, Arne Vajhøj wrote:

    But in a relevant script language then it should be a one statement
    problem (although in most cases splitting that one statement over
    multiple lines is a good thing for readability).

    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter(line -> line.contains("java"))
         .map(line -> line[2..12])
         .forEach(System.out::println)

    output pos 2..12 (pos is 0 based!) from all lines of login.com
    that contains "java".

    Opening an editor, typing all that in, running the java compiler, and
    then running the compiled program all seems like a lot of work to me
    when all you need to do is:

    $ perl -nE "say substr($_, 2, 12) if $_ =~ m/java/i;" < login.com
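
    Note that Perl's substr takes an offset and a length, so substr($_, 2, 12)
    returns up to 12 characters, one more than the inclusive 2..12 range in the
    Groovy version, and the /i flag makes the match case-insensitive. A Python
    sketch with the same semantics as the one-liner, using an in-memory stream
    with invented contents instead of redirected input:

```python
import io
import re


def grep_substr(stream, pattern, offset, length):
    """Yield `length` characters starting at `offset` from lines matching `pattern`."""
    matcher = re.compile(pattern, re.IGNORECASE)   # same effect as Perl's /i flag
    for line in stream:
        if matcher.search(line):
            # Offset plus length, as with Perl's substr; drop any trailing newline.
            yield line[offset:offset + length].rstrip("\n")


# Hypothetical input; a real run would pass sys.stdin or an open file.
demo = io.StringIO("$ define JAVA_HOME sys$java\n$ show time\n")
out = list(grep_substr(demo, r"java", 2, 12))
print(out)
```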

  • From Arne Vajhøj@21:1/5 to Craig A. Berry on Sun Oct 13 20:14:21 2024
    On 10/13/2024 6:24 PM, Craig A. Berry wrote:
    On 10/13/24 2:26 PM, Arne Vajhøj wrote:
    But in a relevant script language then it should be a one statement
    problem (although in most cases splitting that one statement over
    multiple lines is a good thing for readability).

    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
          .filter(line -> line.contains("java"))
          .map(line -> line[2..12])
          .forEach(System.out::println)

    output pos 2..12 (pos is 0 based!) from all lines of login.com
    that contains "java".

    Opening an editor, typing all that in, running the java compiler, and
    then running the compiled program all seems like a lot of work to me
    when all you need to do is:

    $ perl -nE "say substr($_, 2, 12) if $_ =~ m/java/i;" < login.com

    It is not Java but Groovy, so compile is optional.

    And groovysh does have a -e option for evaluating code given on the
    command line (it is just rarely used).

    But you are absolutely right: Perl code is shorter than Groovy code.

    Arne

  • From Arne Vajhøj@21:1/5 to All on Sun Oct 13 20:39:08 2024
    On 10/13/2024 8:14 PM, Arne Vajhøj wrote:
    It is not Java but Groovy, so compile is optional.

    The difference is all in the wrapping though.

    $ type P.java
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class P {
        public static void main(String[] args) throws Exception {
            Files.lines(Paths.get("login.com"))
                 .filter(line -> line.contains("java"))
                 .map(line -> line.substring(2, 13))
                 .forEach(System.out::println);
        }
    }
    $ type s.groovy
    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter(line -> line.contains("java"))
         .map(line -> line[2..12])
         .forEach(System.out::println)
    $ javac P.java
    $ java P
    ...
    $ groovy s.groovy
    ...

    (I don't even have groovysh defined on VMS, so no easy way to try -e)
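
    The one substantive difference between the two listings is the indexing
    convention: Groovy's [2..12] includes both ends, while Java's
    substring(2, 13) excludes the end index. A small Python sketch (helper
    names are made up) that lines up those conventions, plus the offset/length
    form used by Perl's substr, on the same test string:

```python
def inclusive(text, first, last):
    """Groovy-style text[first..last]: both ends included."""
    return text[first:last + 1]


def end_exclusive(text, begin, end):
    """Java-style text.substring(begin, end): end index excluded."""
    return text[begin:end]


def offset_length(text, offset, length):
    """Perl-style substr(text, offset, length)."""
    return text[offset:offset + length]


line = "0123456789abcdefghij"
a = inclusive(line, 2, 12)
b = end_exclusive(line, 2, 13)
c = offset_length(line, 2, 11)
print(a, b, c)
```

    One behavioral difference the sketch hides: Python slices quietly clamp at
    the end of a short line, whereas Java's substring throws
    StringIndexOutOfBoundsException.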

    Arne

  • From Arne Vajhøj@21:1/5 to All on Sun Oct 13 20:47:11 2024
    $ type s.groovy
    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter(line -> line.contains("java"))
         .map(line -> line[2..12])
         .forEach(System.out::println)

    Note that it is probably more groovyesque with:

    $ type s2.groovy
    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter({ it.contains("java") })
         .map({ it[2..12] })
         .forEach({ println(it) })

    But I don't think that improves readability.

    Arne

  • From Arne Vajhøj@21:1/5 to All on Sun Oct 13 20:51:12 2024
    On 10/13/2024 8:47 PM, Arne Vajhøj wrote:
    $ type s.groovy
    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
          .filter(line -> line.contains("java"))
          .map(line -> line[2..12])
          .forEach(System.out::println)

    Note that it is probably more groovyesque with:

    $ type s2.groovy
    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter({ it.contains("java") })
         .map({ it[2..12] })
         .forEach({ println(it) })

    But I don't think that improves readability.

    Or:

    $ type s3.groovy
    import java.nio.file.*

    Files.lines(Paths.get("login.com"))
         .filter({ it.contains("java") })
         .map({ it[2..12] })
         .each(this.&println)

    Arne

  • From David Meyer@21:1/5 to All on Mon Oct 14 10:23:23 2024
    SEARCH/DCL/SORT for the VMS win!

    Thanks for all suggestions. Highly educational.
    --
    David Meyer
    Takarazuka, Japan
    papa@sdf.org

  • From Arne Vajhøj@21:1/5 to All on Sun Oct 13 22:35:20 2024
    On 10/13/2024 3:26 PM, Arne Vajhøj wrote:
    On 10/13/2024 2:57 PM, Dave Froble wrote:
    Using SEARCH and then a simple Basic program is not that much work.

    For example:

    SEARCH File1.txt "some text" /output=File2.txt

    1    On Error Goto 90

    10    Open "file2" For Input as File 1%
         Open "File2" For Output as File 2%

    20    Linput #1%, Z$
         Print #2%, Mid(Z$,?,?)
         Goto 20

    90    GoTo 99 If ERR=11
         On Error GoTo 0

    99    End

    Simple
    Not having to know whatever is your favorite utility
    I seriously doubt there would be many fewer characters

    No, I didn't try it ...

    I have confidence in your VMS Basic skills.

    But I am curious about how you iterate over the file.

    Are there any benefits from this way compared to:

    handler eof_handler
    end handler
    when error use eof_handler
        while 1 = 1
            get #1
            ! do whatever
        next
    end when

    ?
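
    For readers who do not know VMS Basic, the shape of that loop (an endless
    read whose only exit is a trapped end-of-file condition, handled by an
    empty handler) can be sketched in Python. This is an analogy rather than a
    translation; the function name and the upper-casing "work" are
    placeholders.

```python
def read_all(records):
    """Consume records in an endless loop that only an end-of-data exception stops."""
    source = iter(records)
    seen = []
    try:
        while True:                      # like "while 1 = 1"
            record = next(source)        # like "get #1"; raises StopIteration at the end
            seen.append(record.upper())  # stand-in for "do whatever"
    except StopIteration:
        pass                             # empty handler: reaching the end is not an error
    return seen


print(read_all(["a", "b", "c"]))
```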

    Arne

  • From Dave Froble@21:1/5 to All on Mon Oct 14 09:01:49 2024
    On 10/13/2024 10:35 PM, Arne Vajhøj wrote:
    On 10/13/2024 3:26 PM, Arne Vajhøj wrote:
    On 10/13/2024 2:57 PM, Dave Froble wrote:
    Using SEARCH and then a simple Basic program is not that much work.

    For example:

    SEARCH File1.txt "some text" /output=File2.txt

    1 On Error Goto 90

    10 Open "file2" For Input as File 1%
    Open "File2" For Output as File 2%

    20 Linput #1%, Z$
    Print #2%, Mid(Z$,?,?)
    Goto 20

    90 GoTo 99 If ERR=11
    On Error GoTo 0

    99 End

    Simple
    Not having to know whatever is your favorite utility
    I seriously doubt there would be many fewer characters

    No, I didn't try it ...

    I have confidence in your VMS Basic skills.

    But I am curious about how you iterate over the file.

    Are there any benefits from this way compared to:

    handler eof_handler
    end handler
    when error use eof_handler
    while 1 = 1
    get #1
    ! do whatever
    next
    end when

    ?

    Arne


    Yes, I like to keep things very simple, otherwise it hurts my brain ...

    If I still have one, not sure ...


    --
    David Froble Tel: 724-529-0450
    Dave Froble Enterprises, Inc. E-Mail: davef@tsoft-inc.com
    DFE Ultralights, Inc.
    170 Grimplin Road
    Vanderbilt, PA 15486

  • From Simon Clubley@21:1/5 to arne@vajhoej.dk on Mon Oct 14 12:30:19 2024
    On 2024-10-13, Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 10/13/2024 3:26 PM, Arne Vajhøj wrote:
    On 10/13/2024 2:57 PM, Dave Froble wrote:
    Using SEARCH and then a simple Basic program is not that much work.

    For example:

    SEARCH File1.txt "some text" /output=File2.txt

    1    On Error Goto 90

    10    Open "file2" For Input as File 1%
         Open "File2" For Output as File 2%

    20    Linput #1%, Z$
         Print #2%, Mid(Z$,?,?)
         Goto 20

    90    GoTo 99 If ERR=11
         On Error GoTo 0

    99    End

    Simple
    Not having to know whatever is your favorite utility
    I seriously doubt there would be many fewer characters

    No, I didn't try it ...

    I have confidence in your VMS Basic skills.

    But I am curious about how you iterate over the file.

    Are there any benefits from this way compared to:

    handler eof_handler
    end handler
    when error use eof_handler
    while 1 = 1
    get #1
    ! do whatever
    next
    end when


    That's how a Pascal programmer would write it. David however clearly
    prefers Dartmouth Basic. :-)

    BTW, I think your approach is a lot more readable than David's style. :-)

    Simon.

    --
    Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
    Walking destinations on a map are further away than they appear.

  • From Arne Vajhøj@21:1/5 to Simon Clubley on Mon Oct 14 09:56:04 2024
    On 10/14/2024 8:30 AM, Simon Clubley wrote:
    On 2024-10-13, Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 10/13/2024 3:26 PM, Arne Vajhøj wrote:
    On 10/13/2024 2:57 PM, Dave Froble wrote:
    Using SEARCH and then a simple Basic program is not that much work.

    For example:

    SEARCH File1.txt "some text" /output=File2.txt

    1    On Error Goto 90

    10    Open "file2" For Input as File 1%
         Open "File2" For Output as File 2%

    20    Linput #1%, Z$
         Print #2%, Mid(Z$,?,?)
         Goto 20

    90    GoTo 99 If ERR=11
         On Error GoTo 0

    99    End

    Simple
    Not having to know whatever is your favorite utility
    I seriously doubt there would be many fewer characters

    No, I didn't try it ...

    I have confidence in your VMS Basic skills.

    But I am curious about how you iterate over the file.

    Are there any benefits from this way compared to:

    handler eof_handler
    end handler
    when error use eof_handler
    while 1 = 1
    get #1
    ! do whatever
    next
    end when


    That's how a Pascal programmer would write it. David however clearly
    prefers Dartmouth Basic. :-)

    BTW, I think your approach is a lot more readable than David's style. :-)

    But I did not come up with that construct myself. I must have gotten
    it from somewhere. Just not sure where.

    BTW, I think it would be nice if the compiler wizard added
    either:

    while not eof #1
        get #1
        ! do whatever
    next

    or:

    while true
        get #1, eof=100
        ! do whatever
    next
    100:
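
    Neither form exists in VMS Basic as far as this post says, so the
    following is only an illustration of the two shapes in Python. The first
    function treats end of file as a special return value from the read
    itself (the `eof=` idea); the small class reads one line ahead so that end
    of file can be tested before the read (the `while not eof` idea). All
    names are hypothetical.

```python
import io


def loop_with_sentinel(handle):
    """EOF reported by the read itself: an empty string means end of file."""
    out = []
    while True:
        line = handle.readline()
        if line == "":               # readline() returns "" only at end of file
            break
        out.append(line.rstrip("\n"))
    return out


class PeekReader:
    """Reads one line ahead so that EOF can be tested before calling get()."""

    def __init__(self, handle):
        self._handle = handle
        self._next = handle.readline()

    @property
    def eof(self):
        return self._next == ""

    def get(self):
        current, self._next = self._next, self._handle.readline()
        return current.rstrip("\n")


reader = PeekReader(io.StringIO("one\ntwo\n"))
peeked = []
while not reader.eof:
    peeked.append(reader.get())
print(peeked, loop_with_sentinel(io.StringIO("one\ntwo\n")))
```

    The look-ahead version costs one buffered record, which is the usual
    price of being able to ask about end of file before reading.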

    Arne
