• Using expr for regexp match

    From Janis Papanagnou@21:1/5 to All on Sat Jan 15 15:47:54 2022
    I want to do some regexp matches on strings with expr, a tool I have
    rarely used, and I'm puzzled about its correct usage for the intended
    purpose, if it's appropriate at all.

    The test cases I am actually trying to implement are (in terms of grep)

    $ grep -o "%[auUlLcCsS%]" <<< "A%aB%bC%cX"
    %a
    %c

    $ grep -o "%[^auUlLcCsS%]" <<< "A%aB%bC%cX"
    %b

    or in terms of awk

    $ echo "A%aB%bC%cX" |
      awk '{ print match($0,/%[^auUlLcCsS%]/)
             print substr($0,RSTART,RLENGTH) }'
    5
    %b

    If possible I'd want either the substring or the index of the match in
    the string and I thought that expr might serve as a light-weight tool
    to avoid grep or awk.

    The POSIX specs say: "Usually, the matching operator shall return a
    string representing the number of characters matched ('0' on failure)."
    so it might not be the appropriate tool. - Or am I missing something?

    Janis

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Axel Reichert on Sat Jan 15 18:07:11 2022
    On 15.01.2022 17:40, Axel Reichert wrote:

    I got concerned by "anchored", tested

    expr "A%aB%bC%cX" : "%[^auUlLcCsS%]"

    and got a "0" returned, which is not what you want.

    Yep, that was also my first test and the result puzzled me.


    expr "A%aB%bC%cX" : ".*\(%[^auUlLcCsS%]\)"

    But this one is fine. I forgot about the subexpression semantics
    and obviously was inattentive when re-reading the man page.

    returns "%b", though. So your mileage may vary.

    Thanks!

    Janis
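
One caveat about the `.*\(...\)` form discussed above: `.*` is greedy, so when a string contains several candidates, `\1` holds the last one, not the first (awk's match() reports the first). A minimal sketch, assuming a hypothetical test string with two candidates; the variable names `s`, `pat` and `last` are illustrative only:

```shell
#!/bin/sh
# Hypothetical test string with two candidates: %b and %d.
s='A%bB%dC'
pat='%[^auUlLcCsS%]'

# The leading ".*" works around the implicit "^" anchor, but because it is
# greedy it swallows as much as possible: \1 ends up as the LAST candidate.
last=$(expr "$s" : ".*\($pat\)")
echo "$last"    # prints %d, not %b
```

For the string from the thread this makes no difference, because it contains only one candidate (%b).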

  • From Axel Reichert@21:1/5 to Janis Papanagnou on Sat Jan 15 17:40:57 2022
    Janis Papanagnou <janis_papanagnou@hotmail.com> writes:

    $ echo "A%aB%bC%cX" |
      awk '{ print match($0,/%[^auUlLcCsS%]/)
             print substr($0,RSTART,RLENGTH) }'
    5
    %b

    My (BSD) man page mentions

    expr1 : expr2
    The ``:'' operator matches expr1 against expr2, which must be a basic
    regular expression. The regular expression is anchored to the
    beginning of the string with an implicit ``^''.

    If the match succeeds and the pattern contains at least one regular
    expression subexpression ``\(...\)'', the string corresponding to
    ``\1'' is returned; otherwise the matching operator returns the
    number of characters matched. If the match fails and the pattern
    contains a regular expression subexpression the null string is
    returned; otherwise 0.

    I got concerned by "anchored", tested

    expr "A%aB%bC%cX" : "%[^auUlLcCsS%]"

    and got a "0" returned, which is not what you want.

    expr "A%aB%bC%cX" : ".*\(%[^auUlLcCsS%]\)"

    returns "%b", though. So your mileage may vary.

    Best regards

    Axel
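
The four outcomes described in the man page excerpt above can be checked with a short script. This is only a sketch: the sample string `abc123` and all variable names are made up for illustration. It also captures the exit status, which is 1 whenever expr evaluates to 0 or to the null string.

```shell
#!/bin/sh
s='abc123'   # hypothetical sample string

# Match, no subexpression: the result is the number of characters matched.
n=$(expr "$s" : '[a-z]*')

# Match, with \(...\): the result is the text captured by \1.
sub=$(expr "$s" : '\([a-z]*\)')

# No match at the (implicitly anchored) start, no subexpression:
# the result is 0 and the exit status is 1.
zero=$(expr "$s" : '[0-9]') || zero_status=$?

# No match, with \(...\): the result is the null string, exit status 1.
empty=$(expr "$s" : '\([0-9]\)') || empty_status=$?

printf '%s|%s|%s|%s\n' "$n" "$sub" "$zero" "$empty"   # prints 3|abc|0|
```

Because a failed match makes expr exit with status 1, scripts running under `set -e` need a guard such as the `||` used here.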

  • From Christian Weisgerber@21:1/5 to Janis Papanagnou on Sat Jan 15 18:22:44 2022
    On 2022-01-15, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:

    I want to do some regexp matches on strings with expr, a tool I have
    rarely used, and I'm puzzled about its correct usage for the intended
    purpose,

    There are two properties to be aware of:
    * expr(1)'s match operator uses a _basic_ regular expression.
    * The regular expression is anchored to the beginning of the string
    with an implicit '^'.

    If possible I'd want either the substring or the index of the match in
    the string and I thought that expr might serve as a light-weight tool
    to avoid grep or awk.

    You can extract a substring with \(...\).

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de
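
As for the index asked about in the original post: expr has no direct equivalent of awk's RSTART, but the character count it returns can be turned into one. The following is a rough sketch, relying on the fact that this particular pattern always matches exactly two characters; the variable names are illustrative, and as with any greedy `.*` prefix it locates the last match in the string.

```shell
#!/bin/sh
s='A%aB%bC%cX'
pat='%[^auUlLcCsS%]'

# Without \(...\), expr returns the number of characters matched, i.e. the
# 1-based position of the END of the (last) match; 0 if nothing matches.
end=$(expr "$s" : ".*$pat") || end=0

if [ "$end" -gt 0 ]; then
    # The pattern is always 2 characters long ("%" plus one more), so the
    # 1-based start index is end - 2 + 1.
    start=$((end - 1))
    echo "$start"    # prints 5, the same index awk's match() reports here
fi
```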
