• Odd bug

    From Mary Karr@21:1/5 to All on Mon May 29 08:06:32 2023
    This code prints "yes" in 4.1

    awk 'BEGIN{token = "firstname"; if(token ~ (/^lastname|^firstname/)) {print "yes"} }'

    ..and in 5.1 prints nothing

    The code is user error, there should be no parens around the //.

    It's confusing why the versions have different results. Did 4.1 auto-detect the problem?
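
    For comparison, a minimal sketch of the same test with the parentheses dropped, so that the regexp constant is the direct right-hand operand of ~. Nothing here is version-specific as far as I know; only the whitespace differs from the one-liner above.

```shell
# Same test as in the post, minus the parentheses around the regexp constant.
awk 'BEGIN { token = "firstname"; if (token ~ /^lastname|^firstname/) { print "yes" } }'
```

    Written this way, the regexp is matched against token itself rather than being evaluated first.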

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kenny McCormack@21:1/5 to karrmary922@gmail.com on Mon May 29 16:24:48 2023
    In article <c4685425-7035-4a1d-8948-8c1a8bbe2b9cn@googlegroups.com>,
    Mary Karr <karrmary922@gmail.com> wrote:
    > This code prints "yes" in 4.1
    >
    > awk 'BEGIN{token = "firstname"; if(token ~ (/^lastname|^firstname/)) {print "yes"} }'
    >
    > ..and in 5.1 prints nothing

    Confirmed. In 4.1.4, it prints "yes"; in 5.0.1 it prints nothing.

    > The code is user error, there should be no parens around the //.
    >
    > It's confusing why the versions have different results. Did 4.1
    > auto-detect the problem?

    Actually, it looks like it is a bug fixed in 5.x.
    The previous (4.x) behavior is actually incorrect.

    Here's my explanation (which you may or may not already know). Basically,
    AWK (specifically, GAWK) tries very hard to come up with a legal
    interpretation of whatever you throw at it. Kind of like when a human
    says something nonsensical to you; you try very hard to come up with an
    interpretation of what they said that makes sense. Only if that is not
    possible do you tell them that they are speaking gibberish.

    What is happening is that when you put a bare regular expression in parens,
    it then gets evaluated as either 0 or 1, depending on whether or not the
    reg exp matches (i.e., is contained in) $0. The following demonstrates how
    to change your program so that a current version of GAWK will print "yes":

    $ /usr/bin/gawk 'BEGIN{$0="lastname";token = "first1ame"; if(token ~ (/^lastname|^firstname/)) {print "yes"} }'
    yes
    $
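
    The 0-or-1 evaluation described above can be observed on its own. This is only a sketch: the variable names re_unset and re_set are made up for illustration, and it assumes that $0 is empty inside BEGIN until the program assigns to it.

```shell
# A parenthesized regexp constant used as a value is matched against $0.
awk 'BEGIN {
    re_unset = (/^lastname|^firstname/)   # $0 is still empty here
    $0 = "lastname"
    re_set = (/^lastname|^firstname/)     # $0 is now "lastname"
    print re_unset, re_set
}'

# The resulting number is then used as the pattern on the right of ~,
# so a token containing the digit 1 matches the pattern "1".
awk 'BEGIN { if ("first1ame" ~ 1) print "matched" }'
```

    This is why the modified program needs both the assignment to $0 and the "1" inside the token in order to print "yes".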


  • From Kaz Kylheku@21:1/5 to Kenny McCormack on Mon May 29 18:33:07 2023
    On 2023-05-29, Kenny McCormack <gazelle@shell.xmission.com> wrote:
    > What is happening is that when you put a bare regular expression in parens,
    > it then gets evaluated as either 0 or 1, depending on whether or not the
    > reg exp matches (i.e., is contained in) $0. The following demonstrates how
    > to change your program so that a current version of GAWK will print "yes":

    Awk is "weird" that way. Ostensibly, it is based on C, but brings
    in various quirks, including whitespace sensitivities in the middle
    of expressions.

    In C, parentheses are sanely behaved. Given:

    int x = 42;

    These are valid:

    (x)++;

    (x) = (x) + 1;

    Parentheses influence associativity and precedence. Other than that,
    (x) is exactly the same as x to the point that if x is an lvalue,
    so is (x), and we can "increment through" and "assign through"
    the parentheses.

    Not so in Awk, or not all the time.

    For instance, try this:

    BEGIN { (x) = 42; }

    Nope! No "assigning through parentheses".

    You cannot randomly sprinkle parentheses around expression terms in Awk.
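
    A small sketch of the contrast, under the assumption that the awk on hand follows the usual grammar in which an assignment target must be a plain name, an array element, or a field. The exact diagnostic differs between implementations, so the second command only reports whether the program was accepted.

```shell
# Parentheses are fine where the value of x is merely read.
awk 'BEGIN { x = 42; y = (x) + 1; print y }'

# The parenthesized assignment target from the example above.
# Only the outcome is reported, since error messages vary.
if awk 'BEGIN { (x) = 42 }' 2>/dev/null; then
    echo "accepted"
else
    echo "rejected"
fi
```

    The first command prints 43; what the second prints depends on the implementation being tested.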


  • From Kenny McCormack@21:1/5 to 864-117-4973@kylheku.com on Mon May 29 22:20:53 2023
    In article <20230529111746.484@kylheku.com>,
    Kaz Kylheku <864-117-4973@kylheku.com> wrote:
    > On 2023-05-29, Kenny McCormack <gazelle@shell.xmission.com> wrote:
    >> What is happening is that when you put a bare regular expression in parens,
    >> it then gets evaluated as either 0 or 1, depending on whether or not the
    >> reg exp matches (i.e., is contained in) $0. The following demonstrates how
    >> to change your program so that a current version of GAWK will print "yes":
    >
    > Awk is "weird" that way. Ostensibly, it is based on C, but brings
    > in various quirks, including whitespace sensitivities in the middle
    > of expressions.

    Note, BTW, that if you change the slashes to quotes - i.e., put:

    token ~ ("^lastname|^firstname")

    the problem goes away (though other problems may appear).


  • From Janis Papanagnou@21:1/5 to Kenny McCormack on Tue May 30 02:09:26 2023
    On 30.05.2023 00:20, Kenny McCormack wrote:
    > In article <20230529111746.484@kylheku.com>,
    > Kaz Kylheku <864-117-4973@kylheku.com> wrote:
    >> On 2023-05-29, Kenny McCormack <gazelle@shell.xmission.com> wrote:
    >>> What is happening is that when you put a bare regular expression in parens,
    >>> it then gets evaluated as either 0 or 1, depending on whether or not the
    >>> reg exp matches (i.e., is contained in) $0. The following demonstrates how
    >>> to change your program so that a current version of GAWK will print "yes":
    >>
    >> Awk is "weird" that way. Ostensibly, it is based on C, but brings
    >> in various quirks, including whitespace sensitivities in the middle
    >> of expressions.
    >
    > Note, BTW, that if you change the slashes to quotes - i.e., put:
    >
    > token ~ ("^lastname|^firstname")
    >
    > the problem goes away (though other problems may appear).

    Other problems than those inherent to dynamic regexps? - Looks fine to me.

    Janis
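
    As an illustration of the string form, and of one issue commonly associated with dynamic regexps, here is a hedged sketch. The backslash example is my own choice of illustration and not necessarily the "other problems" meant upthread: a string is processed once as a string and once as a regexp, so a backslash has to be doubled to survive the first pass.

```shell
awk 'BEGIN {
    token = "firstname"
    # A string used as a dynamic regexp; the parentheses are harmless here.
    if (token ~ ("^lastname|^firstname")) print "string form matches"

    # A literal dot: one backslash in a regexp constant,
    # two backslashes in the equivalent string.
    if ("a.b" ~ /a\.b/)   print "constant matches a.b"
    if ("a.b" ~ "a\\.b")  print "string matches a.b"
    if ("axb" !~ "a\\.b") print "string rejects axb"
}'
```

    All four lines are printed, which shows the two forms agreeing once the escaping is adjusted.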
