• (GAWK) fatal: attempt to use scalar 'x' as an array

    From Kenny McCormack@21:1/5 to All on Thu Aug 12 10:54:12 2021
    Observe:

    $ gawk4 '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i] }'
    Nothing in the array
    gawk4: cmd. line:1: fatal: attempt to use scalar `x' as an array
    $

    This happens because I hit ^D (eof) as the first input to this program.

    I can "fix" this issue by either:
    1) Entering at least one valid line of input before hitting EOF.
    or
    2) Reversing the order of the two clauses in the END section.
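
    A minimal sketch of fix 2, assuming an awk that accepts length() on an
    array (gawk does). The wrapper name report_reversed is hypothetical, and
    plain awk stands in for the gawk4 binary used above. Running the for-in
    loop first means the first runtime reference to x is an array context,
    even when no input line was ever read.

```shell
# Hypothetical wrapper name, used only for illustration.
report_reversed() {
  awk '
    { x[$1] = $0 }
    END {
      # Array context comes first, so x is an array before length() sees it.
      for (i in x) print i, x[i]
      if (!length(x)) print "Nothing in the array"
    }
  '
}

# Empty input, equivalent to hitting ^D as the first input.
report_reversed < /dev/null
```

    With non-empty input the entries are printed and the "Nothing in the
    array" message is skipped, as before.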

    Now, of course, it is clear what is going on here - and why the "fixes"
    work. But what surprises me is that it (the error) happens at all. My
    understanding had been that the issue (i.e., dark corner in the GAWK
    language) of whether something is an array or a scalar is resolved entirely
    at compile time. That basically the context of a variable's first
    occurrence determined its type (i.e., array or not). So:
    1) Note that the first occurrence of x is in the assignment, so you'd
    think it would get array'ized there.
    2) I find it strange that the runtime execution path (whether or not
    the assignment part ever executes) ends up determining the type of 'x'.

    I find this interesting. It would be nice if this dark corner could be
    removed from (fixed in) the language.

    Note, incidentally, that TAWK is better here (as it [almost] always is).
    This dark corner does not exist in TAWK. A variable can have multiple
    types in the course of a program.

    --
    To most Christians, the Bible is like a software license. Nobody
    actually reads it. They just scroll to the bottom and click "I agree."

    - author unknown -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kenny McCormack on Thu Aug 12 14:17:12 2021
    On 12.08.2021 12:54, Kenny McCormack wrote:
    > Observe:
    >
    > $ gawk4 '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i] }'
    > Nothing in the array
    > gawk4: cmd. line:1: fatal: attempt to use scalar `x' as an array
    > $
    >
    > This happens because I hit ^D (eof) as the first input to this program.
    >
    > I can "fix" this issue by either:
    > 1) Entering at least one valid line of input before hitting EOF.
    > or
    > 2) Reversing the order of the two clauses in the END section.

    or
    3) Force x to become an array.


    I recall a similar question of mine a couple of months ago (you were
    amongst the first responders, BTW), and my attempt at a solution was code
    like the BEGIN clause added to your test case here:

    gawk '
    BEGIN { x["dummy"] ; delete x["dummy"] }
    { x[$1] = $0 }
    END { if (!length(x)) print "Nothing in the array"
    for (i in x) print i,x[i]
    }
    '


    > Now, of course, it is clear what is going on here - and why the "fixes"
    > work. But what surprises me is that it (the error) happens at all. My
    > understanding had been that the issue (i.e., dark corner in the GAWK
    > language) of whether something is an array or a scalar is resolved
    > entirely at compile time.

    I don't think so. I think it's a pure runtime issue.

    Janis

    [...]

  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Thu Aug 12 15:52:23 2021
    On 12.08.2021 14:17, Janis Papanagnou wrote:

    > or
    > 3) Force x to become an array.

    I just recalled that someone (I think it was Ed) suggested a less bulky form.

    BEGIN { delete x[""] }

    works, and it seems

    BEGIN { delete x }

    works as well (at least in my GNU Awk context).
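
    Put together with the original test case, the shorter form could look like
    the sketch below. The wrapper name report_declared is hypothetical, and
    plain awk is used on the assumption that it supports both "delete x" on a
    whole array and length() on an array, as gawk does.

```shell
# Hypothetical wrapper name, used only for illustration.
report_declared() {
  awk '
    BEGIN { delete x }    # forces x to be an array before anything else runs
    { x[$1] = $0 }
    END {
      if (!length(x)) print "Nothing in the array"
      for (i in x) print i, x[i]
    }
  '
}

# Empty input no longer leaves x to be typed by length().
report_declared < /dev/null
```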


    > gawk '
    > BEGIN { x["dummy"] ; delete x["dummy"] }
    > { x[$1] = $0 }
    > END { if (!length(x)) print "Nothing in the array"
    > for (i in x) print i,x[i]
    > }
    > '

    Janis

  • From Kenny McCormack@21:1/5 to janis_papanagnou@hotmail.com on Thu Aug 12 17:27:00 2021
    In article <sf392n$gnd$1@news-1.m-online.net>,
    Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    > On 12.08.2021 14:17, Janis Papanagnou wrote:
    >
    >> or
    >> 3) Force x to become an array.
    >
    > I just recall that, I think it was Ed who suggested a less bulky form.
    >
    > BEGIN { delete x[""] }
    >
    > works, and it seems
    >
    > BEGIN { delete x }
    >
    > works as well (at least in my GNU Awk context).

    So, the point is, it really does just boil down to: You have to ensure
    that, whatever execution path your program takes, the first runtime
    reference to the variable is unequivocally an array context.
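
    One way to keep the first runtime reference in array context on every
    execution path is to test for emptiness with a for-in loop inside a
    function, rather than calling length(x). This is only a sketch and was not
    proposed in the thread: the names is_empty and report_checked are
    hypothetical, and it assumes (as in gawk) that an untyped variable passed
    to a function that uses the parameter as an array becomes an array.

```shell
# Hypothetical wrapper name, used only for illustration.
report_checked() {
  awk '
    # Hypothetical helper: returns 1 when arr has no elements, else 0.
    # k is a local variable (extra parameter), per the usual awk idiom.
    function is_empty(arr,    k) {
      for (k in arr) return 0
      return 1
    }
    { x[$1] = $0 }
    END {
      # is_empty() only ever touches x as an array, never as a scalar.
      if (is_empty(x)) print "Nothing in the array"
      for (i in x) print i, x[i]
    }
  '
}

report_checked < /dev/null
```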

    It strikes me that it might be a good thing for GAWK to have a "declare"
    statement - that would allow you to state up front that something is an
    array. Bash has this now, and it is actually quite useful.

    --
    A racist, a Nazi, and a Klansman walk into a bar...

    Bartender says, "What will it be, Mr. Trump?"

  • From J Naman@21:1/5 to Kenny McCormack on Thu Aug 12 11:17:05 2021
    On Thursday, 12 August 2021 at 13:27:01 UTC-4, Kenny McCormack wrote:
    > In article <sf392n$gnd$1...@news-1.m-online.net>,
    > Janis Papanagnou <janis_pa...@hotmail.com> wrote:
    >> On 12.08.2021 14:17, Janis Papanagnou wrote:
    >>
    >>> or
    >>> 3) Force x to become an array.
    >>
    >> I just recall that, I think it was Ed who suggested a less bulky form.
    >>
    >> BEGIN { delete x[""] }
    >>
    >> works, and it seems
    >>
    >> BEGIN { delete x }
    >>
    >> works as well (at least in my GNU Awk context).
    > So, the point is, it really does just boil down to: You have to ensure
    > that, whatever execution path your program takes, the first runtime
    > reference to the variable is an unequivocally array context.
    >
    > It strikes me that it might be a good thing for GAWK to have a "declare"
    > statement - that would allow you to state up front that something is an
    > array. Bash has this now, and it is actually quite useful.
    >
    > --
    > A racist, a Nazi, and a Klansman walk into a bar...
    >
    > Bartender says, "What will it be, Mr. Trump?"

    Kenny McCormack: Would you PLEASE stop inserting political content into our conversations about the awk language! I don't care what ideology people espouse, I participate in this group to NOT have politics and ideology intrude on my conversations. Take
    your flames to Twitter ...

  • From Janis Papanagnou@21:1/5 to Kenny McCormack on Fri Aug 13 03:03:03 2021
    On 12.08.2021 19:27, Kenny McCormack wrote:
    > In article <sf392n$gnd$1@news-1.m-online.net>,
    > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    >> On 12.08.2021 14:17, Janis Papanagnou wrote:
    >>
    >>> or
    >>> 3) Force x to become an array.
    >>
    >> I just recall that, I think it was Ed who suggested a less bulky form.
    >>
    >> BEGIN { delete x[""] }
    >>
    >> works, and it seems
    >>
    >> BEGIN { delete x }
    >>
    >> works as well (at least in my GNU Awk context).
    >
    > So, the point is, it really does just boil down to: You have to ensure
    > that, whatever execution path your program takes, the first runtime
    > reference to the variable is an unequivocally array context.
    >
    > It strikes me that it might be a good thing for GAWK to have a "declare"
    > statement - that would allow you to state up front that something is an
    > array. Bash has this now, and it is actually quite useful.

    Well, part of the beauty of Awk is its terseness, here the fact that
    you don't need declarations. Of course the feature could be optional,
    but then you'd have to introduce another keyword, something language
    designers usually want to avoid. Declarations are especially useful
    where a lot of data structuring features are present. GNU Awk started
    to enter that path already by the support of multi-dimensional arrays,
    so maybe, depending on any further plans to introduce yet more data
    structuring features, a 'declare' might eventually be the consequence.
    For the current primitive vs. compound data type dichotomy it's likely
    just overkill, especially since there's a code pattern to address that
    issue.

    (It's a bit different in shells, with Bash's declare or Ksh's typeset;
    typeset, for example, is a much more powerful concept than a simple
    array/string/numeric declaration.)

    Janis

  • From Janis Papanagnou@21:1/5 to J Naman on Fri Aug 13 02:48:15 2021
    On 12.08.2021 20:17, J Naman wrote:
    > On Thursday, 12 August 2021 at 13:27:01 UTC-4, Kenny McCormack wrote:
    >> [...]
    > Kenny McCormack: Would you PLEASE stop inserting political content into our conversations about the awk language! I don't care what ideology people espouse, I participate in this group to NOT have politics and ideology intrude on my conversations. Take
    > your flames to Twitter ...

    J Naman, please comply with the Usenet posting standards yourself - here:
    line length - before you even try to ask others to change their posting
    habits.

    This is Usenet, not a web forum or a Google forum. If you'd been using
    a Real Newsreader you'd certainly be better aware of that fact. Then
    you'd also see that the text you were addressing was part of a randomly
    generated signature, not part of the topical post/conversation. (A real
    newsreader would make that quite obvious, BTW; signatures are displayed
    differently and are not inserted in quoted replies, for example.)

    Since we're at it, Laurent, you too, get informed about Usenet and try
    complying with Netiquette; post context. Get informed!

    Other Google users flooding Usenet newsgroups should check the Usenet
    Netiquette as well before continuing to post. It's sad that even long
    time regulars here that switched to the Google interface forgot about
    where they are and about the Netiquette.

    I'm too old to really care, but if all those Googlies (Google
    Usenet-newbies) start to try enforcing their own rules while not complying
    with long time existing Usenet Netiquette it's time to speak up.

    (Google users will know how to use the [Google] search engine to find
    the Netiquette and information about Usenet newsgroups, so I abstain
    from doing their homework and providing the link.)

    Janis

  • From Andrew Schorr@21:1/5 to Kenny McCormack on Thu Aug 12 18:34:20 2021
    On Thursday, August 12, 2021 at 6:54:14 AM UTC-4, Kenny McCormack wrote:
    > Observe:
    >
    > $ gawk4 '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i] }'
    > Nothing in the array
    > gawk4: cmd. line:1: fatal: attempt to use scalar `x' as an array
    > $

    As others have pointed out, adding 'delete x' essentially declares it as an array.
    That being said, there seems to be a patch in the development tree that fixes this issue.
    In gawk 5.1.0:

    bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
    Nothing in the array
    gawk: cmd. line:1: fatal: attempt to use scalar `x' as an array

    In the master branch:

    bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
    Nothing in the array
    array

    So this problem may eventually go away. But it is safest to say 'delete x' to avoid ambiguity.

    Regards,
    Andy

  • From Kenny McCormack@21:1/5 to aschorr@telemetry-investments.com on Fri Aug 13 02:47:40 2021
    In article <d3603a3b-96fb-441c-9032-e06ad93d7fddn@googlegroups.com>,
    Andrew Schorr <aschorr@telemetry-investments.com> wrote:
    ...
    > In the master branch:
    >
    > bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
    > Nothing in the array
    > array

    This is good. I am glad to see that it is being worked on.

    I think we can all agree that while it is not a big deal in the grand
    scheme of things, and we all know by now how to work around it, it is in
    the category of "surprising" and it would be better if it didn't happen.

    > So this problem may eventually go away. But it is safest to say 'delete x'
    > to avoid ambiguity.

    For me, it was easiest to just reverse the order of the two clauses in the
    END section (*). The idea of putting a BEGIN clause in (which my program
    does not currently have) and to put the obscure incantation of "delete x"
    there seems odd. Not necessarily bad, but odd.

    (*) It was actually pretty much just by accident that I coded it in that
    order originally. It could just as easily have been coded the other way
    from the start.

    --
    The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/FreeCollege

  • From Aharon Robbins@21:1/5 to aschorr@telemetry-investments.com on Fri Aug 13 19:56:06 2021
    In article <d3603a3b-96fb-441c-9032-e06ad93d7fddn@googlegroups.com>,
    Andrew Schorr <aschorr@telemetry-investments.com> wrote:
    > In the master branch:
    >
    > bash-4.2$ ./gawk '{ x[$1] = $0 } END { if (!length(x)) print "Nothing in the array";for (i in x) print i,x[i]; print typeof(x) }' < /dev/null
    > Nothing in the array
    > array
    >
    > So this problem may eventually go away. But it is safest to say 'delete
    > x' to avoid ambiguity.

    Using 'delete x', or some other way to force x to be an array is the
    most portable thing to do.

    The fix in the development branch is that `length(x)' on a never
    assigned value no longer forces that value to be a scalar, but leaves
    it as undefined, and returns 0, which is correct both for undefined
    scalars and undefined arrays.

    This will be included in the next release.
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com

  • From Aharon Robbins@21:1/5 to janis_papanagnou@hotmail.com on Fri Aug 13 19:51:41 2021
    In article <sf33g8$f3s$1@news-1.m-online.net>,
    Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    >> Now, of course, it is clear what is going on here - and why the "fixes"
    >> work. But what surprises me is that it (the error) happens at all. My
    >> understanding had been that the issue (i.e., dark corner in the GAWK
    >> language) of whether something is an array or a scalar is resolved entirely
    >> at compile time.
    >
    > I don't think so. I think it's a pure runtime issue.

    This is correct, it is a pure runtime issue.
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com
