• asort function

    From Ed Morton@21:1/5 to Laurent MANCHON on Wed Aug 11 09:30:23 2021
    On 8/11/2021 9:20 AM, Laurent MANCHON wrote:
    > Hi all,
    >
    > Does an asort() function exist in TAWK (or only in gawk)?
    >
    > thx


    I don't know the answer to that but I'm curious about why you're asking.
    If you have tawk you can just try calling asort() and don't need to ask
    so I assume that's not the case. Given that - are you considering trying
    to get a copy of tawk if it has asort() instead of just using gawk? If
    so - why?

    Ed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kenny McCormack@21:1/5 to manchon.lm@gmail.com on Wed Aug 11 14:32:07 2021
    In article <b0a1d496-2fb2-4e79-bc35-a9f07b2d4f5dn@googlegroups.com>,
    Laurent MANCHON <manchon.lm@gmail.com> wrote:
    > Hi all,
    >
    > Does an asort() function exist in TAWK (or only in gawk)?

    Not in TAWK, but not really needed (see below).

    However, I'm wondering what the point of the question is. Either you have access to TAWK - in which case you would know and/or could quickly
    determine the answer to the question - or you don't - in which case, the question is moot. So, which is it?

    Anyway, I don't really see the point of asort(), once you have regular
    array sorting (see footnote: *) - which TAWK has always had, and GAWK now
    has. GAWK's implementation of array sorting is actually quite nice - in
    fact, more elaborate and powerful than TAWK's. My sense of the GAWK development effort is that they put asort()/asorti() in at a point in the development when they realized that some sort of array sorting was needed,
    but they weren't quite ready to do full/regular array sorting.

    So, I think asort()/asorti() is now retained mostly for historical reasons.

    (*) By "regular" (or "full") array sorting, I mean via the "for (i in A) ..." syntax.

    --
    Never, ever, ever forget that "Both sides do it" is strictly a Republican meme.

    It is always the side that sucks that insists on saying "Well, you suck, too".

  • From Laurent MANCHON@21:1/5 to All on Wed Aug 11 07:20:09 2021
    Hi all,

    Does an asort() function exist in TAWK (or only in gawk)?

    thx

  • From Laurent MANCHON@21:1/5 to All on Wed Aug 11 09:42:59 2021
    I expressed myself badly: I wanted to know whether TAWK has a function similar to gawk's asort().
    I have to calculate medians, so I need to sort the values of the array in increasing order and then compute the median from them.
    Hence the need for the asort function. I know I can write my own sort function such as:

    function masort(A,    hold, i, j, n) {
        n = alength(A);
        for (i = 2; i <= n; i++) {
            hold = A[j = i];
            # j > 1 keeps the scan inside A[1..n]; without it a negative
            # value would compare against the unset A[0] and never stop.
            while (j > 1 && A[j-1] > hold) {
                j--;
                A[j+1] = A[j];
            }
            A[j] = hold;
        }
        return n;
    }

    But I think that a built-in function is faster than a user-defined one, don't you?

  • From Laurent MANCHON@21:1/5 to All on Wed Aug 11 09:46:05 2021
    I expressed myself badly: I wanted to know whether TAWK has a function similar to gawk's asort().
    I have to calculate medians, so I need to sort the values of the array in increasing order and then compute the median from them.
    Hence the need for the asort function. I know I can write my own sort function such as:

    # Count the elements of A.
    function alength(A,    n, val) {
        n = 0;
        for (val in A) n++;
        return n;
    }

    function masort(A,    hold, i, j, n) {
        n = alength(A);
        for (i = 2; i <= n; i++) {
            hold = A[j = i];
            # j > 1 keeps the scan inside A[1..n]; without it a negative
            # value would compare against the unset A[0] and never stop.
            while (j > 1 && A[j-1] > hold) {
                j--;
                A[j+1] = A[j];
            }
            A[j] = hold;
        }
        return n;
    }

    But I think that a built-in function is faster than a user-defined one, don't you?
    For example, in TAWK, if I want the length of an array, I think *_arr is faster than alength(_arr).

  • From Laurent MANCHON@21:1/5 to All on Wed Aug 11 09:41:44 2021
    On Wednesday, 11 August 2021 at 16:32:09 UTC+2, Kenny McCormack wrote:
    > In article <b0a1d496-2fb2-4e79...@googlegroups.com>,
    > Laurent MANCHON <manch...@gmail.com> wrote:
    >> Hi all,
    >>
    >> Does an asort() function exist in TAWK (or only in gawk)?
    >
    > Not in TAWK, but not really needed (see below).
    >
    > However, I'm wondering what the point of the question is. Either you have
    > access to TAWK - in which case you would know and/or could quickly
    > determine the answer to the question - or you don't - in which case, the
    > question is moot. So, which is it?
    >
    > Anyway, I don't really see the point of asort(), once you have regular
    > array sorting (see footnote: *) - which TAWK has always had, and GAWK now
    > has. GAWK's implementation of array sorting is actually quite nice - in
    > fact, more elaborate and powerful than TAWK's. My sense of the GAWK
    > development effort is that they put asort()/asorti() in at a point in the
    > development when they realized that some sort of array sorting was needed,
    > but they weren't quite ready to do full/regular array sorting.
    >
    > So, I think asort()/asorti() is now retained mostly for historical reasons.
    >
    > (*) By "regular" (or "full") array sorting, I mean via the "for (i in A) ..."
    > syntax.
    >
    > --
    > Never, ever, ever forget that "Both sides do it" is strictly a Republican meme.
    >
    > It is always the side that sucks that insists on saying "Well, you suck, too".

    I expressed myself badly: I wanted to know whether TAWK has a function similar to gawk's asort().
    I have to calculate medians, so I need to sort the values of the array in increasing order and then compute the median from them.
    Hence the need for the asort function. I know I can write my own sort function such as:

    function masort(A,    hold, i, j, n) {
        n = alength(A);
        for (i = 2; i <= n; i++) {
            hold = A[j = i];
            # j > 1 keeps the scan inside A[1..n]; without it a negative
            # value would compare against the unset A[0] and never stop.
            while (j > 1 && A[j-1] > hold) {
                j--;
                A[j+1] = A[j];
            }
            A[j] = hold;
        }
        return n;
    }

    But I think that a built-in function is faster than a user-defined one, don't you?

  • From Ed Morton@21:1/5 to Laurent MANCHON on Wed Aug 11 12:02:20 2021
    On 8/11/2021 11:46 AM, Laurent MANCHON wrote:
    > I expressed myself badly: I wanted to know whether TAWK has a function similar to gawk's asort().
    > I have to calculate medians, so I need to sort the values of the array in increasing order and then compute the median from them.
    > Hence the need for the asort function. I know I can write my own sort function such as:
    >
    > # Count the elements of A.
    > function alength(A,    n, val) {
    >     n = 0;
    >     for (val in A) n++;
    >     return n;
    > }
    >
    > function masort(A,    hold, i, j, n) {
    >     n = alength(A);
    >     for (i = 2; i <= n; i++) {
    >         hold = A[j = i];
    >         # j > 1 keeps the scan inside A[1..n]; without it a negative
    >         # value would compare against the unset A[0] and never stop.
    >         while (j > 1 && A[j-1] > hold) {
    >             j--;
    >             A[j+1] = A[j];
    >         }
    >         A[j] = hold;
    >     }
    >     return n;
    > }
    >
    > But I think that a built-in function is faster than a user-defined one, don't you?
    > For example, in TAWK, if I want the length of an array, I think *_arr is faster than alength(_arr).


    Sure but why bother trying to find the equivalent of a gawk function in
    tawk (available by word of mouth from individuals with a copy of it,
    with people mailing photocopies of documentation to each other and a
    small user base) instead of just using gawk (widely/easily available, thoroughly documented online and in books, with a massive user base)?

    Ed.

  • From Kenny McCormack@21:1/5 to manchon.lm@gmail.com on Wed Aug 11 17:31:24 2021
    In article <6a38ebc2-b590-4630-a14a-5db81bf86dbfn@googlegroups.com>,
    Laurent MANCHON <manchon.lm@gmail.com> wrote:
    > I expressed myself badly: I wanted to know whether TAWK has a function
    > similar to gawk's asort().

    OK, now I get it. The keyword is "similar". Without that word, it sounded
    like you wanted to know if there was literally a function called "asort" in
    TAWK. We have, correctly, asserted that it would have been easier to just
    test it than to post to Usenet.

    But you want to know how to sort arrays in TAWK. That's the real point
    that you are getting at.

    The first answer I can give is: No, there is no library function to do it,
    such as asort() in GAWK. But as I argue, there doesn't need to be, and
    asort() in GAWK is basically an anachronism at this point in time.

    To sort arrays in TAWK (and in current/modern versions of GAWK as well),
    you build up your array with the keys (indices) being in the order you
    want them, then you use:
        for (i in A) ...
    to iterate through the array in the desired order.

    I hope this answers your question.

    The details are a little different between TAWK and GAWK, but the
    underlying idea is pretty much the same.
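
    For GAWK, the sorted traversal described above might look like the sketch
    below. It assumes gawk 4.0 or later, where PROCINFO["sorted_in"] controls
    the order of "for (i in A)"; the "@val_num_asc" setting orders the loop by
    numeric value rather than by index. The wrapper name sorted_values is only
    illustrative, and nothing here applies to TAWK, whose syntax differs.

```sh
# Print the numbers given as arguments in ascending order, one per line.
# Assumes gawk >= 4.0: PROCINFO["sorted_in"] is a gawk extension.
sorted_values() {
    printf '%s\n' "$@" | gawk '
        { A[NR] = $1 }                  # keys are just input positions
        END {
            # Ask gawk to walk the array in ascending numeric order of values.
            PROCINFO["sorted_in"] = "@val_num_asc"
            for (i in A) print A[i]
        }'
}
```

    Called as "sorted_values 10 -3 7 2", this prints the four values in
    increasing order without calling asort().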

    --
    The difference between communism and capitalism?
    In capitalism, man exploits man. In communism, it's the other way around.

    - Daniel Bell, The End of Ideology (1960) -

  • From Laurent MANCHON@21:1/5 to All on Wed Aug 11 12:16:02 2021
    That doesn't really answer it.
    Try to calculate the median of the elements of an array and you will understand what I am asking.
    You need to sort the elements of the array, not its indices.
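
    A minimal sketch of the median calculation in question, in portable awk
    with no asort(): the values are sorted in place with an insertion sort
    (quadratic, so only a starting point for large inputs) and the middle
    element, or the mean of the two middle elements, is printed. The wrapper
    name median and the array name v are illustrative.

```sh
# Print the median of the numbers given as arguments (plain POSIX awk).
median() {
    printf '%s\n' "$@" | awk '
        { v[NR] = $1 + 0 }              # store the values, forced to numbers
        END {
            n = NR
            # Insertion sort on the values; the indices stay 1..n.
            for (i = 2; i <= n; i++) {
                hold = v[i]
                for (j = i; j > 1 && v[j-1] > hold; j--)
                    v[j] = v[j-1]
                v[j] = hold
            }
            # Odd count: middle element. Even count: mean of the middle two.
            if (n % 2) print v[(n + 1) / 2]
            else       print (v[n / 2] + v[n / 2 + 1]) / 2
        }'
}
```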

  • From Ben Bacarisse@21:1/5 to Laurent MANCHON on Wed Aug 11 22:47:18 2021
    Laurent MANCHON <manchon.lm@gmail.com> writes:

    > I have to calculate medians, so I need to sort the values of the array
    > in increasing order and then compute the median from them.

    Technically no. There is an O(N), non-sorting median algorithm, but
    it's a bit messy and sorting is so well-understood you are probably
    better off doing what you are doing.
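
    One non-sorting approach is quickselect, sketched below in portable awk.
    This is an assumption about the kind of algorithm meant, not necessarily
    the one referred to above: quickselect runs in linear time on average but
    is quadratic in the worst case. The names kth_smallest and qselect are
    illustrative.

```sh
# Print the k-th smallest (1-based) of the numbers that follow k.
kth_smallest() {
    k=$1; shift
    printf '%s\n' "$@" | awk -v k="$k" '
        # Iterative quickselect: partition around a pivot, then continue
        # only in the part that contains position k.
        function qselect(A, lo, hi, k,    p, i, j, t) {
            while (lo < hi) {
                p = A[int((lo + hi) / 2)]       # middle element as pivot
                i = lo; j = hi
                while (i <= j) {                # Hoare-style partition
                    while (A[i] < p) i++
                    while (A[j] > p) j--
                    if (i <= j) { t = A[i]; A[i] = A[j]; A[j] = t; i++; j-- }
                }
                if (k <= j)      hi = j         # k lies in the left part
                else if (k >= i) lo = i         # k lies in the right part
                else             return A[k]    # k sits between the parts
            }
            return A[lo]
        }
        { v[NR] = $1 + 0 }
        END { print qselect(v, 1, NR, k) }'
}
```

    For an odd number of values n, the median is the element at position
    (n + 1) / 2, for example "kth_smallest 3 5 3 1 4 2".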

    --
    Ben.

  • From J Naman@21:1/5 to All on Wed Aug 11 20:19:03 2021
    On Wednesday, 11 August 2021 at 13:02:23 UTC-4, Ed Morton wrote:
    > Sure but why bother trying to find the equivalent of a gawk function in
    > tawk (available by word of mouth from individuals with a copy of it,
    > with people mailing photocopies of documentation to each other and a
    > small user base) instead of just using gawk (widely/easily available,
    > thoroughly documented online and in books, with a massive user base)?
    >
    > Ed.

    If anyone who REGULARLY CONTRIBUTES to the Gawk community (lang, help)
    would like my original, NOT A COPY, TAWK Compiler Ver 5.01c, I'll be happy
    to donate it. I have the original manual, spiral bound, and four 3.5"
    diskettes for Win 3.1, NT/95, DOS 32-bit, and, drum roll, OS/2. In the
    original box ... Also have Ver 4 & bound manual -- who would want that?
    I loved it 20+ years ago, but am firmly GNU awk now.

  • From Laurent MANCHON@21:1/5 to All on Thu Aug 12 00:02:29 2021
    On Unix machines I don't like gawk; I prefer mawk, which is faster than gawk.
    And on Windows, programs compiled with TAWK v6.7 are faster than gawk.

  • From Laurent MANCHON@21:1/5 to All on Thu Aug 12 01:10:28 2021
    I think Awka has been discontinued for a long time now (http://awka.sourceforge.net/download.html),
    and I'm not sure whether it works with the latest versions of gcc.

  • From Janis Papanagnou@21:1/5 to Laurent MANCHON on Thu Aug 12 09:25:34 2021
    On 12.08.2021 09:02, Laurent MANCHON wrote:
    > On Unix machines I don't like gawk; I prefer mawk, which is faster than gawk.

    Is that true? - I know of some performance tests (done by Andrew Sumner
    20+ years ago) where that was actually not the case - some test cases
    were faster, some slower -, and since then a lot of optimizations have
    been done in GNU Awk (including byte code support).

    If you have some test cases I'd be interested to see actual numbers.

    > And on Windows, programs compiled with TAWK v6.7 are faster than gawk.

    If speed is a critical issue you may also try awka, an Awk compiler.

    Janis

  • From Kenny McCormack@21:1/5 to janis_papanagnou@hotmail.com on Thu Aug 12 10:13:32 2021
    In article <sf2ide$ab7$1@news-1.m-online.net>,
    Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    > On 12.08.2021 09:02, Laurent MANCHON wrote:
    >> On Unix machines I don't like gawk; I prefer mawk, which is faster than gawk.
    >
    > Is that true? - I know of some performance tests (done by Andrew Sumner
    > 20+ years ago) where that was actually not the case - some test cases
    > were faster, some slower -, and since then a lot of optimizations have
    > been done in GNU Awk (including byte code support).

    Historically, it is (has been) definitely true. Historically, mawk was
    always considered very fast, and GAWK was originally designed to be feature-rich and not have limits (which are common attributes/goals of GNU software) at the expense of being big and not particularly efficient.
    Note, incidentally, that bash also fits this profile. I like bash for its
    many nice features, but its own man page says that it is too big and too
    slow.

    However, this situation may have changed over the years. As you say,
    effort has gone into making GAWK more runtime efficient.

    >> And on Windows, programs compiled with TAWK v6.7 are faster than gawk.

    1) It is unlikely that speed really is an issue. Most people who think it
    is (in pretty much all contexts), turn out to be misguided. If you want efficiency, writing in AWK is probably not what you should be doing in the first place.

    But, that said, it is true (and yes, I am sort of contradicting myself),
    TAWK is very very efficient and fast. This is a good reason to use TAWK,
    if you can. I think it is indisputable that TAWK is the best/fastest significant AWK implementation.

    > If speed is a critical issue you may also try awka, an Awk compiler.

    I don't think awka - or any other so-called "awk compiler" - makes any
    claims to making your program run faster. Aren't they all just for
    encryption (aka, code security) purposes?

    BTW, all this talk by you and your c.l.a friend which are of the strain
    "Why don't you just use GAWK like we do?" are misguided. If the OP has and
    is using TAWK, he should continue to do so.

    --
    The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/DanaC

  • From Kenny McCormack@21:1/5 to manchon.lm@gmail.com on Thu Aug 12 10:16:39 2021
    In article <261371b2-4ffd-4cf6-8a3e-d654fc6c46c6n@googlegroups.com>,
    Laurent MANCHON <manchon.lm@gmail.com> wrote:
    > On Unix machines I don't like gawk; I prefer mawk, which is faster than gawk.
    > And on Windows, programs compiled with TAWK v6.7 are faster than gawk.

    This. I certainly think that if you have TAWK and are using it, you should continue to use it. It is clearly the best and the fastest AWK
    implementation.

    Ignore all the "But you should be using GAWK, because we say so" nonsense
    that you are seeing on this forum.

    --
    "We should always be disposed to believe that which appears to us to be
    white is really black, if the hierarchy of the church so decides."

    - Saint Ignatius Loyola (1491-1556) Founder of the Jesuit Order -

  • From Janis Papanagnou@21:1/5 to Laurent MANCHON on Thu Aug 12 12:43:18 2021
    [ please quote context if posting in Usenet ]

    On 12.08.2021 10:10, Laurent MANCHON wrote:
    > I think Awka has been discontinued for a long time now (http://awka.sourceforge.net/download.html),
    > and I'm not sure whether it works with the latest versions of gcc.

    It's discontinued, yes. (And I haven't tried to compile it with
    the latest gcc.)

    But isn't Tawk - that you use on Windows - also discontinued?
    (So I've heard, at least, for many years.)

    Janis

  • From Janis Papanagnou@21:1/5 to Kenny McCormack on Thu Aug 12 12:56:36 2021
    On 12.08.2021 12:13, Kenny McCormack wrote:
    > In article <sf2ide$ab7$1@news-1.m-online.net>,
    > Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    >> On 12.08.2021 09:02, Laurent MANCHON wrote:
    >>
    >>> And on Windows, programs compiled with TAWK v6.7 are faster than gawk.
    >
    > 1) It is unlikely that speed really is an issue. Most people who think it
    > is (in pretty much all contexts), turn out to be misguided. If you want
    > efficiency, writing in AWK is probably not what you should be doing in the
    > first place.

    Or applying algorithms with better complexity. (Cf., for example, Ben's
    hint on an O(N) algorithm, as opposed to an O(N log N) or even an O(N^2)
    algorithm like the one the OP posted as a workaround.)


    >> If speed is a critical issue you may also try awka, an Awk compiler.
    >
    > I don't think awka - or any other so-called "awk compiler" - makes any
    > claims to making your program run faster. Aren't they all just for
    > encryption (aka, code security) purposes?

    Don't think so. The performance reference hint I gave was from A. Sumner
    (the author of awka) and you can inspect that all at awka's Sourceforge
    page.


    > BTW, all this talk by you and your c.l.a friend which are of the strain
    > "Why don't you just use GAWK like we do?" are misguided.

    You have some misconception here; the two persons who suggested GNU Awk
    in this thread were Ed and you.

    I mentioned the performance results and pointed out the optimizations
    that happened in GNU Awk during the past 20+ years since the performance
    tests. (Even those old tests had a comment that it might be outdated by
    the actual awk releases tested.)

    But the OP's argument is anyway strange, WRT speed, and also WRT using discontinued software, and with his assumption that Ed and you are not
    really aware what median-calculation would require, so it's not really
    worth engaging more here in this thread.

    Janis

  • From Kenny McCormack@21:1/5 to janis_papanagnou@hotmail.com on Thu Aug 12 11:09:55 2021
    In article <sf2up4$do5$1@news-1.m-online.net>,
    Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    ...
    > You have some misconception here; the two persons who suggested GNU Awk
    > in this thread were Ed and you.

    Really? I have repeatedly said "If you are using TAWK and are happy with
    it, you should stick with it." I don't recall ever recommending he switch
    to GAWK.

    Eddie has, of course, certainly done so.

    --

    "This ain't my first time at the rodeo"

    is a line from the movie, Mommie Dearest, said by Joan Crawford at a board meeting.

  • From Kenny McCormack@21:1/5 to janis_papanagnou@hotmail.com on Thu Aug 12 11:25:59 2021
    In article <sf2u06$dj4$1@news-1.m-online.net>,
    Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
    > [ please quote context if posting in Usenet ]
    >
    > On 12.08.2021 10:10, Laurent MANCHON wrote:
    >> I think Awka has been discontinued for a long time now
    >> (http://awka.sourceforge.net/download.html),
    >> and I'm not sure whether it works with the latest versions of gcc.
    >
    > It's discontinued, yes. (And I haven't tried to compile it with
    > the latest gcc.)
    >
    > But isn't Tawk - that you use on Windows - also discontinued?
    > (So I've heard, at least, for many years.)

    Yes, but it doesn't matter (in the case of TAWK).

    Yes, I know that one of the first commandments of using software is that
    you can't use software that isn't being maintained. Your PHB will can your ass!

    And it looks like AWKa fits the mold. Since AWKa is basically a shim
    between GAWK and GCC, you'd have to verify that it works with the current
    versions of both of those pieces of software. Since it is not being
    maintained, it almost certainly isn't compatible with one or both of them.

    TAWK is different, though. Since it is:
    1) (almost) Perfect
    and
    2) Entirely standalone
    the fact that it is not being maintained is irrelevant.

    BTW, I said (almost) above because there is one area where I prefer GAWK.
    That is when dealing with files with very long lines - in my work, this involves lines of several hundred thousands of bytes. TAWK fails badly if
    your input lines are too long - and I say "fails badly" because it doesn't generate error messages; it just generates incorrect results.

    There are workarounds, but it is a PIA - and, of course, you have to notice
    the incorrect results (and convince yourself that the bug is not in *your* code) in order to know to deploy the workarounds.

    BTW, I don't use TAWK much anymore, because I don't use Windows much
    anymore, but when I do use Windows, I tend to use (Cygwin) GAWK, because:
    1) The line length problem mentioned above.
    2) Compatibility. I can develop on Linux and deploy on Windows.

    --
    The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/IceCream

  • From Laurent MANCHON@21:1/5 to All on Thu Aug 12 05:13:18 2021
    I don't work with very long lines, but I do work with very big text files with millions of rows.
    In my opinion the main drawback of the whole awk family is string concatenation: it takes too much time,
    and this is what I have observed.
    Maybe it's common to all languages; I don't know how C handles it.

  • From Laurent MANCHON@21:1/5 to All on Thu Aug 12 05:31:41 2021
    Typically this kind of concatenation:
    ...
    if (!(list[i])) { list[i] = array[i,j]; }
    else { list[i] = list[i] SUBSEP array[i,j]; }

  • From Janis Papanagnou@21:1/5 to Laurent MANCHON on Thu Aug 12 14:24:39 2021
    On 12.08.2021 14:13, Laurent MANCHON wrote:
    > I don't work with very long lines, but I do work with very big text files with millions of rows.
    > In my opinion the main drawback of the whole awk family is string concatenation: it takes too much time,
    > and this is what I have observed.

    Do you mean arbitrary string value concatenations, or adding strings
    to an existing string? The latter, i.e. x = x a b c ..., has in GNU
    Awk an optimization that makes it very fast.
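
    The second form, appending to an existing string, might look like the
    sketch below in portable awk. It makes no claim about speed in any
    particular implementation. The wrapper name group_values, the comma
    separator, and the "key value" input layout are assumptions chosen for
    illustration; the "in" test is used so that a first value of 0 is not
    mistaken for an empty entry.

```sh
# Read "key value" lines on stdin and print each key followed by all of
# its values, joined with commas, in first-seen key order.
group_values() {
    awk '
        {
            if (!($1 in list)) {        # first value seen for this key
                order[++n] = $1
                list[$1] = $2
            } else {                    # append to the existing string
                list[$1] = list[$1] "," $2
            }
        }
        END { for (i = 1; i <= n; i++) print order[i] ": " list[order[i]] }'
}
```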

    Janis

    > Maybe it's common to all languages; I don't know how C handles it.



  • From Ed Morton@21:1/5 to All on Fri Aug 13 11:06:50 2021
    On 8/12/2021 7:13 AM, Laurent MANCHON wrote:

    [When posting please include enough context that your post makes sense stand-alone - this is usenet, not a forum]

    > I don't work with very long lines, but I do work with very big text files with millions of rows.
    > In my opinion the main drawback of the whole awk family is string concatenation: it takes too much time,
    > and this is what I have observed.
    > Maybe it's common to all languages; I don't know how C handles it.



    C handles it by requiring you to allocate enough memory up front when
    you declare your variables for the maximum that might be required for
    that variable. So in C you do something like (a fragment, using
    strcpy() and strcat() from <string.h>):

    char x[50];          /* allocate 50 chars of space in memory for x */
    strcpy(x, "foo");    /* the first 3 chars of x are populated with "foo" */
    strcat(x, "bar");    /* the first 6 chars of x are populated with "foobar" */

    since x already has enough memory to add "bar" to the end, but you don't
    declare variables in awk, so the equivalent is:

    x = "foo"      # allocate 3 chars of space in memory for x and
                   # populate it with "foo".
    x = x "bar"    # allocate 6 chars of space in memory for x, populate it
                   # with "foobar", and change the reference for "x" to
                   # point to the new memory location if required.

    It's probably not exactly that simple and I'm sure gawk at least has
    some optimizations for it, but that gives you an idea of why awk has more
    work to do than C when concatenating a string to an existing variable -
    static memory allocation for it in C vs dynamic memory allocation in awk.

    Ed.
