• AWK As A Major Systems Programming Language (1/2)

    From Ben Collver@21:1/5 to All on Sun Aug 18 00:28:21 2024
    AWK As A Major Systems Programming Language
    ===========================================
    by Arnold Robbins, June, 2024

    Preface
    =======
    I started this paper in 2013, and in 2015 sent it out for review to
    the people listed later on. After incorporating comments, I sent it
    to Rik Farrow, the editor of the USENIX magazine ;login: to see if he
    would publish it. He declined to do so, for reasonably good reasons.

    The paper languished, forgotten, until early 2018 when I came across
    it and decided to polish it off, put it up on GitHub, and make it
    available from my home page in HTML.

    In 2024, I took a fresh look at it, and decided to polish it a little
    bit more.

    If you are interested in language design and evolution in general,
    and in Awk in particular, I hope you will enjoy reading this paper.
    If not, then why are you bothering looking at it now?

    Arnold Robbins
    Nof Ayalon, ISRAEL
    June, 2024

    1 Introduction
    ==============
    At the March 1991 USENIX conference, Henry Spencer presented a paper
    entitled AWK As A Major Systems Programming Language. In it, he
    described his experiences using the original version of awk to write
    two significant "systems" programs--a clone for a reasonable subset
    of the nroff formatter [1], and a simple parser generator.

    He described what awk did well, as well as what it didn't, and
    presented a list of things that awk would need to acquire in order to
    take the position of a reasonable alternative to C for systems
    programming tasks on Unix systems.

    In particular, awk lies about in the middle of the spectrum between
    C, which is "close to the metal," and the shell, which is quite
    high-level. A language at this level that is useful for doing
    systems programming is very desirable.

    This paper reviews Henry's wish list, and describes some of the
    events that have occurred in the Unix/Linux world since 1991. It
    presents a case that gawk--GNU Awk--fills most of the major needs
    Henry listed way back in 1991, and then describes the author's
    opinion as to why other languages have successfully filled the
    systems programming role which awk did not. It discusses how the
    current version of gawk may finally be able to join the ranks of
    other popular, powerful, scripting languages in common use today, and
    ends off with some counter-arguments and the author's responses to
    them.

    Acknowledgements
    ----------------
    Thanks to Andrew Schorr, Henry Spencer, Nelson H.F. Beebe, and Brian
    Kernighan for reviewing an earlier draft of this paper.

    2 That Was Then ...
    ===================
    In this section we review the state of the Unix world in 1991, as
    well as the state of awk, and then list what Henry Spencer saw as
    missing for awk.

    * The Unix World in 1991
    * What Awk Lacked In 1991

    2.1 The Unix World in 1991
    ==========================
    Undoubtedly, many readers of this paper were not using computers in
    1991, so this section provides the context in which Henry's paper was
    written. In March of 1991:

    * Commercial Unix systems were the norm, with offerings from AT&T,
    Digital Equipment Corporation, Hewlett Packard, IBM, Sun
    Microsystems, and many others, all vying for market share.
    Microsoft Windows existed, but was primarily a layer on top of
    MS-DOS and was not taken seriously.
    * Very few sites still ran the original Bell Labs or direct-from-UCB
    variants of Unix; those did not keep up with the available hardware
    and AT&T was itself trying to succeed in the Unix hardware market.
    * GNU/Linux did not exist! Some unencumbered BSD variants were
    available, but they were still under the cloud of the AT&T/UCB
    lawsuit. [2]
    * So-called "new" awk was about 2.5 years old. The book by Aho,
    Weinberger and Kernighan was published in October of 1987, so most
    people knew about new awk, but they just couldn't get it.

    Who could? New awk was available to educational institutions from
    the Bell Labs research group, and to those who had Unix source
    licenses for System V Releases 3.1, 3.2, and 4. By this time,
    source licensees were an extremely rare breed, since the cost for
    commercial licenses had skyrocketed, and even for educational
    licensees it had increased greatly. [3] If I recall correctly, an
    educational license cost around US $1,000, considerably more than
    the earlier Unix licenses.

    * PERL [4] existed and was starting to gain in popularity. In 1991,
    "PERL" most likely meant PERL 3 or a very early version of PERL 4.
    The World Wide Web, which was one of the major reasons for PERL's
    growth in popularity, had not yet really taken off.
    * Other implementations of new awk were available:
    + MKS Awk for PC systems (MS-DOS).
    + GNU Awk was available and relatively stable, but could not be
    called "solid."

    The problem with the first of these is that source code was not
    available. And the latter came with (to quote Henry) "troublesome
    licenses." (Actually, Henry no longer remembers whether his
    statement about "troublesome licenses" referred to the GPL, or to
    the Bell Labs source licenses.)

    * Michael Brennan's mawk (also GPL'ed) was not yet available. Version
    1.0 was accepted for posting in comp.sources.reviewed on September
    30, 1991, half a year after Henry's paper was published.

    2.2 What Awk Lacked In 1991
    ===========================
    Here is a summary of what was wrong with the awk picture in 1991.
    These are in the same order as presented in Henry's paper. We
    qualify each issue in order to later discuss how it has been
    addressed over time.

    * New awk was not widely available. Most Unix vendors still shipped
    only old awk. (Here is where he mentions that "the
    independently-available implementations either cost substantial
    amounts of money or come with troublesome [sic] licenses.") His
    point then was that for portability, awk programs had to be
    restricted to old awk.

    This could be considered a quality of implementation issue,
    although it's really a "lack of available implementation"
    issue.

    * There is no way to tell awk to start matching all its patterns over
    again against the existing $0. This is a language design issue.
    * There is no array assignment. (Language design issue.)
    * Getting an error message out to standard error is difficult.
    (Implementation issue.)
    * There is no precise language specification for awk. This leads to
    gratuitous portability problems. This too is thus a quality of
    implementation issue, in that without a specification, it's
    difficult to produce uniform, high quality implementations.
    * The existing widely available implementation is slow; a much faster
    implementation is needed and the best thing of all would be an
    optimizing compiler. (Implementation issue.)
    * There is no awk-level debugger. (Support tool or quality of
    implementation issue.)
    * There is no awk-level profiler. (Support tool or quality of
    implementation issue.)

    In private email, Henry added the following items, saying "there are
    a couple more things I'd add now, in hindsight." These are direct
    quotes:

    * [I can't believe I didn't discuss this in the paper, because I was
    certainly aware of it then!] Lack of any convenient mechanism for
    adding libraries. When awk is being invoked from a shell file, the
    shell file can do substitutions or use multiple -f options, but
    those are mechanisms outside the language, and not very convenient
    ones. What's really wanted is something like you get in Python
    etc., where one little statement up near the top says "arrange for
    this program to have the xyz library available when it runs."
    * I think it was Rob Pike who later said (roughly): "It says
    something bad about Awk that in a language with integrated regular
    expressions, you end up using substr() so often." My paper did
    allude to the difficulty of finding out where something matched in
    old-awk programs, but even in new awk, what you get is a number
    that you then have to feed to substr(). The language could really
    use some more convenient way of dissecting a string using regexp
    matching. [Caveat: I have not looked lately at Gawk to see if it
    has one.]

    The first of these is somewhere between a language design and a
    language implementation issue. The latter is a language design issue.

    3 ... And This Is Now
    =====================
    Fast forward to 2024. Where do things stand?

    * What Awk Has Today
    * And What GNU Awk Has Today
    * So Where Does Awk Stand?

    3.1 What Awk Has Today
    ======================
    The state of the awk world is much better now. In the same order:

    * New awk is the standard version of awk today on GNU/Linux, BSD, and
    commercial Unix systems. The one notable exception is Solaris,
    where /usr/bin/awk is still the old one; on all other systems,
    plain awk is some version of new awk.
    * There remains no way to tell awk to start matching all its patterns
    over again against the existing $0. Furthermore, this is a feature
    that has not been called for by the awk community, except in
    Henry's paper. (We do acknowledge that this might be a useful
    feature.)
    * There continues to be no array assignment. However, this function
    in gawk, which has arrays of arrays, can do the trick nicely. It is
    also efficient, since gawk uses reference counted strings
    internally:

    function copy_array(dest, source,    i, count)
    {
        delete dest

        for (i in source) {
            if (typeof(source[i]) == "array")
                count += copy_array(dest[i], source[i])
            else {
                dest[i] = source[i]
                count++
            }
        }

        return count
    }

    * Getting error messages out is easier. All modern systems have a
    /dev/stderr special file to which error messages may be sent
    directly. gawk, mawk and Brian Kernighan's awk all have
    "/dev/stderr" built in for I/O redirections, so even on systems
    without a real /dev/stderr special file, you can still send error
    messages to standard error.
    * Perhaps most important of all, with the POSIX standard, there is a
    formal standard specification for awk. As with all formal
    standards, it isn't perfect. But it provides an excellent starting
    point, as well as chapter and verse to cite when explaining the
    behavior of a standards-compliant version of awk.
    <https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html>

    Additionally, the second edition of The AWK Programming Language is
    now available.

    * There are a number of freely available implementations, with
    different licenses, such that everyone ought to be able to find a
    suitable one:

    * Brian Kernighan's awk is the direct lineal descendant of Unix
    awk. He calls it the "One True Awk" (sic). It is available from
    GitHub:

    $ git clone https://github.com/onetrueawk/awk bwkawk

    * GNU Awk, gawk, is available from the Free Software Foundation.
    You may use an HTTPS downloader:
    <https://ftp.gnu.org/gnu/gawk/gawk-5.3.0.tar.gz> is the current
    version. There may be a newer one.

    * Michael Brennan's awk, known as mawk. In 2009, Thomas Dickey took
    on mawk maintenance. Basic information is available on the
    project's web page. The download URL is
    <https://invisible-island.net/datafiles/release/mawk.tar.gz>

    In 2017 Michael published a beta of mawk 2.0. It's available from
    the project's GitHub page.
    <https://github.com/mikebrennan000/mawk-2>

    * MKS Awk was used for Solaris's /usr/xpg4/bin/awk, which is their
    standards-compliant version of new awk. For a while it was
    available as part of Open Solaris, but is no longer so. Some
    years ago, we were able to make this version compile and run on
    GNU/Linux after just a few hours work.

    Although Open Solaris is now history, the Illumos project does
    make the MKS Awk available. You can view the files one at a time
    from
    <https://github.com/joyent/illumos-joyent/blob/master/usr/src/
    cmd/awk_xpg4>
    <https://illumos.org/>

    * Other, more esoteric versions as well. See the Wikipedia article,
    and also the gawk documentation.
    <https://en.wikipedia.org/wiki/Awk_language
    #Versions_and_implementations>
    <https://www.gnu.org/software/gawk/manual/html_node/
    Other-Versions.html#Other-Versions>
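
    The "/dev/stderr" item in the list above can be sketched as follows.
    This is only an illustration: the input records and the wording of
    the message are made up, and it assumes an awk that accepts
    "/dev/stderr" as a redirection target, as the three implementations
    named above do.

```shell
# Records marked "bad" produce a diagnostic on standard error;
# every other record is copied to standard output.
prog='$1 == "bad" { print "error: bad record at line " NR > "/dev/stderr"; next }
{ print }'

# Capture the two streams separately to show where each message goes.
out=$(printf 'ok\nbad\n' | awk "$prog" 2>/dev/null)
err=$(printf 'ok\nbad\n' | awk "$prog" 2>&1 >/dev/null)

printf 'stdout: %s\nstderr: %s\n' "$out" "$err"
```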

    3.2 And What GNU Awk Has Today
    ==============================
    The more difficult of the quality of implementation issues are
    addressed by gawk. In particular:

    * Beginning with version 4.0 in 2011, gawk provides an awk-level
    debugger which is modeled after GDB. This is a full debugger, with
    breakpoints, watchpoints, single statement stepping and expression
    evaluation capabilities. (Older versions had a separate executable
    named dgawk. Today it's built into regular gawk.)
    * gawk has provided an awk-level statement profiler for many years
    (pgawk). Although there is no direct correlation with CPU time
    used, the statement level profiler remains a powerful tool for
    understanding program behavior.
    * Since version 4.0, gawk has had an '@include' facility whereby gawk
    goes and finds the named awk source program. For much longer it has
    searched for files specified with -f along the path named by the
    AWKPATH environment variable. The '@include' mechanism also uses
    AWKPATH.
    * In terms of getting at the pieces of text matched by a regular
    expression, gawk provides an optional third argument to the match()
    function. This argument is an array which gawk fills in with both
    the matched text for the full regexp and subexpressions, and index
    and length information for use with substr(). gawk also provides
    the gensub() general substitution function, an enhanced version of
    the split() function, and the patsplit() function for specifying
    contents instead of separators using a regexp.

    While gawk has almost always been faster than Brian Kernighan's awk,
    performance improvements (a byte-code based execution engine and
    internal improvements in array indexing) bring it closer to mawk's
    performance level.

    And gawk clearly has the most features of any version, many of which
    considerably increase the power of the language.

    3.3 So Where Does Awk Stand?
    ============================
    Despite all of the above, gawk is not as popular as other scripting
    languages. Since 1991, we can point to four major scripting languages
    which have enjoyed, or currently enjoy, differing levels of
    popularity: PERL, tcl/tk, Python, and Ruby. We think it is fair to
    say that Python is the most popular scripting language in the third
    decade of the 21st century.

    Is awk, as we've described it up to this point, now ready to compete
    with the other languages? Not quite yet.

    4 Key Reasons Why Other Languages Have Gained Popularity
    ========================================================
    In retrospect, it seems clear (at least to us!) that there are two
    major reasons that all of the previously mentioned languages have
    enjoyed significant popularity. The first is their extensibility. The
    second is namespace management.

    One certainly cannot attribute their popularity to improved syntax.
    In the opinion of many, PERL and Ruby both suffer from terrible
    syntax. Tcl's syntax is readable but nothing special. Python's syntax
    is elegant, although slightly unusual. The point here is that they
    all differ greatly in syntax, and none really offers the clean
    pattern–action paradigm that is awk's trademark, yet they are all
    popular languages.

    If not syntax, then what? We believe that their popularity stems from
    the fact that all of these languages are easily extensible. This is
    true with both "modules" in the scripting language, and more
    importantly, with access to C level facilities via dynamic library
    loading.

    Furthermore, these languages allow you to group related functions and
    variables into packages or modules: they let you manage the namespace.

    awk, on the other hand, has always been closed. An awk program cannot
    even change its working directory, much less open a connection to an
    SQL database or a socket to a server on the Internet somewhere
    (although gawk can do the latter).

    If one examines the number of extensions available for PERL on CPAN,
    or for Python such as PyQt or the Python tk bindings, it becomes
    clear that extensibility is the real key to power (and from there to
    popularity).

    Furthermore, in awk, all global variables and functions share a
    single namespace. This prevents many good software development
    practices based on the principle of information hiding.

    To summarize: A reasonable language definition, efficient
    implementations, debuggers and profilers are necessary but not
    sufficient for true power. The final ingredients are extensibility
    and namespaces.

    5 Filling The Extensibility Gap
    ===============================
    With version 4.1, gawk (finally) provides a defined C API for
    extending the core language.

    * API Overview
    * Discussion
    * Future Work

    5.1 API Overview
    ================
    The API makes it possible to write functions in C or C++ that are
    callable from an awk program as if the function were written in awk.
    The most straightforward way to think of these functions is as
    user-defined functions that happen to be implemented in a different
    language.

    The API provides the following facilities:

    * Structures that map awk string, numeric, and undefined values into
    C types that can be worked with.
    * Management of function parameters, including the ability to convert
    a parameter whose original type is undefined, into an array. That
    is, there is full call-by-reference for arrays. Scalars are passed
    by value, of course.
    * Access to the symbol table. Extension functions can read all awk
    variables, and create and update new variables. As an initial,
    relatively arbitrary design decision, extensions cannot update
    special variables such as NR or NF, with the single exception of
    PROCINFO.
    * Full array management, including the ability to create arrays, and
    arrays of arrays, and the ability to add and delete elements from
    an array. It is also possible to "flatten" an array into a data
    structure that makes it simple for C code to loop over all the
    elements of an array.
    * The ability to run a procedure when gawk exits. This is
    conceptually the same as the C atexit() function.
    * Hooks into the built-in I/O redirection mechanisms in gawk. In
    particular, there are separate facilities for input redirections
    with getline and '<', output redirections with print or printf and
    '>' or '>>', and two-way pipelines with gawk's '|&' operator.

    5.2 Discussion
    ==============
    Considerable thought went into the design of the API. The gawk
    documentation provides a full description of the API itself, with
    examples (over 50 pages worth!), as well as some discussion of the
    goals and design decisions behind the API (in an appendix). The
    development was done over the course of about a year and a half,
    together with the developers of xgawk, a fork of gawk that added
    features that made using extensions easier, and included an extension
    for processing XML files in a way that fit naturally with the
    pattern–action paradigm. While it may not be perfect, the gawk
    developers feel that it is a good start.

    <https://www.gnu.org/software/gawk/manual/html_node/
    Dynamic-Extensions.html#Dynamic-Extensions>

    <https://www.gnu.org/software/gawk/manual/html_node/
    Extension-Design.html#Extension-Design>

    FIXME: Henry Spencer suggests adding more info on the API and on the
    design decisions. I think this paper is long enough, and the full doc
    is quite big. It'd be hard to pull API doc into this paper in a
    reasonable fashion, although it would be possible to review some of
    the design decisions. Comments?

    The major xgawk additions to the C code base have been merged into
    gawk, and the extensions from that project have been rewritten to use
    the new API. As a result, the xgawk project developers renamed their
    project gawkextlib, and the project now provides only extensions. [5]

    It is notable that functions written in awk can do a number of things
    that extension functions cannot, such as modify any variables, do
    I/O, call awk built-in functions, and call other user-defined
    functions.

    While it would certainly be possible to provide APIs for all of these
    features for extension functions, this seemed to be overkill.
    Instead, the gawk developers took the view that extension functions
    should provide access to external facilities, and provide
    communication to the awk level via function parameters and/or global
    variables, including associative arrays, which are the only real
    data structure.

    Consider a simple example. The standard du program can recursively
    walk one or more arbitrary file hierarchies, call stat() to retrieve
    file information, and then sum up the blocks used. In the process, du
    must track hard links, so that no file is accounted for or reported
    more than once.

    The 'filefuncs' extension shipped with gawk provides a stat()
    function that takes a pathname and fills in an associative array with
    the information retrieved from stat(). The array elements have names
    like "size", "mtime" and so on, with corresponding appropriate
    values. (Compare this to PERL's stat() function that returns a
    linearly-indexed array!)

    The fts() function in the 'filefuncs' extension builds on stat() to
    create a multidimensional array of arrays that describes the
    requested file hierarchies, with each element being an array filled
    in by stat(). Directories are arrays containing elements for each
    directory entry, with an element named "." for the array
    itself.

    Given that fts() does the heavy lifting, du can be written quite
    nicely, and quite portably [6], in awk. See Awk Code For du, for the
    code, which weighs in at under 250 lines. Much of this is comments
    and argument parsing.

    <http://www.skeeve.com/awk-sys-prog.html#du-in-awk>

    5.3 Future Work
    ===============
    The extension facility is relatively new, and undoubtedly has
    introduced new "dark corners" into gawk. These remain to be uncovered
    and any new bugs need to be shaken out and removed.

    Some issues are known and may not be resolvable. For example, 64-bit
    integer values such as the timestamps in stat() data on modern
    systems don't fit into awk's 64-bit double-precision numbers which
    only have 53 bits of significand. This is also a problem for the
    bit-manipulation functions.
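
    The 53-bit limit is easy to observe. The sketch below assumes an
    awk that stores numbers as IEEE 754 double-precision values, which
    is the usual case (gawk's optional arbitrary-precision mode, -M,
    behaves differently).

```shell
out=$(awk 'BEGIN {
    big = 2^53               # 9007199254740992
    # Adding 1 cannot be represented, so the sum rounds back to big.
    print (big + 1 == big)
}')
printf '%s\n' "$out"
```

    This prints 1 (true): above 2^53, two distinct 64-bit integers can
    map to the same awk number.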

    With respect to namespaces, in 2017 I (finally) figured out how
    namespaces in awk ought to work to provide the needed functionality
    while retaining backwards compatibility. This was released with gawk
    5.0.

    One or two of the sample extensions shipped with gawk and in
    gawkextlib have been modified to take advantage of namespaces.

    6 Counterpoints
    ===============
    Brian Kernighan raised several counterpoints in response to an
    earlier draft of the paper. They are worth addressing (or at least
    trying to):

    I'm not 100% convinced by your basic premise, that the lack of an
    extension mechanism is the main / a big reason why Awk isn't used
    for the kinds of system programming tasks that Perl, Python, etc.,
    are. It's absolutely a factor--without such a mechanism, there's
    just no way to do a lot of important computations. But how does
    that trade off against just having built-in mechanisms for the core
    system programming facilities (as Perl does) or a handful of core
    libraries like sys, os, regex, etc., for Python?

    I think that Perl's original inclusion of most of the Unix system
    calls was, from a language design standpoint, ultimately a mistake.
    At the time it was first done, there was no other choice: dynamic
    loading of libraries didn't exist on Unix systems in the early and
    mid-1980s (nor did shared libraries, for that matter). But having all
    those built-in functions bloats the language, making it harder to
    learn, document, and maintain, and I definitely did not wish to go
    down that path for gawk.

    With respect to Python, the question is: how are those libraries
    implemented? Are they built-in to the interpreter and separated from
    the "core" language simply by the language design? Or are they
    dynamically loaded modules?

    If the latter, that sounds like an argument for the case of having
    extensions, not against it. And indeed, this merely emphasizes the
    point made at the end of the previous section, which is that to make
    an extension facility really scalable, you also need some sort of
    namespace / module capability.

    Thus, Brian is correct: an extension facility is needed, but the last
    part of the puzzle would be a module facility in the language. I
    think that I have solved this, and invite the curious reader to
    check out the current versions of gawk.

    I'm also not convinced that Awk is the right language for writing
    things that need extensions. It was originally designed for
    1-liners, and a lot of its constructs don't scale up to bigger
    programs. The notation for function locals is appalling (all my
    fault too, which makes it worse). There's little chance to recover
    from random spelling mistakes and typos; the use of mere adjacency
    for concatenation looks ever more like a bad idea.

    This is hard to argue with. Nonetheless, gawk's --lint option may be
    of help here, as well as the --dump-variables option which produces a
    list of all variables used in the program.

    Awk is fine for its original purpose, but I find myself writing
    Python for anything that's going to be bigger than say 10-20 lines
    unless the lines are basically just longer pattern-action
    sequences. (That notation is a win, of course, which you point
    out.)

    I have worked for several years in Python. For string manipulation
    and processing records, you still have to write all the manual stuff:
    open the file, read lines in a loop, split them, etc. Awk does all
    this stuff for me.

    Additionally, I think that with discipline, it's possible to write
    fairly good-sized, understandable and maintainable awk programs; in
    my experience awk does scale up well beyond the one-liner range.

    Not to mention that Brian published (twice now!) a whole book of awk
    programs larger than one line. :-) (See the Resources section.)

    Some of my own, good-sized awk programs are available from GitHub:

    The TexiWeb Jr. literate programming system
    -------------------------------------------
    See <https://github.com/arnoldrobbins/texiwebjr>. The suite has two
    programs that total over 1,300 lines of awk. (They share some code.)

    Prepinfo
    --------
    See <https://github.com/arnoldrobbins/prepinfo>. This script
    processes Texinfo files, updating menus as needed. This version is
    rewritten in TexiWeb Jr.; it's about 350 lines of awk.

    Sortmail
    --------
    See <https://github.com/arnoldrobbins/sortmail>. This script sorts a
    Unix mbox format mailbox by thread. I use it daily. It's also written
    in TexiWeb Jr. and is about 330 lines of awk.

    Brian continues:

    The du example is awfully big, though it does show off some of the
    language features. Could you get the same mileage with something
    quite a bit shorter?

    My definition of "small" and "big" has changed over time. 250 lines
    may be big for a script, but the du.awk program is much smaller than
    a full implementation in C: GNU du is over 1,100 lines of C, plus all
    the libraries it relies upon in the GNU Coreutils.

    With respect to shorter examples, nothing springs to mind
    immediately. However, gawk comes with several useful extensions that
    are worth exploring, much more than we've covered here.

    For example, the readdir extension in the gawk distribution causes
    gawk to read directories and return one record per directory entry in
    an easy-to-parse format:

    $ gawk -lreaddir '{ print }' .
    -| 2109292/mail.mbx/f
    -| 2109295/awk-sys-prog.texi/f
    -| 2100007/./d
    -| 2100056/texinfo.tex/f
    -| 2100055/cleanit/f
    -| 2109282/awk-sys-prog.pdf/f
    -| 2100009/du.awk/f
    -| 2100010/.git/d
    -| 2098025/../d
    -| 2109294/ChangeLog/f

    How cool is that?!? :-)

    Also, the gawkextlib project provides some very interesting
    extensions. Of particular interest are the XML and JSON extensions,
    but there are a number of others, and it's worth checking out.

    In 2018 I wrote here:

    In short, it's too early to really tell. This is the beginning of
    an experiment. I hope it will be a fun journey for me, the other
    gawk maintainers, and the larger community of awk users.

    In 2024, I have to say that extensions haven't particularly caught
    on. This saddens me, but it seems to be typical of awk users that
    they use what's in the language and aren't interested in extending
    it, or they don't know that they can. Sigh.

    7 Conclusion
    ============
    It has taken much longer than any awk fan would like, but finally,
    GNU Awk fills in almost all the gaps listed by Henry Spencer for awk
    to be really useful as a systems programming language.

    In addition, experience from other popular languages has shown that
    extensibility and namespaces are the keys to true power, usability,
    and popularity.

    With the release of gawk 4.1, we feel that gawk (and thus the Awk
    language) is now almost on par with the basic capabilities of other
    popular languages. With gawk 5.0, we hope(d) to truly reach par.

    Is it too late in the game? In 2024, sadly, it does seem to be. But
    at least I had fun adding the new features to gawk.

    I hope that this paper will have piqued your curiosity, and that you
    will take the time to give gawk a fresh look.

    Appendix A Resources
    ====================
    1. The AWK Programming Language Paperback, second edition,
    Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger.
    Addison-Wesley, 2023. ISBN-13: 978-0138269722,
    ISBN-10: 0138269726.
    2. Effective awk Programming, fourth edition. Arnold Robbins.
    O'Reilly Media, 2015. ISBN-13: 978-1491904619,
    ISBN-10: 1491904615.
    3. Online version of the gawk documentation:
    <https://www.gnu.org/software/gawk/manual/>
    4. The gawkextlib project:
    <https://sourceforge.net/projects/gawkextlib/>

    Appendix B Awk Code For du
    ==========================
    Here is the du program, written in Awk. Besides demonstrating the power
    of the stat() and fts() extensions and gawk's multidimensional
    arrays, it also shows the switch statement and the built-in bit
    manipulation functions and(), or(), and compl().

    The output is not identical to GNU du's, since filenames are not
    sorted. However, gawk's built-in sorting facilities should make
    sorting the output straightforward; we leave that as the traditional
    "exercise for the reader."

    #! /usr/local/bin/gawk -f

    # du.awk --- write POSIX du utility in awk.
    # See https://pubs.opengroup.org/onlinepubs/9699919799/utilities/du.html
    #
    # Most of the heavy lifting is done by the fts() function in the "filefuncs"
    # extension.
    #
    # We think this conforms to POSIX, except for the default block size, which
    # is set to 1024. Following GNU standards, set POSIXLY_CORRECT in the
    # environment to force 512-byte blocks.
    #
    # Arnold Robbins
    # arnold@skeeve.com

    @include "getopt"
    @load "filefuncs"

    BEGIN {
        FALSE = 0
        TRUE = 1

        BLOCK_SIZE = 1024 # Sane default for the past 30 years
        if ("POSIXLY_CORRECT" in ENVIRON)
            BLOCK_SIZE = 512 # POSIX default

        compute_scale()

        fts_flags = FTS_PHYSICAL
        sum_only = FALSE
        all_files = FALSE

        while ((c = getopt(ARGC, ARGV, "aHkLsx")) != -1) {
            switch (c) {
            case "a":
                # report size of all files
                all_files = TRUE;
                break
            case "H":
                # follow symbolic links named on the command line
                fts_flags = or(fts_flags, FTS_COMFOLLOW)
                break
            case "k":
                BLOCK_SIZE = 1024 # 1K block size
                break
            case "L":
                # follow all symbolic links

                # fts_flags &= ~FTS_PHYSICAL
                fts_flags = and(fts_flags, compl(FTS_PHYSICAL))

                # fts_flags |= FTS_LOGICAL
                fts_flags = or(fts_flags, FTS_LOGICAL)
                break
            case "s":
                # do sums only
                sum_only = TRUE
                break
            case "x":
                # don't cross filesystems
                fts_flags = or(fts_flags, FTS_XDEV)

    [continued in next message]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stefan Ram@21:1/5 to Ben Collver on Sun Aug 18 11:07:50 2024
    Ben Collver <bencollver@tilde.pink> wrote or quoted:
    AWK As A Major Systems Programming Language

    A systems programming language, in my book, is one you can
    crank out device drivers in and tap into the platform ABI.

    In retrospect, it seems clear (at least to us!) that there are two
    major reasons that all of the previously mentioned languages have
    enjoyed significant popularity. The first is their extensibility. The
    second is namespace management.

    That totally makes me think of the "Zen of Python":

    |The Zen of Python, by Tim Peters
    |
    |Beautiful is better than ugly.
    |Explicit is better than implicit.
    |Simple is better than complex.
    |Complex is better than complicated.
    |Flat is better than nested.
    |Sparse is better than dense.
    |Readability counts.
    |Special cases aren't special enough to break the rules.
    |Although practicality beats purity.
    |Errors should never pass silently.
    |Unless explicitly silenced.
    |In the face of ambiguity, refuse the temptation to guess.
    |There should be one-- and preferably only one --obvious way to do it.
    |Although that way may not be obvious at first unless you're Dutch.
    |Now is better than never.
    |Although never is often better than *right* now.
    |If the implementation is hard to explain, it's a bad idea.
    |If the implementation is easy to explain, it may be a good idea.
    |Namespaces are one honking great idea -- let's do more of those!
    .

    I have worked for several years in Python. For string manipulation
    and processing records, you still have to write all the manual stuff:
    open the file, read lines in a loop, split them, etc. Awk does all
    this stuff for me.
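
    A minimal sketch of the "manual stuff" in question, not taken from the
    paper: summing the second whitespace-separated column, which in awk is
    the one-liner awk '{ s += $2 } END { print s }'. The helper name
    sum_column and the sample data are hypothetical, and unlike awk this
    sketch raises an error on a non-numeric field instead of treating it
    as 0.

```python
import io


def sum_column(lines, column):
    """Sum one whitespace-separated column (1-based, like awk's $N)."""
    total = 0.0
    for line in lines:              # awk reads each record implicitly
        fields = line.split()       # awk splits each record into $1, $2, ...
        if len(fields) >= column:   # skip blank or short lines
            total += float(fields[column - 1])
    return total


# Hypothetical sample data standing in for a real input file.
sample = io.StringIO("apples 3\npears 4.5\n\nplums 2\n")
print(sum_column(sample, 2))  # prints 9.5
```

    With a real file, the caller would also have to open it (for example
    "with open(path) as f: sum_column(f, 2)"), which is one more step that
    awk handles on its own.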

    On the flip side, you can peep it like this: Python's got a solid
    set of statement types you can use for everything, making the code
    hella readable. Meanwhile, awk's got its bag of tricks for special
    cases like file and string processing. Just compare [1] with [2].

    [1]

    #!/usr/bin/awk -f

    # This AWK script analyzes a simple CSV file containing book information:
    # Title,Author,Year,Price

    BEGIN {
        FS = ","
        print "Book Analysis Report"
        print "===================="
    }

    {
        if (NR > 1) { # Skip header row
            total_price += $4
            if ($3 < min_year || min_year == 0) min_year = $3
            if ($3 > max_year) max_year = $3

            author_count[$2]++
            year_count[$3]++
        }
    }

    END {
        print "\nTotal number of books:", NR - 1
        print "Average book price: $" sprintf("%.2f", total_price / (NR - 1))
        print "Year range:", min_year, "to", max_year

        print "\nBooks per author:"
        for (author in author_count)
            print author ":", author_count[author]

        print "\nBooks per year:"
        for (year in year_count)
            print year ":", year_count[year]
    }

    [2]

    #!/usr/bin/env python3

    import csv
    from dataclasses import dataclass
    from collections import Counter
    from typing import List, Dict, Tuple

    @dataclass
    class Book:
        title: str
        author: str
        year: int
        price: float

    class BookAnalyzer:
        def __init__(self, books: List[Book]):
            self.books = books

        def total_books(self) -> int:
            return len(self.books)

        def average_price(self) -> float:
            return sum(book.price for book in self.books) / len(self.books)

        def year_range(self) -> Tuple[int, int]:
            years = [book.year for book in self.books]
            return min(years), max(years)

        def books_per_author(self) -> Dict[str, int]:
            return Counter(book.author for book in self.books)

        def books_per_year(self) -> Dict[int, int]:
            return Counter(book.year for book in self.books)

    def read_csv(filename: str) -> List[Book]:
        with open(filename, 'r') as f:
            reader = csv.reader(f)
            next(reader) # Skip header row
            return [Book(title, author, int(year), float(price))
                    for title, author, year, price in reader]

    def print_report(analyzer: BookAnalyzer) -> None:
        print("Book Analysis Report")
        print("====================")
        print(f"\nTotal number of books: {analyzer.total_books()}")
        print(f"Average book price: ${analyzer.average_price():.2f}")
        min_year, max_year = analyzer.year_range()
        print(f"Year range: {min_year} to {max_year}")

        print("\nBooks per author:")
        for author, count in analyzer.books_per_author().items():
            print(f"{author}: {count}")

        print("\nBooks per year:")
        for year, count in analyzer.books_per_year().items():
            print(f"{year}: {count}")

    def main() -> None:
        books = read_csv("books.csv")
        analyzer = BookAnalyzer(books)
        print_report(analyzer)

    if __name__ == "__main__":
        main()

  • From Computer Nerd Kev@21:1/5 to Stefan Ram on Mon Aug 19 08:04:11 2024
    Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    I have worked for several years in Python. For string manipulation
    and processing records, you still have to write all the manual stuff:
    open the file, read lines in a loop, split them, etc. Awk does all
    this stuff for me.

    On the flip side, you can peep it like this: Python's got a solid
    set of statement types you can use for everything, making the code
    hella readable. Meanwhile, awk's got its bag of tricks for special
    cases like file and string processing. Just compare [1] with [2].

    [1]
    [snip]

    Weird, as someone who doesn't use much of Python or AWK, I look at
    your examples as clearly demonstrating that the AWK version is much
    easier to read. I wonder if the Python version is more complicated
    than it needs to be actually.

    --
    __ __
    #_ < |\| |< _#

  • From Lawrence D'Oliveiro@21:1/5 to Ben Collver on Sun Aug 18 23:36:28 2024
    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

  • From Richard Kettlewell@21:1/5 to Computer Nerd Kev on Mon Aug 19 08:40:11 2024
    not@telling.you.invalid (Computer Nerd Kev) writes:
    Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    On the flip side, you can peep it like this: Python's got a solid set
    of statement types you can use for everything, making the code hella
    readable. Meanwhile, awk's got its bag of tricks for special cases
    like file and string processing. Just compare [1] with [2].

    [1]
    [snip]

    Weird, as someone who doesn't use much of Python or AWK, I look at
    your examples as clearly demonstrating that the AWK version is much
    easier to read. I wonder if the Python version is more complicated
    than it needs to be actually.

    Yes, you could do a direct translation from the Awk and end up with
    something that looked quite similar, apart from the differences in
    syntax.

    --
    https://www.greenend.org.uk/rjk/

  • From Stefan Ram@21:1/5 to Richard Kettlewell on Mon Aug 19 11:55:16 2024
    Richard Kettlewell <invalid@invalid.invalid> wrote or quoted:
    Yes, you could do a direct translation from the Awk and end up with
    something that looked quite similar, apart from the differences in
    syntax.

    Trying to make that Python script more user-friendly for folks
    who dig awk, then rolling out a sleeker version . . .

    #!/usr/bin/env python3

    import sys
    import csv

    # Initialize variables
    total_price = 0
    min_year = 0
    max_year = 0
    author_count = {}
    year_count = {}

    # Print the header of the report
    print("Book Analysis Report")
    print("====================")

    # Read CSV data from standard input
    reader = csv.reader(sys.stdin)
    next(reader) # Skip the header row

    # Process each row in the CSV
    for row in reader:
        title, author, year, price = row
        year = int(year)
        price = float(price)

        # Accumulate total price
        total_price += price

        # Determine min and max year
        if min_year == 0 or year < min_year:
            min_year = year
        if year > max_year:
            max_year = year

        # Count books per author
        if author in author_count:
            author_count[author] += 1
        else:
            author_count[author] = 1

        # Count books per year
        if year in year_count:
            year_count[year] += 1
        else:
            year_count[year] = 1

    # Calculate total number of books
    total_books = sum(author_count.values())

    # Print the report
    print("\nTotal number of books:", total_books)
    print("Average book price: $%.2f" % (total_price / total_books))
    print("Year range:", min_year, "to", max_year)

    print("\nBooks per author:")
    for author, count in author_count.items():
        print(author + ":", count)

    print("\nBooks per year:")
    for year, count in year_count.items():
        print(str(year) + ":", count)

    No need to make it such a big production, though . . .

    import csv, sys
    from collections import Counter

    with open(sys.argv[1]) as f:
        data = list(csv.reader(f))[1:]
        prices = [float(row[3]) for row in data]
        years = [int(row[2]) for row in data]
        authors = [row[1] for row in data]

    print("Book Analysis Report\n====================")
    print(f"\nTotal number of books: {len(data)}")
    print(f"Average book price: ${sum(prices) / len(prices):.2f}")
    print(f"Year range: {min(years)} to {max(years)}")

    print("\nBooks per author:")
    for author, count in Counter(authors).items():
        print(f"{author}: {count}")

    print("\nBooks per year:")
    for year, count in Counter(years).items():
        print(f"{year}: {count}")

  • From Stefan Ram@21:1/5 to Stefan Ram on Mon Aug 19 12:26:12 2024
    ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
    Trying to make that Python script more user-friendly for folks
    who dig awk, then rolling out a sleeker version . . .

    The setup with a class and a bunch of named functions had some perks,
    though: The class and function names are like a cheat sheet, spilling
    the beans on what the coder was up to. They keep things chill by
    breaking up the work, each one zeroing in on just one part of the gig.
    Plus, they give you a solid jumping-off point for unit tests.
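
    As a minimal sketch of that last point (not code from the thread): a
    unit test for a standalone average_price() helper. The Book dataclass
    is repeated from the first Python version so that the block runs on
    its own; the helper, the test class, and the sample books are
    hypothetical.

```python
import unittest
from dataclasses import dataclass
from typing import List


@dataclass
class Book:
    title: str
    author: str
    year: int
    price: float


def average_price(books: List[Book]) -> float:
    """Mean price of the given books (same formula as the class version)."""
    return sum(book.price for book in books) / len(books)


class AveragePriceTest(unittest.TestCase):
    def test_two_books(self):
        books = [Book("A", "X", 2000, 10.0), Book("B", "Y", 2010, 20.0)]
        self.assertAlmostEqual(average_price(books), 15.0)

    def test_empty_list_raises(self):
        # With no books the division is by zero; the test pins that down.
        with self.assertRaises(ZeroDivisionError):
            average_price([])


# Run the tests without calling sys.exit(), unlike unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AveragePriceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

    The flat, awk-style script has no such seam: to test its arithmetic
    you would have to feed it a whole CSV file and parse the printed
    report.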

  • From Richard Kettlewell@21:1/5 to Stefan Ram on Mon Aug 19 18:24:04 2024
    ram@zedat.fu-berlin.de (Stefan Ram) writes:
    ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
    Trying to make that Python script more user-friendly for folks
    who dig awk, then rolling out a sleeker version . . .

    The setup with a class and a bunch of named functions had some perks,
    though: The class and function names are like a cheat sheet, spilling
    the beans on what the coder was up to. They keep things chill by
    breaking up the work, each one zeroing in on just one part of the gig.
    Plus, they give you a solid jumping-off point for unit tests.

    Yes, it’s not that it was a bad design as such, but if you’re trying to
    do some kind of conciseness comparison with Awk, it wasn’t really
    comparing like with like.

    --
    https://www.greenend.org.uk/rjk/

  • From D@21:1/5 to Stefan Ram on Mon Aug 19 19:42:10 2024
    On Mon, 19 Aug 2024, Stefan Ram wrote:

    Richard Kettlewell <invalid@invalid.invalid> wrote or quoted:
    Yes, you could do a direct translation from the Awk and end up with
    something that looked quite similar, apart from the differences in
    syntax.

    Trying to make that Python script more user-friendly for folks
    who dig awk, then rolling out a sleeker version . . .

    Looks pretty neat and understandable to me. =)

    #!/usr/bin/env python3

    import sys
    import csv

    # Initialize variables
    total_price = 0
    min_year = 0
    max_year = 0
    author_count = {}
    year_count = {}

    # Print the header of the report
    print("Book Analysis Report")
    print("====================")

    # Read CSV data from standard input
    reader = csv.reader(sys.stdin)
    next(reader) # Skip the header row

    # Process each row in the CSV
    for row in reader:
        title, author, year, price = row
        year = int(year)
        price = float(price)

        # Accumulate total price
        total_price += price

        # Determine min and max year
        if min_year == 0 or year < min_year:
            min_year = year
        if year > max_year:
            max_year = year

        # Count books per author
        if author in author_count:
            author_count[author] += 1
        else:
            author_count[author] = 1

        # Count books per year
        if year in year_count:
            year_count[year] += 1
        else:
            year_count[year] = 1

    # Calculate total number of books
    total_books = sum(author_count.values())

    # Print the report
    print("\nTotal number of books:", total_books)
    print("Average book price: $%.2f" % (total_price / total_books))
    print("Year range:", min_year, "to", max_year)

    print("\nBooks per author:")
    for author, count in author_count.items():
        print(author + ":", count)

    print("\nBooks per year:")
    for year, count in year_count.items():
        print(str(year) + ":", count)

    No need to make it such a big production, though . . .

    import csv, sys
    from collections import Counter

    with open(sys.argv[1]) as f:
        data = list(csv.reader(f))[1:]
        prices = [float(row[3]) for row in data]
        years = [int(row[2]) for row in data]
        authors = [row[1] for row in data]

    print("Book Analysis Report\n====================")
    print(f"\nTotal number of books: {len(data)}")
    print(f"Average book price: ${sum(prices) / len(prices):.2f}")
    print(f"Year range: {min(years)} to {max(years)}")

    print("\nBooks per author:")
    for author, count in Counter(authors).items():
        print(f"{author}: {count}")

    print("\nBooks per year:")
    for year, count in Counter(years).items():
        print(f"{year}: {count}")


  • From Computer Nerd Kev@21:1/5 to Stefan Ram on Tue Aug 20 07:43:11 2024
    Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
    Trying to make that Python script more user-friendly for folks
    who dig awk, then rolling out a sleeker version . . .

    The setup with a class and a bunch of named functions had some perks,
    though: The class and function names are like a cheat sheet, spilling
    the beans on what the coder was up to. They keep things chill by
    breaking up the work, each one zeroing in on just one part of the gig.
    Plus, they give you a solid jumping-off point for unit tests.

    I see your point. Depends on what you're trying to read too really
    - what the script's author is trying to achieve, or what exactly
    the script does. To achieve the latter there's more to parse in the
    first version in order to see what's actually happening, and that's
    what I was comparing to the AWK version.

    --
    __ __
    #_ < |\| |< _#

  • From Anton Shepelev@21:1/5 to All on Wed Aug 21 19:18:08 2024
    Stefan Ram:

    |The Zen of Python, by Tim Peters
    |[...]
    |Flat is better than nested.
    [...]

    Then why does Python lack the ultimate code flattener, the
    `goto' operator, and mandates structural indentation?

    --
    () ascii ribbon campaign -- against html e-mail
    /\ www.asciiribbon.org -- against proprietary attachments

  • From Lawrence D'Oliveiro@21:1/5 to Anton Shepelev on Wed Aug 21 23:59:08 2024
    On Wed, 21 Aug 2024 19:18:08 +0300, Anton Shepelev wrote:

    Stefan Ram:

    |The Zen of Python, by Tim Peters
    |[...]
    |Flat is better than nested.
    [...]

    Then why does Python lack the ultimate code flattener, the `goto'
    operator, and mandates structural indentation?

    I never took that “flat is better than nested” nonsense seriously,
    anyway. If I need to nest in my Python code, I nest: statement
    nesting, function nesting, class nesting, whatever makes sense.

  • From Johanne Fairchild@21:1/5 to Lawrence D'Oliveiro on Tue Aug 27 19:43:28 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

  • From yeti@21:1/5 to Johanne Fairchild on Tue Aug 27 23:50:59 2024
    Johanne Fairchild <jfairchild@tudado.org> writes:

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    \o/

    --
    I do not bite, I just want to play.

  • From Lawrence D'Oliveiro@21:1/5 to Johanne Fairchild on Tue Aug 27 23:36:29 2024
    On Tue, 27 Aug 2024 19:43:28 -0300, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and
    presented a list of things that awk would need to acquire in order to
    take the position of a reasonable alternative to C for systems
    programming tasks on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Have you given up motor-cars in favour of horse-drawn transport as well?

  • From Johanne Fairchild@21:1/5 to Lawrence D'Oliveiro on Tue Aug 27 21:05:23 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Tue, 27 Aug 2024 19:43:28 -0300, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and
    presented a list of things that awk would need to acquire in order to
    take the position of a reasonable alternative to C for systems
    programming tasks on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Have you given up motor-cars in favour of horse-drawn transport as well?

    I'm sure there are places where motor-cars are totally useless and a
    horse is a blessing.

  • From D@21:1/5 to Johanne Fairchild on Wed Aug 28 16:25:49 2024
    On Tue, 27 Aug 2024, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.


    Sometimes less is more. It's aesthetics for sure, but for me personally, I
    do not like massive languages that try to do, and be, everything. For fun
    I thought about to have a look at Lua, or possibly, go.

  • From Anton Shepelev@21:1/5 to All on Wed Aug 28 18:37:45 2024
    D:

    Sometimes less is more. It's aesthetics for sure, but for
    me personally, I do not like massive languages that try to
    do, and be, everything. For fun I thought about to have a
    look at Lua, or possibly, go.

    Also, Hare: <https://harelang.org/> .

    --
    () ascii ribbon campaign -- against html e-mail
    /\ www.asciiribbon.org -- against proprietary attachments

  • From Johanne Fairchild@21:1/5 to nospam@example.net on Fri Aug 30 18:56:47 2024
    D <nospam@example.net> writes:

    On Tue, 27 Aug 2024, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about to have a look at Lua, or
    possibly, go.

    Lua is a nice language, but it's really small.

  • From D@21:1/5 to Johanne Fairchild on Sat Aug 31 11:56:55 2024
    On Fri, 30 Aug 2024, Johanne Fairchild wrote:

    D <nospam@example.net> writes:

    On Tue, 27 Aug 2024, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about to have a look at Lua, or
    possibly, go.

    Lua is a nice language, but it's really small.


    Ah! So maybe Lua would be my next hobby language to learn. =)

  • From Johanne Fairchild@21:1/5 to nospam@example.net on Sat Aug 31 12:28:57 2024
    D <nospam@example.net> writes:

    [...]

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about to have a look at Lua, or
    possibly, go.

    Lua is a nice language, but it's really small.


    Ah! So maybe Lua would be my next hobby language to learn. =)

    I don't know why people enjoy such small things so much. Sure, it feels
    good to master something, but it's also good to have tools that can do a
    lot for us. I used to be very minimalist, but following my heart
    instead of my brain I found Common Lisp to be the nicest of them all.
    But, sure, if I'm going to embed a language in my executable, Lua is a
    candidate, though I'd probably go for a small Lisp.

  • From D@21:1/5 to Johanne Fairchild on Sat Aug 31 22:34:55 2024
    On Sat, 31 Aug 2024, Johanne Fairchild wrote:

    D <nospam@example.net> writes:

    [...]

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about to have a look at Lua, or
    possibly, go.

    Lua is a nice language, but it's really small.


    Ah! So maybe Lua would be my next hobby language to learn. =)

    I don't know why people enjoy such small things so much. Sure, it feels
    good to master something, but it's also good to have tools that can do a
    lot for us. I used to be very minimalist, but following my heart
    instead of my brain I found Common Lisp to be the nicest of them all.
    But, sure, if I'm going to embed a language in my executable, Lua is a
    candidate, though I'd probably go for a small Lisp.


    In my case, it is because I am not working as a programmer, so I have
    no requirements to be productive or to be able to generate any income
    from programming.

    That means, I can just have a go at what I find interesting and
    beautiful. It also means that if I get bored, I can just stop, and
    have a look at any other language that catches my interest. =)

  • From Johanne Fairchild@21:1/5 to nospam@example.net on Sun Sep 1 22:52:10 2024
    D <nospam@example.net> writes:

    On Sat, 31 Aug 2024, Johanne Fairchild wrote:

    D <nospam@example.net> writes:

    [...]

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about to have a look at Lua, or
    possibly, go.

    Lua is a nice language, but it's really small.


    Ah! So maybe Lua would be my next hobby language to learn. =)

    I don't know why people enjoy such small things so much. Sure, it feels
    good to master something, but it's also good to have tools that can do a
    lot for us. I used to be very minimalist, but following my heart
    instead of my brain I found Common Lisp to be the nicest of them all.
    But, sure, if I'm going to embed a language in my executable, Lua is a
    candidate, though I'd probably go for a small Lisp.


    In my case, it is because I am not working as a programmer, so I have
    no requirements to be productive or to be able to generate any income
    from programming.

    I am not programming for profit any longer. Thank God. I program for
    beauty now. This change has been the hardest thing I had to do and it's
    been so worth it.

    That means, I can just have a go at what I find interesting and
    beautiful. It also means that if I get bored, I can just stop, and
    have a look at any other language that catches my interest. =)

    :-)

    When I felt comfortable with C, I decided to choose a higher-level
    language that could be my main medium of expression. I chose Lisp.
    It's been taking me lots and lots of years, but it's also been so worth
    it. I came up with the Lisp idea observing Barry Margolin here on
    USENET many years ago. It was his discussions on Lisp that brought the
    language to my attention. I had no idea it existed and I felt it was
    really strange that intelligent people would care so much about such an
    old thing. Newbies.

  • From Lawrence D'Oliveiro@21:1/5 to Johanne Fairchild on Mon Sep 2 03:36:47 2024
    On Sun, 01 Sep 2024 22:52:10 -0300, Johanne Fairchild wrote:

    I had no idea [Lisp] existed and I felt it was
    really strange that intelligent people would care so much about such an
    old thing.

    It is a language that still looks advanced compared to the state of the
    art today. Trouble is, there is no currently workable Lisp “standard” as
    such: first of all there is the “Lisp-1” (e.g. Scheme) versus “Lisp-2”
    (e.g. Common Lisp, Elisp) schism; secondly the closest thing to a Lisp
    “standard” was Common Lisp, and that is really old and full of baggage
    from the time when non-Posix-like operating systems were common. Even
    Scheme seems to have become fragmented.

  • From D@21:1/5 to Johanne Fairchild on Mon Sep 2 09:45:10 2024
    On Sun, 1 Sep 2024, Johanne Fairchild wrote:

    In my case, it is because I am not working as a programmer, so I have
    no requirements to be productive or to be able to generate any income
    from programming.

    I am not programming for profit any longer. Thank God. I program for
    beauty now. This change has been the hardest thing I had to do and it's
    been so worth it.

    Why? How was it to work as a programmer and what was it that you
    didn't like about it? When I graduated from university, I wanted to
    become a programmer, but at that time, only 10+ years of experience was
    wanted on the job market, so life decided that I should work in
    infrastructure/system administration instead.

  • From Johanne Fairchild@21:1/5 to nospam@example.net on Mon Sep 2 11:30:28 2024
    D <nospam@example.net> writes:

    On Sun, 1 Sep 2024, Johanne Fairchild wrote:

    In my case, it is because I am not working as a programmer, so I have
    no requirements to be productive or to be able to generate any income
    from programming.

    I am not programming for profit any longer. Thank God. I program for
    beauty now. This change has been the hardest thing I had to do and it's
    been so worth it.

    Why? How was it to work as a programmer and what was it that you
    didn't like about it?

    I never worked on obviously interesting systems. (There was only one
    exceptional project that I was hired to do and I felt I was doing the
    type of programming that I would call cool programming. This was one of
    the last commercial projects I worked on. By then, I was already an
    independent contractor, not an employee, so this project does not even
    count as something I did while an employee in a company.) Over the
    years, I felt I was just contributing to the profit of the company owner
    and nothing else---not even my satisfaction was being rewarded, except
    for the bill-paying type of satisfaction (if you will).

    Unfortunately, to pay bills I had to spend more than I wanted of my life
    as a company employee. I had to explicitly design an operation to do a
    career change and that was really worth it.

    When I graduated from university, I wanted to become a programmer, but
    at that time, only 10+ years of experience was wanted on the job
    market, so life decided that I should work in infrastructure/system
    administration instead.

    I always thought of system administration as a programming job. In
    fact, a fun one. Initially I wanted to be a UNIX system administrator.
    But my professional life began in a web world when most jobs I could get
    were all web related. Deep web projects always involve UNIX
    programming, but I was never really hired for deep projects. As a
    result, I kept doing web programming to pay bills. So I had to study
    and invent projects in order to study the other sides of computer
    science so I would not spend my life with technology and culture I did
    not even appreciate. That actually paid off. For the first time in my
    life, I can say I really like my job.

  • From D@21:1/5 to Johanne Fairchild on Mon Sep 2 18:31:03 2024
    On Mon, 2 Sep 2024, Johanne Fairchild wrote:

    When I graduated from university, I wanted to become a programmer, but
    at that time, only 10+ years of experience was wanted on the job
    market, so life decided that I should work in infrastructure/system
    administration instead.

    I always thought of system administration as a programming job. In
    fact, a fun one. Initially I wanted to be a UNIX system administrator.

    Yes, having worked as one, I can see that. For me, the pleasure was
    always in automation, and the quick feedback loops. I would work on a
    piece of the infra-stack, automate as much as possible, and you can do
    that in small cycles of days and weeks, instead of the endless bug
    hunting the developers at one of my jobs did, in some kind of million+
    line CAD software. I always got the feeling talking with them, that
    their job would never end, and you would only see small,
    micro-incremental improvements stretching over years.

    Meanwhile, I'd happily automate my systems, deployments, reports,
    statistics etc. so yes, some kind of programming always was there during
    my time as a linux/unix system administrator.

    But my professional life began in a web world when most jobs I could get
    were all web related. Deep web projects always involve UNIX
    programming, but I was never really hired for deep projects. As a
    result, I kept doing web programming to pay bills. So I had to study
    and invent projects in order to study the other sides of computer
    science so I would not spend my life with technology and culture I did
    not even appreciate. That actually paid off. For the first time in my
    life, I can say I really like my job.

    Happy to hear it! =)

  • From Johanne Fairchild@21:1/5 to nospam@example.net on Mon Sep 2 17:31:33 2024
    D <nospam@example.net> writes:

    On Mon, 2 Sep 2024, Johanne Fairchild wrote:

    When I graduated from university, I wanted to become a programmer, but
    at that time, only 10+ years of experience was wanted on the job
    market, so life decided that I should work in infrastructure/system
    administration instead.

    I always thought of system administration as a programming job. In
    fact, a fun one. Initially I wanted to be a UNIX system administrator.

    Yes, having worked as one, I can see that. For me, the pleasure was
    always in automation, and the quick feedback loops. I would work on a
    piece of the infra-stack, automate as much as possible, and you can do
    that in small cycles of days and weeks, instead of the endless bug
    hunting the developers at one of my jobs did, in some kind of million+
    line CAD software. I always got the feeling talking with them, that
    their job would never end, and you would only see small,
    micro-incremental improvements stretching over years.

    Meanwhile, I'd happily automate my systems, deployments, reports,
    statistics etc. so yes, some kind of programming always was there during
    my time as a linux/unix system administrator.

    I recognize all of the above. But I think there's an even stronger
    point for system administration back then. When I got introduced to
    UNIX systems, it was a time where there were UNIX users and people would
    still share the system. So UNIX administrators did programming that
    everyone around the system noticed. There were mailing lists, NNTP
    servers and IRC servers so that people living in the same area could
    talk to each other on a daily basis. Getting online and seeing there
    were people online too was a joy.

    The web evolved and computers became cheap, so everyone got their own
    and that seems to have isolated everyone. Instead of talking to your
    neighbor, you'd then interact with a lot of people across the world.
    System administrators got buried. We only notice their presence now
    when things go completely wrong. Today, the new generation of
    programmers have not even heard of W. Richard Stevens. I have no idea
    how they understand the systems they use.

    You offer a shell account to a ``tweenager'' and they decline---thanks,
    but no, thanks. ``I have my own system.'' They see no fun in sharing
    in a UNIX system.

  • From Stefan Ram@21:1/5 to Johanne Fairchild on Mon Sep 2 20:51:34 2024
    Johanne Fairchild <jfairchild@tudado.org> wrote or quoted:
    You offer a shell account to a ``tweenager'' and they decline---thanks,
    but no, thanks. ``I have my own system.'' They see no fun in sharing
    in a UNIX system.

    On some shell accounts here, "social commands" (like "finger",
    "who", etc.) have been disabled. It might have something to
    do with the "Datenschutz" ("privacy") laws.

    The admins also do not seem to use "motd" anymore, instead
    system information seems to be published on some web page.

    I had one free shell account about 20 years ago on a system where
    you could log in and play nethack. I think the highscore list
    was shared.

  • From D@21:1/5 to Johanne Fairchild on Mon Sep 2 23:08:53 2024
    On Mon, 2 Sep 2024, Johanne Fairchild wrote:

    Yes, having worked as one, I can see that. For me, the pleasure was
    always in automation, and the quick feedback loops. I would work on a
    piece of the infra-stack, automate as much as possible, and you can do
    that in small cycles of days and weeks, instead of the endless bug
    hunting the developers at one of my jobs did, in some kind of million+
    line CAD software. I always got the feeling talking with them, that
    their job would never end, and you would only see small,
    micro-incremental improvements stretching over years.

    Meanwhile, I'd happily automate my systems, deployments, reports,
    statistics etc. so yes, some kind of programming always was there during
    my time as a linux/unix system administrator.

    I recognize all of the above. But I think there's an even stronger
    point for system administration back then. When I got introduced to
    UNIX systems, it was a time where there were UNIX users and people would
    still share the system. So UNIX administrators did programming that
    everyone around the system noticed. There were mailing lists, NNTP
    servers and IRC servers so that people living in the same area could
    talk to each other on a daily basis. Getting online and seeing there
    were people online too was a joy.

    Interesting point. Yes, I think there is a strong case for the system
    administrator to have been put back into the closet. At many big
    universities and companies, the types of services you mention have
    been outsourced and are purchased "as a service". The system
    administrator only takes care of backend systems, and probably the only
    ones who do interact with him are the developers and/or devops people
    (unless the system administrator is of course christened devops at that
    company).

    Of course there are retro-types who still enjoy email, mailing lists,
    usenet, gopher and irc, but they are few and far between. So I can
    definitely see your point here.

    The web evolved and computers became cheap, so everyone got their own
    and that seems to have isolated everyone. Instead of talking to your
    neighbor, you'd then interact with a lot of people across the world.
    System administrators got buried. We only notice their presence now
    when things go completely wrong. Today, the new generation of
    programmers have not even heard of W. Richard Stevens. I have no idea
    how they understand the systems they use.

    At the risk of disappointing you, I have no idea who Richard Stevens is.
    ;) In terms of collaboration, I think for me and my generation, there
    were still shared spaces, but self-hosting at that time, on the internet,
    was out of reach for people who did not go to university. My start in
    self-hosting was the humble BBS, and that was an excellent technology
    for building a community that also had a local touch.

    You offer a shell account to a ``tweenager'' and they decline---thanks,
    but no, thanks. ``I have my own system.'' They see no fun in sharing
    in a UNIX system.

    Really? I think you must meet more teenagers! I teach, and each class is
    roughly divided into thirds. 1/3 don't know what to do in life, and just
    sit there. Very tough for a teacher to motivate them. 1/3 at least want
    to pass. They are not naturals, but fight through, and a few of them do
    discover the passion. Then you have the students that create passion in
    the teacher. The top 1/3 (actually I'd say probably closer to 15%-20%).
    They take to the whole self-hosting, sysadmin culture like ducks to
    water, they explore the packages, they set up their own servers, they
    collaborate in teams, so the student who has Gbps internet at home sets
    up a server (or laptop) that the others all log in to, they create their
    own netflix, their own spotify, they play around with nextcloud creating
    their own OneDrive and collaboration services.

    I still remember one student who came to me 10 months after he started
    saying that learning the terminal was the single best thing he ever
    learned about computers. All his life, he had pointed and clicked, and
    he never realized he could be that efficient and achieve all those
    things (his own netflix, spotify etc.) with free software and linux.

    So I think there is still a movement, and lots of interest, but I think
    that there are perhaps not enough people teaching these things.

    What I see in a lot of schools, is plenty of Azure and AWS consultants,
    lobbying for the schools dropping linux and moving to "serverless", but
    there is hope! I teach the opposite, so there's at least one person
    fighting that trend. ;)

  • From yeti@21:1/5 to Johanne Fairchild on Mon Sep 2 23:15:32 2024
    Johanne Fairchild <jfairchild@tudado.org> writes:

    You offer a shell account to a ``tweenager'' and they decline---thanks,
    but no, thanks. ``I have my own system.'' They see no fun in sharing
    in a UNIX system.

    Why not mesh up Peernixens instead of joining Pubnixens? Federating
    with digital neighbours. So most of your stuff would stay at home and
    only what you want to publish appears somewhere else. Maybe limit this
    to SMTP and NNTP in the beginning and allow MIME posts in some
    hierarchies. A safe backbone[0] would take some stress from all the
    other protocols, so none of them would need to have SSL/TLS baked in.
    In such a context mail would be a lightweight service again.

    ____________

    0: SSH? TINC? Tor hidden services? ...

    --
    thejuicemedia
    Honest Government Ad | 🇯🇵 Japan v. Paul Watson 🐋 <https://www.youtube.com/watch?v=QqzOAyXSJMI>

  • From candycanearter07@21:1/5 to nospam@example.net on Mon Sep 2 23:50:04 2024
    D <nospam@example.net> wrote at 09:56 this Saturday (GMT):


    On Fri, 30 Aug 2024, Johanne Fairchild wrote:

    D <nospam@example.net> writes:

    On Tue, 27 Aug 2024, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about having a look at Lua, or
    possibly, Go.

    Lua is a nice language, but it's really small.


    Ah! So maybe Lua would be my next hobby language to learn. =)


    I learned some lua to make aseprite scripts, it is pretty neat but it is
    a bit frustrating to learn (like how specifically instance functions
    MUST be called with :, while static functions are called with .)

    On the other hand, aseprite lua has actually worked consistently unlike
    krita's python
    --
    user <candycane> is generated from /dev/urandom

  • From Lawrence D'Oliveiro@21:1/5 to All on Tue Sep 3 02:09:14 2024
    On Mon, 2 Sep 2024 23:50:04 -0000 (UTC), candycanearter07 wrote:

    On the other hand, aseprite lua has actually worked consistently unlike
    krita's python

    Isn’t it neat how all the major open-source content-creation apps offer a
    Python API?

    Easily the most extensive of them all has to be Blender: more extensive
    even than the scripting API of any proprietary app.

  • From D@21:1/5 to Stefan Ram on Tue Sep 3 10:09:22 2024
    On Mon, 2 Sep 2024, Stefan Ram wrote:

    Johanne Fairchild <jfairchild@tudado.org> wrote or quoted:
    You offer a shell account to a ``tweenager'' and they decline---thanks,
    but no, thanks. ``I have my own system.'' They see no fun in sharing
    in a UNIX system.

    On some shell accounts here, "social commands" (like "finger",
    "who", etc.) have been disabled. It might have something to
    do with the "Datenschutz" ("privacy") laws.

    The admins also do not seem to use "motd" anymore, instead
    system information seems to be published on some web page.

    I had one free shell account about 20 years ago on a system where
    you could log in and play nethack. I think the highscore list
    was shared.


    Nethack... now there's a game I haven't heard about in ages! I know for a
    time, it was quite popular among my circle of acquaintances but it must
    have been around the 90s somewhere.

  • From D@21:1/5 to All on Tue Sep 3 10:11:18 2024
    On Mon, 2 Sep 2024, candycanearter07 wrote:

    D <nospam@example.net> wrote at 09:56 this Saturday (GMT):


    On Fri, 30 Aug 2024, Johanne Fairchild wrote:

    D <nospam@example.net> writes:

    On Tue, 27 Aug 2024, Johanne Fairchild wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    On Sun, 18 Aug 2024 00:28:21 -0000 (UTC), Ben Collver wrote:

    He described what awk did well, as well as what it didn't, and presented
    a list of things that awk would need to acquire in order to take the
    position of a reasonable alternative to C for systems programming tasks
    on Unix systems.

    It was soon obsoleted by Perl, which did everything Awk did, just as
    concisely, and more besides.

    Funny---I gave up on Perl as soon as I discovered the existence of AWK.

    Actually it was after I read ``The AWK Programming Language''.

    Sometimes less is more. It's aesthetics for sure, but for me
    personally, I do not like massive languages that try to do, and be,
    everything. For fun I thought about having a look at Lua, or
    possibly, Go.

    Lua is a nice language, but it's really small.


    Ah! So maybe Lua would be my next hobby language to learn. =)


    I learned some lua to make aseprite scripts, it is pretty neat but it is
    a bit frustrating to learn (like how specifically instance functions
    MUST be called with :, while static functions are called with .)

    On the other hand, aseprite lua has actually worked consistently unlike
    krita's python


    Interesting. Seems like every language has its quirks here and there. On
    the other hand, since I'm not a professional, I never tend to hit the
    really weird stuff, unless it's built in from the start.

  • From candycanearter07@21:1/5 to Lawrence D'Oliveiro on Thu Sep 5 15:10:04 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote at 02:09 this Tuesday (GMT):
    On Mon, 2 Sep 2024 23:50:04 -0000 (UTC), candycanearter07 wrote:

    On the other hand, aseprite lua has actually worked consistently unlike
    krita's python

    Isn’t it neat how all the major open-source content-creation apps offer a
    Python API?

    Easily the most extensive of them all has to be Blender: more extensive
    even than the scripting API of any proprietary app.


    Yeah, it is cool, but krita's was objectively broken (at least for me)
    (on version 5.1.5) and I don't use blender much.
    --
    user <candycane> is generated from /dev/urandom
