• /dev/stdin on an Event Stream

    From Stephen@21:1/5 to All on Fri Jun 11 22:33:57 2021
    I have a gawk program that takes an Events Stream as input:

    $CURL -s https://domain.com/path-to-stream | prog.awk

    prog.awk
    ----
    while ((getline line < "/dev/stdin") > 0) {
    <does a lot of stuff>
    }

    So far it is working great (yay awk), but I wonder, because "does a
    lot of stuff" might take a few seconds and I don't want to drop any of
    the incoming stream, which can be a lot of data. I'm not even sure what
    is doing the buffering: the OS via the pipe? Curl? Awk? Can data be
    lost without my knowing? Thanks for any insight.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Stephen on Sat Jun 12 10:12:17 2021
    On 12.06.2021 07:33, Stephen wrote:
    > I have a gawk program that takes an Events Stream as input:
    >
    > $CURL -s https://domain.com/path-to-stream | prog.awk
    >
    > prog.awk
    > ----
    > while ((getline line < "/dev/stdin") > 0) {
    > <does a lot of stuff>
    > }

    This fragment (with <does a lot of stuff> expanded) is not a correct
    awk program on its own: statements have to sit inside a rule such as
    BEGIN { ... }. (Is that code in a BEGIN block? Is there a #! line at
    the top?)

    Why don't you just write

    { line = $0 ; <does a lot of stuff> }
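
    The suggestion above can be sketched as follows. This is only an
    illustration: sample_events and process_events are hypothetical names,
    the three sample lines stand in for the real stream, and the print
    statement stands in for <does a lot of stuff>.

```shell
#!/bin/sh
# Hypothetical stand-in for the real event stream: three lines of text.
sample_events() {
    printf 'event one\nevent two\nevent three\n'
}

# The main rule (a bare { ... } block) runs once per input record, so no
# explicit getline loop is needed; awk reads standard input by itself.
process_events() {
    awk '{ line = $0; print NR ": " line }'
}

sample_events | process_events
```

    Each input line is handed to the rule in turn, so the body only has to
    deal with one record at a time.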

    > So far it is working great (yay awk), but I wonder, because "does a
    > lot of stuff" might take a few seconds and I don't want to drop any of
    > the incoming stream, which can be a lot of data. I'm not even sure what
    > is doing the buffering: the OS via the pipe? Curl? Awk? Can data be
    > lost without my knowing? Thanks for any insight.


    What format does the "Events Stream" data have? Is it line-oriented
    text? Binary data? (Is the awk text processor the right tool for that
    data?)

    All data from curl (assuming that is what's hidden in the CURL shell
    variable) will be passed to awk; neither the OS nor awk "drops" data
    passed through the pipe.
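
    A small experiment can make the pipe behaviour visible. It is a sketch
    under assumptions: produce and consume_slowly are hypothetical names,
    seq stands in for curl, and the one-second sleep stands in for slow
    per-event work. It says nothing about the remote server, which may
    apply its own policy if the client reads too slowly.

```shell
#!/bin/sh
# Fast producer: writes 200000 lines as quickly as it can.
produce() {
    seq 1 200000
}

# Slow consumer: waits one second before reading anything, then counts
# the lines it received. The sleep stands in for slow per-event work.
consume_slowly() {
    sleep 1
    awk 'END { print NR }'
}

# While the consumer sleeps, the pipe buffer fills and the producer blocks
# in its write; nothing is discarded, so every line is counted.
produce | consume_slowly
```

    The count printed at the end equals the number of lines written, since
    a full pipe makes the writer wait rather than lose data.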

    Janis

  • From Ed Morton@21:1/5 to Stephen on Sat Jun 12 08:56:40 2021
    On 6/12/2021 12:33 AM, Stephen wrote:
    > I have a gawk program that takes an Events Stream as input:
    >
    > $CURL -s https://domain.com/path-to-stream | prog.awk
    >
    > prog.awk
    > ----
    > while ((getline line < "/dev/stdin") > 0) {
    > <does a lot of stuff>
    > }
    >
    > So far it is working great (yay awk), but I wonder, because "does a
    > lot of stuff" might take a few seconds and I don't want to drop any of
    > the incoming stream, which can be a lot of data. I'm not even sure what
    > is doing the buffering: the OS via the pipe? Curl? Awk? Can data be
    > lost without my knowing? Thanks for any insight.


    If "does a lot of stuff" takes some seconds, then that "stuff" is almost
    certainly not just manipulating text, so I suspect you're trying to use
    awk like a shell and calling external tools from it. That's probably a
    bad idea. If you'd like help, post a minimal, complete script along with
    concise, testable sample input (the output of curl) and the expected
    output.

    Ed.
