Re: [Python-ideas] yield functionality to match that of await

    From Chris Angelico@21:1/5 to Dom Grigonis on Tue Jun 13 08:58:25 2023
    On Tue, 13 Jun 2023 at 08:29, Dom Grigonis <dom.grigonis@gmail.com> wrote:
    I don't know if there's a full explanation written down, but one
    important reason is that you can refactor async functions without
    worrying about suddenly changing their behaviour unintentionally. If
    you happen to refactor out the last "yield" from a generator, suddenly
    it's not a generator any more, and you'll have a weird time trying to figure out what happened; but an async function is an async function
    even if it doesn't (currently) have any await points in it.
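
    (A minimal sketch of the hazard described above, with made-up function
    names: dropping the last `yield` during a refactor silently turns a
    generator function into an ordinary function, while an `async def`
    stays a coroutine function even with no `await` inside.)

    def numbers():              # generator function: calling it returns a generator
        yield 1
        yield 2

    def numbers_refactored():   # no yield left, so now it just returns a list
        return [1, 2]

    async def fetch_value():    # still a coroutine function, even without await
        return 42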

    For that, it would have been easier to introduce one extra keyword like `ydef` or `gdef`, as opposed to a completely new subset of the language.

    I think that was discussed, but the benefits were ONLY that one point,
    with the downside that there'd be these confusing distinctions between "functions defined this way" and "functions defined that way" that
    only occasionally crop up. Not a good thing.

    Another reason is that you can have asynchronous generator functions,
    which use both await AND yield. It's not easy to create a two-layer yielding system on top of yield alone.
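
    (For illustration, a small async generator with hypothetical names, where
    `await` pauses on I/O and `yield` hands values to the consumer:)

    import asyncio

    async def ticker(n, delay=0.1):
        for i in range(n):
            await asyncio.sleep(delay)  # await point
            yield i                     # yield point

    async def main():
        async for value in ticker(3):
            print(value)

    asyncio.run(main())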

    Well, for that one could just create some sort of efficient standard iterator class, if one wants a generator that is not triggered by a coroutine loop, and set it to be the default constructor for generator comprehensions.

    I suppose so, but that seems a bit awkward. And "default constructor
    for generator comprehensions" isn't really a thing so I'm not sure how
    that would be implemented.

    As you seem to know this topic reasonably well, I hope you don't mind me explaining the thought process I am having.

    I am currently just trying to choose my toolkit and I am having big doubts about async/await frameworks. I have done some benchmarking and the `gevent` library is faster than all of them with one exception: asyncio with uvloop. Correct me if I am wrong, but
    the main selling point for all that hassle with coroutines is speed. If that is so, then the overhead that async/await libraries have introduced is hardly cutting it. So I was just wondering whether there were any reasons for not extending already existing
    keywords and functionality, which at least as it stands now have much lower overhead.

    Maybe you know of any good benchmarks/materials that could help me make sense of all this?


    Hmm, I'd say coroutines aren't really about speed, they're about
    throughput. Let's take a TCP socket server as an example (this could
    be a web server, a MUD server, an IRC server, anything else). You want
    a basic architecture like this:

    import socket

    def handle_client(sock):
        while True:
            line = get_line(sock)  # buffered read, blah blah
            if line == "quit": break
            sock.send(b"blah blah")
        sock.close()

    def run_server():
        mainsock = socket.socket()
        mainsock.bind(...)
        mainsock.listen(5)
        while True:
            sock, addr = mainsock.accept()  # accept() returns (socket, address)
            print("Got a connection!", sock.getpeername())
            handle_client(sock)

    (If I've messed something up, apologies, this is hastily whipped up
    from memory. Do not use this as a template for actual socket servers,
    it's probably buggy and definitely incomplete. Anyhow.)

    Now, obviously this version kinda sucks. It has to handle one client
    completely before going on to the next, which is utterly terrible for
    a long-running protocol like TELNET, and pretty poor even for HTTP
    0.9. So what are the options for concurrency?

    1) Spawn a subprocess for every client. Maximum overhead, maximum isolation.
    2) Spin off a new thread for every client. Much much less overhead but
    still notable.
    3) Run an event loop using select.select() or equivalent, managing
    everything manually.
    4) Use an asyncio event loop.

    Individual processes scale abysmally, although occasionally the
    isolation is of value (FTP servers come to mind). Threads are a lot
    better, and I've run plenty of threaded servers in the past, but
    you'll run into thread limits before you run into socket limits, so at
    some point, threads will restrict your throughput.
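
    (Roughly, option 2 looks like this: a sketch reusing the handle_client
    above, with an arbitrary port number.)

    import socket
    import threading

    def run_threaded_server(port=7777):
        mainsock = socket.socket()
        mainsock.bind(("", port))
        mainsock.listen(5)
        while True:
            sock, addr = mainsock.accept()
            print("Got a connection!", addr)
            # Each client gets its own thread; the accept loop keeps going.
            threading.Thread(target=handle_client, args=(sock,), daemon=True).start()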

    A lightweight event loop is by far the lowest overhead. This MIGHT be
    measured as higher performance, but it's usually more visible in
    throughput (for example, "100 transactions per second" vs "1000
    transactions per second"). However, doing this manually can be a bit
    of a pain, as you basically have to implement your own state machine.
    Not too bad for some types of server, but annoying for anything that's
    at all stateful.
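
    (A rough sketch of option 3: a select()-based loop where each client's
    state lives in a plain dict rather than on a call stack. Hypothetical,
    line-oriented, and only loosely matching the handle_client logic above.)

    import select
    import socket

    def run_select_server(port=7777):
        mainsock = socket.socket()
        mainsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        mainsock.bind(("", port))
        mainsock.listen(5)
        buffers = {}  # per-client receive buffers: the "state machine" state

        while True:
            readable, _, _ = select.select([mainsock, *buffers], [], [])
            for s in readable:
                if s is mainsock:
                    client, addr = s.accept()
                    buffers[client] = b""
                    continue
                data = s.recv(4096)
                if not data:            # client disconnected
                    s.close()
                    del buffers[s]
                    continue
                buffers[s] += data
                while b"\n" in buffers[s]:
                    line, buffers[s] = buffers[s].split(b"\n", 1)
                    if line.strip() == b"quit":
                        s.close()
                        del buffers[s]
                        break
                    s.send(b"blah blah\n")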

    So that's where asyncio comes in. It's approximately the same runtime
    overhead as an event loop, but with way WAY less coding overhead. In
    fact, the code looks basically the same as threaded or forking code -
    you just say "here, go run this thing in a separate task" instead of
    juggling all the separate state machines.
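
    (For comparison, a sketch of the same server on asyncio's streams API;
    each client connection becomes a task that the event loop juggles for
    you.)

    import asyncio

    async def handle_client(reader, writer):
        while True:
            line = (await reader.readline()).decode().strip()
            if not line or line == "quit":
                break
            writer.write(b"blah blah\n")
            await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def run_server():
        server = await asyncio.start_server(handle_client, port=7777)
        async with server:
            await server.serve_forever()

    asyncio.run(run_server())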

    "All that hassle" is, of course, a relative term. So my first question
    is: what are you comparing against? If you're comparing against
    running your own select.select() loop, it's way *less* hassle, but
    compared to using threads, I'd say it's pretty comparable - definitely
    some additional hassle (since you have to use nonblocking calls
    everywhere), but ideally, not too much structural change.

    In my opinion, asyncio is best compared against threads, although with
    upcoming Python versions, there may be an additional option to compare
    against, which is threads that run separate subinterpreters (see PEP
    554 and PEP 684 for details). In the ideal situation (both PEPs
    accepted and implemented, and each client getting a thread with a
    dedicated subinterpreter), this could sit nicely between threads and
    processes, giving an in-between level of isolation, performance,
    overhead, and throughput.

    ChrisA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)