• About the strategy of "work depth-first; steal breadth-first"..

    From Wisdom90@21:1/5 to All on Sun Dec 29 15:33:50 2019
    Hello..


    About the strategy of "work depth-first; steal breadth-first"..

    I have just read the following webpage:

    Why Too Many Threads Hurts Performance, and What to do About It

    https://www.codeguru.com/cpp/sample_chapter/article.php/c13533/Why-Too-Many-Threads-Hurts-Performance-and-What-to-do-About-It.htm

    Also I have just looked at the following interesting video about Go
    scheduler and Go concurrency:

    Dmitry Vyukov — Go scheduler: Implementing language with lightweight concurrency

    https://www.youtube.com/watch?v=-K11rY57K7k

    And i have just read the following webpage about the Threadpool of
    microsoft .NET 4.0:

    https://blogs.msdn.microsoft.com/jennifer/2009/06/26/work-stealing-in-net-4-0/


    And as you are noticing the first web link above is speaking about the
    strategy of "work depth-first; steal breadth-first" , but we have to be
    more smart because i think that this strategy, that is advantageous for
    cache locality, works best for recursive algorithms, because a thread is
    taking the first task and after that the algorithm is recursive, so it
    will put the childs tasks inside the local work-stealing queue, and the
    other threads will start to take from the work-stealing queue, so the
    work will be distributed correctly, but as you will notice that this
    strategy works best for recursive algorithms, but when you you
    iteratively start many tasks, i think we will have much more contention
    on the work-stealing queue and this is a weakness of this strategy,
    other than that when it is not a recursive algorithm and the threads are receiving from the global queue so there will be high contention on the
    global queue and this is not good. MIT's Cilk and Go scheduler and the Threadpool of Microsoft and Intel® C++ TBB are using this strategy of
    "work depth-first; steal breadth-first". And as you are noticing that
    they are giving more preference to cache locality than scalability.

    But in my following invention of a Threadpool that scales very well i am
    giving more preference to scalability than to cache locality:

    https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well

    Other than that when you are doing IO with my Threadpool, you can
    use asychronous IO by starting a dedicated thread to IO to be more
    efficient, or you can start another of my Threadpool and use it for
    tasks that uses IO, you can use the same method when threads of the my Threadpool are waiting or sleeping..

    Other than that for recursion and the stack overflow problem
    you can convert your function from a recursive to iterative
    to solve the problem of stack overflow.

    Other than that to be able to serve a great number of internet
    connections or TCP/IP socket connections you can use my Threadpool
    with my powerful Object oriented Stackful coroutines library for Delphi
    and FreePascal here:

    https://sites.google.com/site/scalable68/object-oriented-stackful-coroutines-library-for-delphi-and-freepascal



    Thank you,
    Amine Moulay Ramdane.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wisdom90@21:1/5 to All on Mon Jan 20 15:53:52 2020
    Hello,


    About the strategy of "work depth-first; steal breadth-first"..

    I have just read the following webpage:

    Why Too Many Threads Hurts Performance, and What to do About It

    https://www.codeguru.com/cpp/sample_chapter/article.php/c13533/Why-Too-Many-Threads-Hurts-Performance-and-What-to-do-About-It.htm

    Also I have just looked at the following interesting video about Go
    scheduler and Go concurrency:

    Dmitry Vyukov — Go scheduler: Implementing language with lightweight concurrency

    https://www.youtube.com/watch?v=-K11rY57K7k

    And i have just read the following webpage about the Threadpool of
    microsoft .NET 4.0:

    https://blogs.msdn.microsoft.com/jennifer/2009/06/26/work-stealing-in-net-4-0/


    And as you are noticing the first web link above is speaking about the
    strategy of "work depth-first; steal breadth-first" , but we have to be
    more smart because i think that this strategy, that is advantageous for
    cache locality, works best for recursive algorithms, because a thread is
    taking the first task and after that the algorithm is recursive, so it
    will put the childs tasks inside the local work-stealing queue, and the
    other threads will start to take from the work-stealing queue, so the
    work will be distributed correctly, but as you will notice that this
    strategy works best for recursive algorithms, but when you you
    iteratively start many tasks, i think we will have much more contention
    on the work-stealing queue and this is a weakness of this strategy,
    other than that when it is not a recursive algorithm and the threads are receiving from the global queue so there will be high contention on the
    global queue and this is not good. MIT's Cilk and Go scheduler and the Threadpool of Microsoft and Intel® C++ TBB are using this strategy of
    "work depth-first; steal breadth-first". And as you are noticing that
    they are giving more preference to cache locality than scalability.

    But in my following invention of a Threadpool that scales very well i am
    giving more preference to scalability than to cache locality:

    https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well

    Other than that when you are doing IO with my Threadpool, you can
    use asychronous IO by starting a dedicated thread to IO to be more
    efficient, or you can start another of my Threadpool and use it for
    tasks that uses IO, you can use the same method when threads of the my Threadpool are waiting or sleeping..

    Other than that for recursion and the stack overflow problem
    you can convert your function from a recursive to iterative
    to solve the problem of stack overflow.

    Other than that to be able to serve a great number of internet
    connections or TCP/IP socket connections you can use my Threadpool
    with my powerful Object oriented Stackful coroutines library for Delphi
    and FreePascal here:

    https://sites.google.com/site/scalable68/object-oriented-stackful-coroutines-library-for-delphi-and-freepascal



    Thank you,
    Amine Moulay Ramdane.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)