Hello,
Here is my new invention of a scalable algorithm, along with my other new inventions..
I have just read the following PhD paper about the invention that we
call counting networks, which they claim are better than software combining trees:
Counting Networks
http://people.csail.mit.edu/shanir/publications/AHS.pdf
And I have read the following PhD paper:
http://people.csail.mit.edu/shanir/publications/HLS.pdf
So, as you can notice, they say in the conclusion that:
"Software combining trees and counting networks which are the only
techniques we observed to be truly scalable"
But I have just found that this counting networks algorithm is not generally scalable, and I have the logical proof of it; this is why I have just come up with a new invention that enhances the counting networks algorithm so that it becomes generally scalable. So you have to be careful with the current counting networks algorithm, since it is not generally scalable.
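To make the discussion easier to follow, here is a minimal sketch in C of the balancer, the building block used in the counting-network papers cited above: a balancer is a toggle that sends successive tokens to its two output wires alternately. It only illustrates the published primitive (with a fetch-and-add toggle as one possible implementation); it is not my enhanced algorithm:
---
/* Minimal sketch of a counting-network balancer (illustrative, not my
   enhanced algorithm): each arriving token leaves on output wire 0 or 1
   alternately, so tokens are spread evenly over the two outputs.  A full
   counting network (e.g. the bitonic network of the cited paper) wires
   layers of such balancers together. */
#include <stdatomic.h>

typedef struct { atomic_uint toggle; } balancer;

/* Returns the output wire (0 or 1) that this token leaves on. */
static int balancer_traverse(balancer *b)
{
    /* fetch-and-increment modulo 2 implements the alternating toggle */
    return (int)(atomic_fetch_add_explicit(&b->toggle, 1u,
                                           memory_order_acq_rel) & 1u);
}
---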
More philosophy about my kind of work..
I have just written the following:
--
More philosophy about my way of doing things..
You have to know me more: I have just posted about Computer Science vs. Software Engineering, but I am not exactly Computer Science or Software Engineering, because I am an inventor of many scalable software algorithms and other algorithms, and I have invented some powerful software tools. So my way of doing things is to be innovative and creative and inventive, so I am like a PhD researcher, and I am writing some books about my inventions and about my powerful tools, etc.
--
I will give an example of how I am inventive and creative: I have just read the following book (and other books like it) by a PhD researcher about operational research and capacity planning, here it is:
Performance by Design: Computer Capacity Planning by Example
https://www.amazon.ca/Performance-Design-Computer-Capacity-Planning/dp/0130906735
So I have just found that the methodologies of those PhD researchers for the e-business service do not work, because they do their calculations for a given arrival rate that is statistically and empirically measured from the behavior of customers, but I think that this is not correct. So I am being inventive, and I have come up with a new methodology that fixes the arrival rate from the data by using a hyperexponential service distribution (and it is mathematical), since it is also good against Denial-of-Service (DoS) attacks, and I will write a powerful book about it that will teach my new methodology; I will also explain the mathematics behind it and I will sell it, and my new methodology will work for cloud computing and for computer servers.
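To give an idea of what a hyperexponential service distribution is (my methodology itself is not detailed here), here is a minimal sketch in C that samples from a two-phase hyperexponential (H2) distribution and computes its mean and squared coefficient of variation; the branch probability and the rates are made-up illustrative values, not numbers from my methodology:
---
/* Illustrative only: a two-phase hyperexponential (H2) service distribution,
   i.e. with probability p the service time is Exp(mu1), otherwise Exp(mu2).
   Such a mixture has a squared coefficient of variation greater than 1,
   which models highly variable service times. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static double h2_sample(double p, double mu1, double mu2)
{
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);   /* uniform in (0,1) */
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double rate = (u1 < p) ? mu1 : mu2;              /* pick the phase */
    return -log(u2) / rate;                          /* exponential sample */
}

int main(void)
{
    double p = 0.9, mu1 = 1.0, mu2 = 0.05;           /* made-up example values */
    double mean = p / mu1 + (1.0 - p) / mu2;         /* E[S] */
    double m2 = 2.0 * (p / (mu1 * mu1) + (1.0 - p) / (mu2 * mu2)); /* E[S^2] */
    double scv = m2 / (mean * mean) - 1.0;           /* Var[S] / E[S]^2 */
    printf("E[S] = %.3f  SCV = %.3f  one sample = %.3f\n",
           mean, scv, h2_sample(p, mu1, mu2));
    return 0;
}
---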
More about my new inventions of scalable algorithms..
And look at my powerful inventions below, LW_Fast_RWLockX and Fast_RWLockX, which are two powerful scalable RWLocks that are FIFO fair and starvation-free and costless on the reader side (that means no atomics and no fences on the reader side); they use sys_membarrier expedited on Linux and FlushProcessWriteBuffers() on Windows. If you look at the source code of my LW_Fast_RWLockX.pas and Fast_RWLockX.pas inside the zip file, you will notice that on Linux they call two functions, membarrier1() and membarrier2(): membarrier1() registers the process's intent to use MEMBARRIER_CMD_PRIVATE_EXPEDITED, and membarrier2() executes a memory barrier on each running thread belonging to the same process as the calling thread.
Read more here to understand:
https://man7.org/linux/man-pages/man2/membarrier.2.html
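Here is a minimal sketch in C (not the code from my zip file, which is in Object Pascal) of those two steps as documented in the membarrier(2) man page: one call that registers the private expedited command, and one call that issues a memory barrier on every running thread of the calling process:
---
/* Minimal sketch of the two membarrier steps described above (illustrative,
   not the code of LW_Fast_RWLockX.pas / Fast_RWLockX.pas). */
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>

static int do_membarrier(int cmd)
{
    return (int)syscall(__NR_membarrier, cmd, 0);
}

/* membarrier1(): register the process's intent to use the private
   expedited command (call once, before the first expedited barrier). */
static int membarrier1(void)
{
    return do_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED);
}

/* membarrier2(): execute a memory barrier on each running thread belonging
   to the same process as the calling thread. */
static int membarrier2(void)
{
    return do_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED);
}

int main(void)
{
    if (membarrier1() != 0) { perror("membarrier register"); return 1; }
    if (membarrier2() != 0) { perror("membarrier expedited"); return 1; }
    puts("private expedited membarrier executed");
    return 0;
}
---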
Here are my new powerful inventions of scalable algorithms..
I have just updated my powerful inventions LW_Fast_RWLockX and Fast_RWLockX, the two scalable RWLocks described above (FIFO fair, starvation-free, costless on the reader side, using sys_membarrier expedited on Linux and FlushProcessWriteBuffers() on Windows), and now they work on both Linux and Windows. I think my inventions are really smart, since the following PhD researcher says:
"Until today, there is no known efficient reader-writer lock with starvation-freedom guarantees;"
Read more here:
http://concurrencyfreaks.blogspot.com/2019/04/onefile-and-tail-latency.html
So, as you have just noticed, he says that until today there is no known efficient reader-writer lock with starvation-freedom guarantees.
So I think that my above powerful inventions of scalable reader-writer locks are efficient and FIFO fair and starvation-free.
LW_Fast_RWLockX is a lightweight scalable reader-writer mutex that uses a technique that looks like a Seqlock, but without looping on the reader side like a Seqlock, and this has permitted the reader side to be costless; it is fair and it is of course starvation-free, and it does spin-wait. Fast_RWLockX is also a lightweight scalable reader-writer mutex that uses the same Seqlock-like technique without looping on the reader side, and this has permitted the reader side to be costless; it is fair and it is of course starvation-free, but it does not spin-wait: it waits on my SemaMonitor, so it is energy efficient.
You can read about them and download them from my website here:
https://sites.google.com/site/scalable68/scalable-rwlock
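To show the general shape of this kind of asymmetric reader-writer lock, here is a minimal sketch in C: readers use only plain stores and loads (no atomic read-modify-write, no fences), and the writer compensates with a process-wide barrier (sys_membarrier expedited on Linux, FlushProcessWriteBuffers() on Windows). It is only an illustration of the general asymmetric idea under simplifying assumptions (a fixed number of reader slots, a plain pthread mutex on the writer path, spin-waiting); it is not my actual LW_Fast_RWLockX/Fast_RWLockX algorithm:
---
/* Illustrative asymmetric RWLock sketch: costless reader fast path, heavy
   writer path.  Not the author's algorithm; simplifying assumptions: fixed
   reader slots, spin-waiting, pthread mutex to serialize writers. */
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <pthread.h>

#define MAX_READERS 64
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static volatile int reader_active[MAX_READERS]; /* one (ideally padded) slot per reader */
static volatile int writer_present;
static pthread_mutex_t writer_mutex = PTHREAD_MUTEX_INITIALIZER;

static void process_wide_barrier(void)
{
    /* Forces a memory barrier on every running thread of this process. */
    syscall(__NR_membarrier, MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
}

void rwlock_init(void)
{
    /* Register once for the private expedited membarrier command. */
    syscall(__NR_membarrier, MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0);
}

void read_lock(int slot)
{
    for (;;) {
        reader_active[slot] = 1;          /* plain store, no fence */
        compiler_barrier();
        if (!writer_present)              /* plain load, no fence */
            return;                       /* costless fast path */
        reader_active[slot] = 0;          /* a writer is in: back off */
        compiler_barrier();
        while (writer_present) { }        /* spin (a real lock would park) */
    }
}

void read_unlock(int slot)
{
    compiler_barrier();
    reader_active[slot] = 0;              /* plain store, no fence */
}

void write_lock(void)
{
    pthread_mutex_lock(&writer_mutex);    /* serialize writers */
    writer_present = 1;
    process_wide_barrier();               /* after this, each reader has either
                                             published its slot or will see
                                             writer_present = 1 */
    for (int i = 0; i < MAX_READERS; i++)
        while (reader_active[i]) { }      /* wait for in-flight readers */
}

void write_unlock(void)
{
    writer_present = 0;
    pthread_mutex_unlock(&writer_mutex);
}
---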
About the Linux sys_membarrier() expedited and the Windows FlushProcessWriteBuffers()..
I have just read the following webpage:
https://lwn.net/Articles/636878/
And it is interesting and it says:
---
Results in liburcu:
Operations in 10s, 6 readers, 2 writers:
memory barriers in reader: 1701557485 reads, 3129842 writes
signal-based scheme: 9825306874 reads, 5386 writes
sys_membarrier expedited: 6637539697 reads, 852129 writes
sys_membarrier non-expedited: 7992076602 reads, 220 writes
---
Look at how "sys_membarrier expedited" is powerful.
Cache-coherency protocols do not use IPIs, and as a user-space developer you do not care about IPIs at all; one is mostly interested in the cost of cache-coherency itself. However, the Win32 API provides a function, FlushProcessWriteBuffers(), that issues IPIs to all processors in the affinity mask of the current process. You can use it to investigate the cost of IPIs.
When I ran a simple synthetic test on a dual-core machine, I obtained the following numbers:
420 cycles is the minimum cost of the FlushProcessWriteBuffers() function on the issuing core.
1600 cycles is the mean cost of the FlushProcessWriteBuffers() function on the issuing core.
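Here is a minimal sketch in C of how such a measurement can be done on Windows with the time-stamp counter; it illustrates the kind of synthetic test described above, not the exact program that produced those numbers:
---
/* Illustrative measurement of the issuing-core cost of
   FlushProcessWriteBuffers(), using the time-stamp counter. */
#include <windows.h>
#include <intrin.h>
#include <stdio.h>

int main(void)
{
    const int iters = 100000;
    unsigned long long min_cycles = ~0ULL, total = 0;

    for (int i = 0; i < iters; i++) {
        unsigned long long t0 = __rdtsc();
        FlushProcessWriteBuffers();      /* IPIs every processor in the
                                            process's affinity mask */
        unsigned long long t1 = __rdtsc();
        unsigned long long d = t1 - t0;
        total += d;
        if (d < min_cycles) min_cycles = d;
    }
    printf("min cost: %llu cycles, mean cost: %llu cycles\n",
           min_cycles, total / iters);
    return 0;
}
---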