Hello..
About the full paper on the Swarm chip..
I have just read the following full paper from MIT about the new
Swarm chip:
https://people.csail.mit.edu/sanchez/papers/2015.swarm.micro.pdf
I think this chip has disadvantages. First, it uses the same mechanisms
as transactional memory, and those mechanisms are not very efficient
(read below to know more). We also already have Intel hardware
transactional memory, and I don't think Swarm is faster overall at
parallelism than current hardware and software; look at what the paper
says about its benchmarks and you will understand more.
And about transactional memory, read my following thoughts:
About Hardware Transactional Memory and my invention that is my powerful
Fast Mutex:
"As someone who has used TSX to optimize synchronization primitives, you
can expect to see a ~15-20% performance increase, if (big if) your
program is heavy on disjoint data access, i.e. a lock is needed for correctness, but conflicts are rare in practice. If you have a lot of
threads frequently writing the same cache lines, you are probably going
to see worse performance with TSX as opposed to traditional locking. It
helps to think about TSX as transparently performing optimistic
concurrency control, which is actually pretty much how it is implemented
under the hood."
Read more here:
https://news.ycombinator.com/item?id=8169697
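To make the "optimistic concurrency control" idea in the quote concrete, here is a minimal software sketch in Python. It is only an analogy, not how TSX is implemented in hardware: the names VersionedCell and optimistic_update, the version counter and the retry limit are all assumptions made for illustration. The work is done without holding the lock, the commit is validated against a version number, and after too many conflicts the code falls back to ordinary locking.

```python
import threading


class VersionedCell:
    """A value plus a version counter that is bumped on every commit."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # held only for very short steps

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, expected_version, new_value):
        """Commit only if nobody else committed since our read."""
        with self._lock:
            if self._version != expected_version:
                return False  # conflict: the caller must retry
            self._value = new_value
            self._version += 1
            return True

    def locked_update(self, compute):
        """Pessimistic fallback: compute while holding the lock."""
        with self._lock:
            self._value = compute(self._value)
            self._version += 1


def optimistic_update(cell, compute, max_retries=3):
    """Return the number of conflicts seen before the update went through."""
    for conflicts in range(max_retries):
        value, version = cell.read()
        new_value = compute(value)  # done outside the lock
        if cell.try_commit(version, new_value):
            return conflicts
    cell.locked_update(compute)  # too much contention: take the lock
    return max_retries


cell = VersionedCell(0)
workers = [
    threading.Thread(
        target=lambda: [optimistic_update(cell, lambda v: v + 1) for _ in range(1000)]
    )
    for _ in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(cell.read()[0])  # prints 4000: no increment is lost
```

When threads rarely touch the same cell, almost every commit succeeds at the first attempt. When they all write the same cell, retries pile up and the fallback path dominates, which matches the contended case the quote warns about.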
So as you can see, HTM (hardware transactional memory) and TM cannot
replace locks when doing I/O or for highly contended critical
sections. This is why I have invented my following powerful Fast Mutex:
More about research and software development..
I have just looked at the following new video:
Why is coding so hard...
https://www.youtube.com/watch?v=TAAXwrgd1U8
I understand this video, but I have to explain my work:
I am not like the techlead in the video above, because I am also an
"inventor" who has invented many scalable algorithms and their
implementations. I am also inventing effective abstractions. Here is
an example:
Read the following post by the senior research scientist Dave
Dice:
Preemption tolerant MCS locks
https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks
As you can see, he is trying to invent a new lock that is preemption
tolerant, but his lock lacks some important characteristics. This is why
I have just invented a new Fast Mutex that is adaptive and that is
much better. I think mine is the "best", and I think you will
not find it anywhere. My new Fast Mutex has the following
characteristics:
1- Starvation-free
2- Good fairness
3- It keeps cache-coherence traffic very low
4- Very good fast path performance (it has the same performance as the
scalable MCS lock when there is contention)
5- A decent preemption tolerance
This is how I am an "inventor". I have also invented other scalable
algorithms, such as scalable reference counting with efficient support
for weak references, a fully scalable Threadpool, and a fully scalable
FIFO queue, along with their implementations, and I think I will sell
some of them to Microsoft, Google, Embarcadero or similar software
companies.
And about composability of lock-based systems now:
Design your systems to be composable. Among the more galling claims of
the detractors of lock-based systems is the notion that they are somehow uncomposable:
“Locks and condition variables do not support modular programming,”
reads one typically brazen claim, “building large programs by gluing
together smaller programs[:] locks make this impossible.” The claim, of
course, is incorrect. For evidence one need only point at the
composition of lock-based systems such as databases and operating
systems into larger systems that remain entirely unaware of lower-level locking.
There are two ways to make lock-based systems completely composable, and
each has its own place. First (and most obviously), one can make locking entirely internal to the subsystem. For example, in concurrent operating systems, control never returns to user level with in-kernel locks held;
the locks used to implement the system itself are entirely behind the
system call interface that constitutes the interface to the system. More generally, this model can work whenever a crisp interface exists between software components: as long as control flow is never returned to the
caller with locks held, the subsystem will remain composable.
Second (and perhaps counterintuitively), one can achieve concurrency and composability by having no locks whatsoever. In this case, there must be
no global subsystem state—subsystem state must be captured in
per-instance state, and it must be up to consumers of the subsystem to
assure that they do not access their instance in parallel. By leaving
locking up to the client of the subsystem, the subsystem itself can be
used concurrently by different subsystems and in different contexts. A
concrete example of this is the AVL tree implementation used extensively
in the Solaris kernel. As with any balanced binary tree, the
implementation is sufficiently complex to merit componentization, but by
not having any global state, the implementation may be used concurrently
by disjoint subsystems—the only constraint is that manipulation of a
single AVL tree instance must be serialized.
Read more here:
https://queue.acm.org/detail.cfm?id=1454462
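The two composition styles described in the quote can be sketched as follows in Python. This is only an illustration under stated assumptions: HitCounter, SortedBag and Scheduler are hypothetical names, and SortedBag merely stands in for a component like the AVL tree mentioned above. HitCounter keeps its locking entirely internal and never returns with the lock held; SortedBag has no locks and no global state, so each client serializes access to its own instance.

```python
import bisect
import threading


class HitCounter:
    """Style 1: locking is internal; no method returns with the lock held."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = {}

    def record(self, key):
        with self._lock:
            self._hits[key] = self._hits.get(key, 0) + 1

    def snapshot(self):
        with self._lock:
            return dict(self._hits)  # a copy; the lock is released on return


class SortedBag:
    """Style 2: no locks and no global state; the caller must serialize
    access to a given instance."""

    def __init__(self):
        self._items = []

    def insert(self, item):
        bisect.insort(self._items, item)

    def items(self):
        return list(self._items)


class Scheduler:
    """A client that composes both components under its own locking policy."""

    def __init__(self):
        self._lock = threading.Lock()  # protects this instance's SortedBag
        self._deadlines = SortedBag()
        self.stats = HitCounter()

    def add(self, deadline):
        with self._lock:
            self._deadlines.insert(deadline)
        self.stats.record("add")  # called after releasing our own lock

    def pending(self):
        with self._lock:
            return self._deadlines.items()


scheduler = Scheduler()
threads = [threading.Thread(target=scheduler.add, args=(d,)) for d in (5, 1, 4, 2, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(scheduler.pending())
```

Scheduler never needs to know how HitCounter locks internally, and another subsystem could use its own SortedBag with a different lock, or with no lock at all if it is single-threaded.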
About deadlocks and race conditions in parallel programming..
I have just read the following paper:
Deadlock Avoidance in Parallel Programs with Futures
https://cogumbreiro.github.io/assets/cogumbreiro-gorn.pdf
So as you can see, you can get deadlocks in parallel programming
by introducing circular dependencies among tasks waiting on future
values, or among tasks waiting on Windows event objects or similar
synchronization objects. So you need a general tool that detects
deadlocks. But notice that the Valgrind tool for C++ (its Helgrind
checker) can only detect deadlocks that come from Pthreads locks. Read
the following to see it:
http://valgrind.org/docs/manual/hg-manual.html#hg-manual.lock-orders
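To make the circular-dependency case concrete, here is a minimal sketch in Python of a "waits-for" graph check. It is not what Valgrind or the paper above implements; the function name find_cycle and the task names are assumptions made for illustration. Each node is a task (or the owner of a future, event or lock), and an edge A -> B means "A is blocked until B makes progress". A cycle in this graph is a deadlock, whatever kind of object the tasks are waiting on.

```python
def find_cycle(waits_for):
    """Return one cycle as a list of nodes (first node repeated at the end),
    or None if the waits-for graph has no cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is already on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Task A waits on a future produced by B, B on one produced by C, C on A.
print(find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))  # ['A', 'B', 'C', 'A']
# No circular wait: B can finish, then A.
print(find_cycle({"A": ["B"], "B": []}))  # None
```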
Detecting only lock-order deadlocks is not good enough. You need a
general way to detect deadlocks on locks and mutexes, and also
deadlocks that come from circular dependencies among tasks waiting on
future values, on Windows event objects, or on similar synchronization
objects.
This is why I have talked before about this general way of detecting
deadlocks. Here it is, read my following thoughts:
Yet more precision about the invariants of a system..
I was just thinking about Petri nets, and I have studied them more.
They are useful for parallel programming, and what I have noticed by
studying them is that there are two methods to prove that there is no
deadlock in a system: structural analysis with place invariants, which
you have to find mathematically, or the reachability tree. But notice
that the structural analysis of Petri nets teaches you more, because it
lets you prove that there is no deadlock in the system, and the place
invariants are calculated mathematically from the following system for
the given Petri net:
Transpose(vector) * Incidence matrix = 0
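As a worked example of the system above, here is a minimal sketch in Python that finds a basis of place invariants with exact Gaussian elimination. The three-place net (idle, critical, mutex), its incidence matrix and the function name left_null_space are assumptions made for illustration; they are not taken from the tutorial or from Tina.

```python
from fractions import Fraction


def left_null_space(incidence):
    """Return a basis of all vectors x with Transpose(x) * incidence = 0.

    incidence[p][t] is the token change of place p when transition t fires.
    """
    n_places = len(incidence)
    n_trans = len(incidence[0])
    # One equation per transition: rows of A are the columns of the matrix.
    a = [[Fraction(incidence[p][t]) for p in range(n_places)] for t in range(n_trans)]
    pivots = []
    row = 0
    for col in range(n_places):
        if row == n_trans:
            break
        pivot = next((r for r in range(row, n_trans) if a[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: this place is a free variable
        a[row], a[pivot] = a[pivot], a[row]
        pivot_value = a[row][col]
        a[row] = [v / pivot_value for v in a[row]]
        for r in range(n_trans):
            if r != row and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[row])]
        pivots.append(col)
        row += 1
    free = [c for c in range(n_places) if c not in pivots]
    basis = []
    for f in free:
        x = [Fraction(0)] * n_places
        x[f] = Fraction(1)
        for r, p in enumerate(pivots):
            x[p] = -a[r][f]
        basis.append(x)
    return basis


# Places (rows): idle, critical, mutex. Transitions (columns): enter, leave.
C = [
    [-1, 1],   # idle:     enter takes a token, leave puts it back
    [1, -1],   # critical: enter adds a token, leave removes it
    [-1, 1],   # mutex:    taken on enter, released on leave
]
basis = left_null_space(C)
print([[str(v) for v in x] for x in basis])  # [['1', '1', '0'], ['-1', '0', '1']]
```

The first vector says that idle + critical stays constant in every reachable marking. Adding the two vectors gives (0, 1, 1), so critical + mutex is constant too, which is the mutual exclusion property. Gaussian elimination can return vectors with negative weights, as here; the Farkas algorithm is the one that yields non-negative invariants directly.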
You apply Gaussian elimination or the Farkas algorithm to
the incidence matrix to find the place invariants. As you will
notice, those place invariant calculations look like Markov chains
in mathematics, with their vector of probabilities and their
transition matrix of probabilities. Using Markov chains, you can
mathematically calculate where the vector of probabilities
will "stabilize", which gives you very important information, and
you can do it by solving the following mathematical system:
Unknown vector1 of probabilities * transition matrix of probabilities =
Unknown vector1 of probabilities.
Solving this system of equations is very important in economics and
other fields, and you can notice that it is like calculating
invariants, because the invariant in the system above is the
vector1 of probabilities that is obtained. This invariant,
like the invariants of the structural analysis of Petri nets,
gives you very important information about the system, such as where
market shares will stabilize, which is calculated this way in economics.
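The Markov chain system above (vector * transition matrix = vector) can be illustrated with a small sketch in Python. The two-brand market and its probabilities are invented numbers for illustration only, and stationary_distribution is a hypothetical helper name. The sketch uses simple power iteration, which converges for a chain like this one, instead of solving the linear system directly.

```python
def stationary_distribution(p, iterations=1000, tol=1e-12):
    """Repeat pi <- pi * P until pi stops changing; p must be row-stochastic."""
    n = len(p)
    pi = [1.0 / n] * n  # start from equal shares
    for _ in range(iterations):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Row i gives where customers of brand i go next month (hypothetical data):
# brand 0 keeps 90% of its customers, brand 1 keeps 50%.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
shares = stationary_distribution(P)
print([round(s, 4) for s in shares])  # [0.8333, 0.1667]
```

The result can be checked by hand: at the fixed point the flow from brand 0 to brand 1 equals the flow back, 0.1 * pi0 = 0.5 * pi1, so pi0 = 5/6 and pi1 = 1/6. Multiplying this vector by P gives the same vector again, which is the sense in which it is an invariant of the system.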
About reachability analysis of a Petri net..
As you have noticed in my Petri nets tutorial example (read below),
I am analysing the liveness of the Petri net, because there is a rule
that says:
If a Petri net is live, then it is deadlock-free.
Reachability analysis of a Petri net with Tina gives you the necessary
information about the boundedness and liveness of the Petri net. So if
it reports that the Petri net is "live", there is no deadlock in it.
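To show what a reachability analysis looks for, here is a minimal sketch in Python that explores every reachable marking of a small bounded Petri net and reports the dead markings, that is, the markings where no transition is enabled. It does not use Tina and does not reproduce its algorithms; the net (two threads taking locks A and B), the place numbering and the function names are assumptions made for illustration. It checks deadlock-freedom only, which is weaker than full liveness.

```python
from collections import deque


def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[p] >= n for p, n in pre.items())


def fire(marking, pre, post):
    """Consume the input tokens and produce the output tokens."""
    m = list(marking)
    for p, n in pre.items():
        m[p] -= n
    for p, n in post.items():
        m[p] += n
    return tuple(m)


def dead_markings(initial, transitions, limit=100_000):
    """Breadth-first search of the reachability graph of a bounded net."""
    seen = {initial}
    queue = deque([initial])
    dead = []
    while queue:
        m = queue.popleft()
        successors = [fire(m, pre, post) for pre, post in transitions if enabled(m, pre)]
        if not successors:
            dead.append(m)  # nothing can fire here: a deadlock
        for s in successors:
            if s not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space too large (net may be unbounded)")
                seen.add(s)
                queue.append(s)
    return dead


# Places: 0 = A free, 1 = B free, 2 = T1 idle, 3 = T1 holds its first lock,
# 4 = T1 holds both, 5 = T2 idle, 6 = T2 holds its first lock, 7 = T2 holds both.
INITIAL = (1, 1, 1, 0, 0, 1, 0, 0)
# Each transition is a (pre, post) pair of {place: tokens} dictionaries.
T1 = [({0: 1, 2: 1}, {3: 1}), ({1: 1, 3: 1}, {4: 1}), ({4: 1}, {0: 1, 1: 1, 2: 1})]
T2_OPPOSITE = [({1: 1, 5: 1}, {6: 1}), ({0: 1, 6: 1}, {7: 1}), ({7: 1}, {0: 1, 1: 1, 5: 1})]
T2_SAME = [({0: 1, 5: 1}, {6: 1}), ({1: 1, 6: 1}, {7: 1}), ({7: 1}, {0: 1, 1: 1, 5: 1})]

print(dead_markings(INITIAL, T1 + T2_OPPOSITE))  # [(0, 0, 0, 1, 0, 0, 1, 0)]
print(dead_markings(INITIAL, T1 + T2_SAME))      # []
```

With opposite lock orders the search finds the marking where T1 holds A and T2 holds B. When both threads take A before B, no dead marking is reachable. The same model works for semaphores or event objects, since they are just places with tokens.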
Tina and Partial order reduction techniques..
With the advancement of computer technology, highly concurrent systems
are being developed. The verification of such systems is a challenging
task, as their state space grows exponentially with the number of
processes. Partial order reduction is an effective technique to address
this problem. It relies on the observation that the effect of executing transitions concurrently is often independent of their ordering.
Tina uses “partial-order” reduction techniques aimed at preventing
combinatorial explosion. Read more here:
http://projects.laas.fr/tina/papers/qest06.pdf
About modeling and detection of race conditions and deadlocks
in parallel programming..
I have just taken a further look at the following Delphi project
called DelphiConcurrent, by an engineer from France called Moualek Adlene:
https://github.com/moualek-adlene/DelphiConcurrent/blob/master/DelphiConcurrent.pas
And I have just taken a look at the following webpage of Dr. Dobb's Journal:
Detecting Deadlocks in C++ Using a Locks Monitor
https://www.drdobbs.com/detecting-deadlocks-in-c-using-a-locks-m/184416644
I think that both of them use techniques that are not as good
as analysing deadlocks with Petri nets in parallel applications.
For example, the above two methods only address locks, mutexes
and reader-writer locks; they do not address semaphores,
event objects and other such synchronization objects, so they
are not general enough. This is why I have written a tutorial that
shows my methodology of analysing and detecting deadlocks in parallel
applications with Petri nets. My methodology is more sophisticated
because it is a generalization: it models with Petri nets the
broader range of synchronization objects, and I will soon add other
synchronization objects to my tutorial. You have to look at it, here it is:
https://sites.google.com/site/scalable68/how-to-analyse-parallel-applications-with-petri-nets
You have to get the powerful Tina software to run the Petri net examples
in my tutorial. Here is the powerful Tina software:
http://projects.laas.fr/tina/
Also, to detect race conditions in parallel programming, you have to take
a look at the following new tutorial that uses the powerful Spin tool:
https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html
This is how you will become much more professional at detecting deadlocks
and race conditions in parallel programming.
About Java and Delphi and Freepascal..
I have just read the following webpage:
Java is not a safe language
https://lemire.me/blog/2019/03/28/java-is-not-a-safe-language/
But as you have noticed, the webpage says:
- Java does not trap overflows
But Delphi and Freepascal do trap overflows when overflow checking
is enabled ({$Q+}).
And the webpage says:
- Java lacks null safety
But Delphi and FreePascal can have null safety with my MyNullable
library, as I have just posted about it by saying the following:
Java lacks null safety. When a function receives an object, this object
might be null. That is, if you see ‘String s’ in your code, you often
have no way of knowing whether ‘s’ contains an actually String unless
you check at runtime. Can you guess whether programmers always check?
They do not, of course. In practice, mission-critical software does
crash without warning due to null values. We have two decades of
examples. In Swift or Kotlin, you have safe calls or optionals as part
of the language.
Here is my MyNullable library for Delphi and FreePascal that brings null
safety. You can read the HTML file inside the zip to learn how it works,
and you can download it from my website here:
https://sites.google.com/site/scalable68/null-safety-library-for-delphi-and-freepascal
And the webpage says:
- Java allows data races
But for Delphi and Freepascal, I have already written about how to
detect and prevent deadlocks and data races with Petri nets, Tina and
Spin; see my thoughts above, from "Yet more precision about the
invariants of a system.." down to the Spin tutorial.
And about memory safety in Delphi and Freepascal, here is what I said:
I have just read the following webpage about memory safety:
Microsoft: 70 percent of all security bugs are memory safety issues
https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/
And it says:
"Users who often read vulnerability reports come across terms over and
over again. Terms like buffer overflow, race condition, page fault, null pointer, stack exhaustion, heap exhaustion/corruption, use after free,
or double free --all describe memory safety vulnerabilities."
So as you will notice below, the following memory safety problems
have been solved in Delphi.
I have also just read the following webpage about "Fearless Security:
Memory safety":
https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/
Here are the memory safety problems:
1- Misusing Free (use-after-free, double free)
I have solved this in Delphi and Freepascal by inventing a "Scalable"
reference counting with efficient support for weak references. Read
below about it.
2- Uninitialized variables
This can be detected by the compilers of Delphi and Freepascal.
3- Dereferencing Null pointers
I have solved this in Delphi and Freepascal by inventing a "Scalable"
reference counting with efficient support for weak references. Read
below about it.
4- Buffer overflow and underflow
This has been solved in Delphi by using madExcept, read here about it:
http://help.madshi.net/DebugMm.htm
You can buy it from here:
http://www.madshi.net/
There also remains the stack exhaustion memory safety problem,
and here is how to detect it in Delphi:
Call the function "DoStackOverflow" below once from your code and you
will get the EStackOverflow error raised by Delphi with the message
"stack overflow". You can print the line of the source code where
EStackOverflow is raised with JCLDebug and similar tools:
----
// Recurses without a base case, so the stack is eventually exhausted.
function DoStackOverflow: Integer;
begin
  Result := 1 + DoStackOverflow;
end;
----
About my scalable algorithms inventions..
I am a white Arab, and I am a gentleman type of person,
and I think that you also know me by the poetry that I wrote
in front of you and posted here. But I am
also a more serious computer developer, and I am also
an inventor who has invented many scalable algorithms; read about
them in my writing below:
Here is my latest scalable algorithm invention; read
what I have just answered in comp.programming.threads:
About my LRU scalable algorithm..
On 10/16/2019 7:48 AM, Bonita Montero on comp.programming.threads wrote:
Amine, a quest for you:
Database-servers and operating-system-kernels mostly use LRU as
the scheme to evict old buffers from their cache. One issue with
LRU is, that an LRU-structure can't be updated by multiple threads simultaneously. So you have to have a global lock.
I managed to write a LRU-caching-class that can update the links
in the LRU-list to put the most recent fetched block to the h