Forum: >>> Magnum BBS <<<

More of my philosophy about latency and contention and concurrency and

From Amine Moulay Ramdane@21:1/5 to All on Sat Jul 23 13:39:08 2022

Hello,

More of my philosophy about latency and contention and concurrency
and parallelism and more of my thoughts..

I am a white Arab from Morocco, and I think I am smart since I have also invented many scalable algorithms and other algorithms.

I think I am highly smart, and I have just posted (read it below) about two new inventions that could make logic gates thousands of times or even a million times faster than those in existing computers. But I think that there is still a problem with those inventions, and it is about latency and concurrency: to hide memory latency you need concurrency, and so you need preemptive or non-preemptive scheduling of coroutines.

HBM has a latency of 106.7 ns and DDR4 has a latency of 73.3 ns, and the AMD 3D V-Cache also has almost the same cost in latency, so as you notice, this kind of latency is still costly. It is waiting time that looks like the time wasted on contention in parallelism, so by logical analogy this kind of latency acts like contention that reduces scalability, and I think that is why those inventions have this kind of limit or constraint.
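
To make the cost concrete, here is a minimal Python sketch of the arithmetic, using the latencies quoted above. The function name stall_cycles and the two clock rates (3 GHz for a current CPU, 1 PHz for lightwave gates) are assumptions chosen only for illustration.

```python
def stall_cycles(latency_ns: float, clock_hz: float) -> float:
    """Clock cycles that elapse while one memory access is outstanding."""
    return latency_ns * 1e-9 * clock_hz


LATENCIES_NS = {"DDR4": 73.3, "HBM": 106.7}          # figures quoted in the post
CLOCKS_HZ = {"3 GHz CPU": 3e9, "1 PHz gates": 1e15}  # assumed clock rates

for memory, latency in LATENCIES_NS.items():
    for name, hz in CLOCKS_HZ.items():
        cycles = stall_cycles(latency, hz)
        print(f"{memory} at {name}: about {cycles:,.0f} cycles per access")
```

Under these assumptions a DDR4 access costs about 220 cycles at 3 GHz but tens of millions of cycles at 1 PHz, so the faster the gates, the more concurrency is needed to keep them busy while memory responds.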

And I invite you to read my following smart thoughts about preemptive and non-preemptive timesharing:

https://groups.google.com/g/alt.culture.morocco/c/JuC4jar661w

More of my philosophy about Fastest-ever logic gates and more of my thoughts..

"Logic gates are the fundamental building blocks of computers, and researchers at the University of Rochester have now developed the fastest ones ever created. By zapping graphene and gold with laser pulses, the new logic gates are a million times faster
than those in existing computers, demonstrating the viability of “lightwave electronics.”. If these kinds of lightwave electronic devices ever do make it to market, they could be millions of times faster than today’s computers. Currently we measure
processing speeds in Gigahertz (GHz), but these new logic gates function on the scale of Petahertz (PHz). Previous studies have set that as the absolute quantum limit of how fast light-based computer systems could possibly get."

Read more here:

https://newatlas.com/electronics/fastest-ever-logic-gates-computers-million-times-faster-petahertz/

Read my following news:

And with the following new discovery computers and phones could run thousands of times faster..

Prof Alan Dalton in the School of Mathematical and Physical Sciences at the University of Sussex, said:

"We're mechanically creating kinks in a layer of graphene. It's a bit like nano-origami.

"Using these nanomaterials will make our computer chips smaller and faster. It is absolutely critical that this happens as computer manufacturers are now at the limit of what they can do with traditional semiconducting technology. Ultimately, this will
make our computers and phones thousands of times faster in the future.

"This kind of technology -- "straintronics" using nanomaterials as opposed to electronics -- allows space for more chips inside any device. Everything we want to do with computers -- to speed them up -- can be done by crinkling graphene like this."

Dr Manoj Tripathi, Research Fellow in Nano-structured Materials at the University of Sussex and lead author on the paper, said:

"Instead of having to add foreign materials into a device, we've shown we can create structures from graphene and other 2D materials simply by adding deliberate kinks into the structure. By making this sort of corrugation we can create a smart electronic
component, like a transistor, or a logic gate."

The development is a greener, more sustainable technology. Because no additional materials need to be added, and because this process works at room temperature rather than high temperature, it uses less energy to create.

Read more here:

https://www.sciencedaily.com/releases/2021/02/210216100141.htm

But I think that mass production of graphene still hasn't quite begun, so I think that the inventions above, the fastest-ever logic gates that use graphene and the nanomaterial components that use graphene, will not be fully commercialized until perhaps around 2035 or 2040. Read the following to understand why:

"Because large-scale mass production of graphene still hasn't quite begun , the market is a bit limited. However, that leaves a lot of room open for investors to get in before it reaches commercialization.

The market was worth $78.7 million in 2019 and, according to Grand View Research, is expected to rise drastically to $1.08 billion by 2027.

North America currently has the bulk of market share, but the Asia-Pacific area is expected to have the quickest growth in adoption of graphene uses in coming years. North America and Europe are also expected to have above-market average growth.

The biggest driver of all this growth is expected to be the push for cleaner, more efficient energy sources and the global reduction of emissions in the air."

Read more here:

https://www.energyandcapital.com/report/the-worlds-next-rare-earth-metal/1600

More of my philosophy about Artificial intelligence and more of my thoughts..

I think I am highly smart, and I think that we will attain a high-level artificial intelligence in 2029. So my prediction is that by 2029 artificial intelligence will become much more powerful and will permit us to greatly enhance our human lives, and I also think that scaling up a giant transformer trained on sequences of tokenized inputs will produce a much more powerful artificial intelligence. You can read more about it in my thoughts in the following web link:

And you can read my thoughts about artificial intelligence and productivity and about China and its artificial intelligence and computer chips in the following web link:

https://groups.google.com/g/alt.culture.morocco/c/UOt_4qTgN8M

And you can read my thoughts about the next industrial revolution and about Exascale supercomputers and more in the following web link:

https://groups.google.com/g/alt.culture.morocco/c/hT6faP8cndE

And you can read my following thoughts about 3D stacking in CPUs and about EUV (extreme ultra violet) and about scalability and more in the
following web link:

https://groups.google.com/g/alt.culture.morocco/c/USMMhMB9WIE

And you can read my following thoughts about Nanotechnology and about Exponential Progress in the following web link:

https://groups.google.com/g/alt.culture.morocco/c/mjE_2AG1TKQ

And the US plans exascale supercomputers 5-10x more powerful than Frontier and targets 100 exaflops after 2030, and China is believed to have secretly set up the world's first exascale supercomputer in 2021, soon following it up with a second system. As many as 10 exascale supercomputers are thought to be in development in the country.

Read more here:

https://www.datacenterdynamics.com/en/news/us-plans-exascale-supercomputers-5-10x-more-powerful-than-frontier/

Exascale supercomputers will also make it possible to construct an accurate map of the brain, which in turn helps to "reverse" engineer or understand the brain. Read the following to notice it:

“If we don’t improve today’s technology, the compute time for a whole mouse brain would be something like 1,000,000 days of work on current supercomputers. Using all of Aurora 2 exaFLOPS supercomputer, if everything worked beautifully, it could still take 1,000 days.” Nicola Ferrier, Argonne senior computer scientist

Read more here so that to understand:

https://www.anl.gov/article/preparing-for-exascale-argonnes-aurora-supercomputer-to-drive-brain-map-construction

Also Exascale supercomputers will allow researchers to tackle problems
which were impossible to simulate using the previous generation of
machines, due to the massive amounts of data and calculations involved.

Small modular nuclear reactor (SMR) design, wind farm optimization and
cancer drug discovery are just a few of the applications that are
priorities of the U.S. Department of Energy (DOE) Exascale Computing
Project. The outcomes of this project will have a broad impact and
promise to fundamentally change society, both in the U.S. and abroad.

Read more here:

https://www.cbc.ca/news/opinion/opinion-exascale-computing-1.5382505

Also the goal of delivering safe, abundant, cheap energy from fusion is
just one of many challenges in which exascale computing’s power may
prove decisive. That’s the hope and expectation. Also to know more about
the other benefits of using Exascale computing power, read more here:

https://www.hpcwire.com/2019/05/07/ten-great-reasons-among-many-more-to-build-the-1-5-exaflops-frontier/

And more of my philosophy about quantum computers and about how Machine Learning gets a quantum speedup and more of my thoughts..

IBM says thousands of quantum computers will be on sale by 2025, with 4,000 qubits. It represents a leap from its current hardware of 127 qubits.

Read more here:

https://hpc-developpez-com.translate.goog/actu/333533/IBM-affirme-que-des-milliers-d-ordinateurs-quantiques-seront-en-vente-d-ici-2025-avec-4-000-qubits-il-represente-un-bond-par-rapport-a-son-materiel-actuel-de-127-qubits/?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=en

I think I am smart, and as you have just noticed, I have just talked about quantum computers (read my thoughts about them below). Now we also have to ask how machine learning gets a quantum speedup, and here is an interesting article about it, so that you understand that quantum computers are also very interesting to have:

https://www.quantamagazine.org/ai-gets-a-quantum-computing-speedup-20220204/

And IBM has released the roadmap that it thinks will take us from the noisy, small-scale quantum computers of today to the million-plus qubit quantum computers of the future. The IBM team is developing a suite of scalable, increasingly larger and better processors, with a 1,000-plus qubit device, called IBM Quantum Condor, targeted for the end of 2023. You can read about it in the following interesting article:

https://research.ibm.com/blog/ibm-quantum-roadmap

And read my previous thoughts:

More of my philosophy about quantum computers and about CPUs and more..

To perform a billion computations at once, a classical parallel computer would need one billion different processors. In a quantum computer, a single register of qubits can hold a superposition of about a billion values (2^30 for a register of 30 qubits), since each qubit of the register can be in a superposition of the two states 1 and 0. This is known as quantum parallelism.

So what characteristics do the problems where quantum computing wins big share? I think one thing they share is that they tend to be about some global property of a large mathematical system. Connecting quantum computing to "Moore's Law" is sort of foolish: it is not an all-purpose technique for faster computers, but a limited technique that makes certain types of specialized problems easier, while leaving most of the things we actually use computers for unaffected.

So I think that classical computers are also really useful.
Read my following thoughts to understand why:

More of my philosophy about Intel Thread Director and about CPUs and more..

I invite you to read the following interesting article about Intel Thread Director:

How Intel Thread Director makes Alder Lake and Windows 11 a match made in heaven

https://www.digitaltrends.com/computing/how-intel-thread-director-marries-alder-lake-windows-11/

And more of my philosophy about AVX-512 and about Delphi 11.1 and more
of my thoughts..

I am also using the Delphi and Free Pascal compilers, and the new Delphi 11.1 compiler provides inline assembler (asm code) support for newer instruction sets, including AVX2 (ymm) and AVX-512 (zmm). You can read about it here:

https://lecturepress.com/tech-journal/dev-tools/delphi-11-is-released/

And the new Delphi 11.1 is here..

Build Native Apps 5x Faster With One Codebase
For Windows, Android, iOS, macOS, and Linux

You can read more about it here:

https://www.embarcadero.com/products/delphi

More of my philosophy about AMD Zen 4 and more of my thoughts..

I forgot to talk about AVX-512 support in Zen 4. Zen 4 will be out in 2022, and I think that Zen 4 will support AVX-512, which is good news. You can read about it here:

Gigabyte Leaks AMD Zen 4 Details: 5nm, AVX-512, 96 Cores, 12-Channel DDR5

Read more here:

https://www.extremetech.com/computing/325888-gigabyte-leaks-amd-zen-4-details-5nm-avx-512-96-cores-12-channel-ddr5

And read my following previous thoughts:

More of my philosophy about Intel's Alder Lake and about ARM and x86 memory models and more of my thoughts..

I think I am smart, and as you have just noticed, I have just talked about Epyc Zen 4 and Zen 5 and about the network topology inside multicore CPUs; read them in my thoughts below. I think that what I say about the network topology of multicore CPUs will still be valid if Intel's new Alder Lake also becomes a server CPU like a Xeon or Epyc, so here are my thoughts about Intel Alder Lake and about the ARM and x86 memory models:

I think that Intel's new Alder Lake is a winner, and I think that the performance/efficiency core design of Alder Lake could find its way into servers, workstations, or embedded IT systems, as you can notice by reading the following article:

https://www.networkworld.com/article/3631072/will-intels-new-desktop-cpu-design-come-to-its-xeon-server-chips.html

More of my philosophy about the ARM and x86 memory models and more
of my thoughts..

I think I am smart, and as you have just noticed, I have just said that x86 is the future (read my thoughts below to understand why). But I think that the ARM architecture has another big defect: its weak hardware memory model does not balance correctly between safety or security and performance. Read carefully the following article about the x86 TSO memory model:

https://research.swtch.com/hwmm

So notice that the article explains how the x86-TSO model, which I think balances well between safety or security and performance, came to be adopted by Intel and AMD:

"To address these problems, Owens et al. proposed the x86-TSO model, based on the earlier SPARCv8 TSO model. At the time they claimed that “To the best of our knowledge, x86-TSO is sound, is strong enough to program above, and is broadly in line with
the vendors’ intentions.” A few months later Intel and AMD released new manuals broadly adopting this model."

And read more here to understand that the x86 TSO memory model is very good:

https://jakob.engbloms.se/archives/1435

So I think that ARM has a big defect, since it should provide
a TSO memory model as RISC-V does (through its optional Ztso extension),
because it is very important for security or safety.
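
As a small illustration of what a stronger memory model forbids, here is a Python sketch of the message-passing litmus test discussed in the article above: one thread writes x and then a flag y, and the other reads y and then x. The sketch only enumerates the sequentially consistent interleavings and does not model real hardware; the function name sc_outcomes is hypothetical.

```python
from itertools import combinations


def sc_outcomes():
    """Return every (r1, r2) the message-passing test can produce under
    sequential consistency, by trying all interleavings of the two threads."""
    writer = [("x", 1), ("y", 1)]   # thread 0: x = 1; y = 1
    reader = ["y", "x"]             # thread 1: r1 = y; r2 = x
    outcomes = set()
    # Pick which 2 of the 4 time slots belong to the writer; program
    # order inside each thread is always preserved.
    for writer_slots in combinations(range(4), 2):
        memory = {"x": 0, "y": 0}
        writes, reads, regs = iter(writer), iter(reader), []
        for slot in range(4):
            if slot in writer_slots:
                var, val = next(writes)
                memory[var] = val
            else:
                regs.append(memory[next(reads)])
        outcomes.add(tuple(regs))
    return outcomes


print(sorted(sc_outcomes()))
```

The outcome (1, 0), where the reader sees the flag y set but still reads the old x, never appears in this enumeration. According to the article, x86-TSO forbids that outcome too, while the ARM model allows it unless barriers are added.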

More of my philosophy about the fight between x86 and ARM architectures and more of my thoughts..

I invite you to read the following interesting article about the
fight between the x86 (or x64) and ARM architectures:

ARM Servers on AWS: How to Save up to 30%

Read more here:

https://opsworks.co/arm-servers-on-aws-how-to-save-up-to-30/

So notice that it says the following about ARM CPU architecture compared to x86 CPU architecture:

"Running in a standard setting, Graviton2 performs 20% better, and the power consumption of the Arm core is about half that of other types of cores. Since the cost savings are also about 20%, performance-cost improvements reach 40%."

But I think that Intel's new Alder Lake is a new winner. Read the
following article to notice it:

Intel's Alder Lake chip could speed PCs by 30% while saving battery power

https://www.cnet.com/tech/computing/intels-alder-lake-chip-could-speed-pcs-by-30-while-saving-battery-power/

Also, here is the other way that Intel is using to fight ARM:

Intel CEO says co-designed x86 chips will fend off Arm threat

Read more here:

https://www.pcgamer.com/intel-x86-vs-arm-gelsinger/

So i think that x86 architecture is the future.

More of my philosophy about the next Epyc Zen 4 and Epyc Zen 5 CPUs and more of my thoughts..

I have just read the following article about the next AMD EPYC Turin Zen 5 CPUs Rumored To Feature Up To 256 Cores & 192 Core:

https://wccftech.com/amd-epyc-turin-zen-5-cpus-rumored-to-feature-up-to-256-cores-192-core-configurations-max-600w-configurable-tdps/

Notice the data in the above article. From it, my calculations give the following:

DDR5 will arrive with a minimum speed of 4800 MT/s, which works out to 76.8 GB/s of bandwidth in a dual-channel configuration (4.8 GT/s x 8 bytes per channel x 2 channels). Each CCX in Epyc Zen 4 and Zen 4C can be enabled as its own NUMA domain, so in the next AMD EPYC Genoa and AMD EPYC Bergamo CPUs there will be 12 NUMA nodes per socket, with DDR5-5200 and DDR5-5600 support respectively.

So AMD EPYC Genoa can support a memory bandwidth of 5.2 GT/s x 8 bytes per channel x 12 channels = 499.2 GB per second for one socket, or 998.4 GB per second for two sockets. AMD EPYC Bergamo can support 5.6 GT/s x 8 bytes per channel x 12 channels = 537.6 GB per second for one socket, or 1075.2 GB per second for two sockets.

So as you notice, memory bandwidth will become powerful on these Zen 4 and Zen 5 CPUs. The IPC gain from Zen 3 to Zen 4 is reported at around 20%, with an overall performance boost of about 40% for Zen 4 over Zen 3, and Zen 5 is reported to have a 20-40% IPC increase over Zen 4. As for the network topology in those next multicore CPUs, you can read my following thoughts about it:
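
The bandwidth arithmetic above can be checked with a short Python sketch. The function name peak_bandwidth_gb_s and its parameter names are only illustrative, and the result is a theoretical peak, not a measured figure.

```python
def peak_bandwidth_gb_s(transfers_gt_s, channels, sockets=1, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s.

    transfers_gt_s     : transfer rate per channel in GT/s (5.2 for DDR5-5200)
    channels           : memory channels per socket
    sockets            : number of CPU sockets
    bytes_per_transfer : bytes moved per transfer on one 64-bit channel
    """
    return transfers_gt_s * bytes_per_transfer * channels * sockets


configs = [
    ("Dual-channel DDR5-4800", 4.8, 2, 1),
    ("Genoa, 1 socket", 5.2, 12, 1),
    ("Genoa, 2 sockets", 5.2, 12, 2),
    ("Bergamo, 1 socket", 5.6, 12, 1),
    ("Bergamo, 2 sockets", 5.6, 12, 2),
]
for label, rate, channels, sockets in configs:
    print(f"{label}: {peak_bandwidth_gb_s(rate, channels, sockets):.1f} GB/s")
```

It reproduces the figures in the text: 76.8, 499.2, 998.4, 537.6 and 1075.2 GB/s.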

More of my philosophy about the knee of an M/M/n queue and more..

Here is the mathematical equation of the knee of an M/M/n queue in
queuing theory in operational research:

1/(n+1)^(1/n)

where n is the number of servers.

So an M/M/1 queue has a knee at 50% utilization, and an M/M/2 queue
has a knee at about 0.577 (57.7% utilization).
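
As a quick check, here is a minimal Python sketch that evaluates the formula given above for a few server counts. The function name knee is just an illustrative choice, and the code only evaluates the formula as stated; it does not derive it.

```python
def knee(n: int) -> float:
    """Utilization at the knee of an M/M/n queue, using the formula
    1/(n+1)^(1/n) as stated in the text (n = number of servers)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / (n + 1) ** (1.0 / n)


for n in (1, 2, 4, 8, 16):
    print(f"M/M/{n}: knee at {knee(n):.3f} utilization")
```

The value grows with n, which matches the idea that more servers let you run at higher utilization before response time degrades sharply.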

More of my philosophy about the network topology in multicores CPUs..

I invite you to look at the following video:

Ring or Mesh, or other? AMD's Future on CPU Connectivity

https://www.youtube.com/watch?v=8teWvMXK99I&t=904s

And i invite you to read the following article:

Does an AMD Chiplet Have a Core Count Limit?

Read more here:

https://www.anandtech.com/show/16930/does-an-amd-chiplet-have-a-core-count-limit

I think I am smart, and I say that the above video and the above article
are not so smart, so I will talk about a very important thing.
Read the following:

Performance Scalability of a Multi-core Web Server

https://www.researchgate.net/publication/221046211_Performance_scalability_of_a_multi-core_web_server

So notice carefully that it is saying the following:

"..we determined that performance scaling was limited by the capacity of
the address bus, which became saturated on all eight cores. If this key obstacle is addressed, commercial web server and systems software are well-positioned to scale to a large number of cores."

So as you notice, they were using an Intel Xeon with 8 cores, and the application was scalable to 8x but the hardware was not: it scaled only to 4.8x, and this was caused by saturation of the address bus. The address bus carries requests and responses for data, called snoops, and more caches mean more sources and more destinations for snoops, which is what causes the poor scaling. So a bus or ring bus topology was not sufficient to scale to 8x on an Intel Xeon with 8 cores.

So I think that new architectures like the Epyc and Threadripper CPUs can use a faster bus and/or a different network topology that ensures full scalability both locally within the same node and globally between the nodes. A sophisticated mesh network topology not only reduces the number of hops inside the CPU for good latency, it is also good for reliability because of its redundancy, and it is faster than previous topologies like the ring bus or the bus since, for example, the search on the address bus becomes parallelized. It looks like the Internet, which uses a mesh topology with routers, so it parallelizes.

I also think that using a more sophisticated topology like a mesh is related to queuing theory. In operational research, the mathematics says that we can make a queue like M/M/1 more efficient by making the server more powerful, but the knee of an M/M/1 queue is around 50% utilization. By using a mesh topology, on the Internet or inside a CPU, you parallelize more, and this both raises the knee of the queue and increases the speed of executing the transactions. It is like using many servers in queuing theory, and it permits better scaling inside a CPU or on the Internet.
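
To illustrate the "many servers" point, here is a Python sketch that compares mean response times of M/M/1 and M/M/n queues at the same per-server utilization, using the standard Erlang C formula from queuing theory. This is a textbook model, not a model of a real CPU interconnect, and the function names are illustrative.

```python
from math import factorial


def erlang_c(n: int, offered_load: float) -> float:
    """Probability that an arriving job has to wait in an M/M/n queue.

    offered_load is arrival_rate / service_rate and must be below n.
    """
    rho = offered_load / n
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    top = offered_load ** n / factorial(n) / (1.0 - rho)
    bottom = sum(offered_load ** k / factorial(k) for k in range(n)) + top
    return top / bottom


def mean_response_time(n: int, arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (waiting plus service) for an M/M/n queue."""
    offered_load = arrival_rate / service_rate
    wait = erlang_c(n, offered_load) / (n * service_rate - arrival_rate)
    return wait + 1.0 / service_rate


# Same 50% utilization per server, with 1, 2, 4 and 8 servers.
for n in (1, 2, 4, 8):
    t = mean_response_time(n, arrival_rate=0.5 * n, service_rate=1.0)
    print(f"M/M/{n} at 50% utilization: mean response time {t:.3f}")
```

At 50% utilization the single server takes 2 service times on average, while two servers sharing one queue take about 1.33, so adding parallel servers moves the point where queuing delay starts to dominate.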

More of my philosophy about DDR5 and the next Sapphire Rapids CPU of Intel and more of my thoughts..

I will explain something very important:

I invite you to read the following about the next Sapphire Rapids CPU of Intel here:

Intel Provides Details About Sapphire Rapids CPU and Ponte Vecchio GPU

https://www.hpcwire.com/off-the-wire/intel-unveils-details-about-sapphire-rapids-cpu-ponte-vecchio-gpu-ipu/

So notice carefully that it says the following:

"The processor is built to drive industry technology transitions with advanced memory and next generation I/O, including PCIe 5.0, CXL 1.1, DDR5 and HBM technologies."

And notice that it says the same here:

https://en.wikipedia.org/wiki/Sapphire_Rapids

So the next Sapphire Rapids CPU of Intel will support DDR5 and HBM technologies for the memory subsystem. Server CPUs have implemented ECC in their caches for at least a decade or so. DDR5 memory subsystems are useful for creating large capacities with modest bandwidth compared to HBM, while HBM offers large bandwidth with low capacity. But I think that the problem with the next Sapphire Rapids CPU of Intel is that the on-die ECC of DDR5 is not full ECC. Read here to notice it:

"On-die ECC: The presence of on-die ECC on DDR5 memory has been the subject of many discussions and a lot of confusion among consumers and the press alike. Unlike standard ECC, on-die ECC primarily aims to improve yields at advanced process nodes,
thereby allowing for cheaper DRAM chips. On-die ECC only detects errors if they take place within a cell or row during refreshes. When the data is moved from the cell to the cache or the CPU, if there’s a bit-flip or data corruption, it won’t be
corrected by on-die ECC. Standard ECC corrects data corruption within the cell and as it is moved to another device or an ECC-supported SoC."

Read more here to notice it:

https://www.hardwaretimes.com/ddr5-vs-ddr4-ram-quad-channel-and-on-die-ecc-explained/

More of my philosophy about HP NonStop to x86 Server Platform fault-tolerant computer systems and more..

Now HP to Extend HP NonStop to x86 Server Platform

HP announced in 2013 plans to extend its mission-critical HP NonStop technology to x86 server architecture, providing the 24/7 availability required in an always-on, globally connected world, and increasing customer choice.

Read the following to notice it:

https://www8.hp.com/us/en/hp-news/press-release.html?id=1519347#.YHSXT-hKiM8

And today HP provides HP NonStop to x86 Server Platform, and here is
an example, read here:

https://www.hpe.com/ca/en/pdfViewer.html?docId=4aa5-7443&parentPage=/ca/en/products/servers/mission-critical-servers/integrity-nonstop-systems&resourceTitle=HPE+NonStop+X+NS7+%E2%80%93+Redefining+continuous+availability+and+scalability+for+x86+data+sheet

So I think that programming for HP NonStop on x86 is now compatible with ordinary x86 programming.

Thank you,
Amine Moulay Ramdane.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
