• Read again about more of my philosophy about 3D stacking in CPUs and mo

    From Amine Moulay Ramdane@21:1/5 to All on Sun Sep 12 12:20:04 2021
    Hello,


    Read again about more of my philosophy about 3D stacking in CPUs and more..

    I am a white Arab from Morocco, and I think I am smart since I have also invented many scalable algorithms and other algorithms..

    3D stacking offers an extension of Moore’s Law, but heat removal is the big problem in 3D stacking. This is why current
    technologies, like Intel's 3D stacking, are limited to stacking just two or a few layers.

    More of my philosophy about Moore’s Law and EUV (Extreme ultraviolet lithography)..

    Researchers have proposed successors to EUV, including e-beam and nanoimprint lithography, but have not found any of them to be reliable enough to justify substantial investment.

    And I think that by also using EUV (extreme ultraviolet lithography) to create CPUs we will extend Moore's Law by around 15 years, which corresponds to around 100x scaling in performance, and I think that this is the same 100x performance gain as the following
    invention from graphene:

    About graphene and about unlocking Moore’s Law..

    I think that graphene can now be mass produced, you can read about it here:

    We May Finally Have a Way of Mass Producing Graphene

    It's as simple as one, two, three.

    Read more here:

    https://futurism.com/we-may-finally-have-a-way-of-mass-producing-graphene

    So the following invention will be possible:

    Physicists Create Microchip 100 Times Faster Than Conventional Ones

    Read more here:

    https://interestingengineering.com/graphene-microchip-100-times-fast?fbclid=IwAR3wG09QxtQciuku4KUGBVRQPNRSbhnodPcnDySLWeXN9RCnvb0GqRAyM-4

    More philosophy about the Microchips that are 100 Times or 1000 times Faster Than Conventional Ones..

    I think that the above invention of microchips that are 100 times
    or 1000 times faster than conventional ones has a weakness:
    cache-coherence traffic between cores takes time. So I think that
    they are speaking about 100 times or 1000 times more speed in
    single-core performance, so parallelism is still necessary, and you
    need scalable algorithms in order to scale much further on multicore CPUs..


    And read the following news:


    AMD Demonstrates Stacked 3D V-Cache Technology: 192 MB at 2 TB/sec, which would technically be faster than the L1 cache on the die (but with higher latency)..

    "The AMD team surprised us here. What seemed like a very par-for-the-course Computex keynote turned into an incredible demonstration of what AMD is testing in the lab with TSMC’s new 3D Fabric technologies. We’ve covered 3D Fabric before, but AMD is
    putting it to good use by stacking up its processors with additional cache, enabling super-fast bandwidth, and better gaming performance."

    Read more here:

    https://www.anandtech.com/show/16725/amd-demonstrates-stacked-vcache-technology-2-tbsec-for-15-gaming


    More of my philosophy about the knee of an M/M/n queue and more..

    Here is the mathematical equation for the knee of an M/M/n queue in queuing theory in operational research:

    1/(n+1)^(1/n)

    where n is the number of servers.

    So an M/M/1 queue has its knee at 50% utilization, and an M/M/2 queue
    has its knee at around 0.577 (57.7%), so I correct my earlier text below.
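
    As a quick check of the knee values above, here is a minimal Python sketch that just evaluates the formula 1/(n+1)^(1/n) for a few server counts. The function name knee is hypothetical and only for illustration, and the formula is taken as given from the text above.

```python
def knee(n: int) -> float:
    """Knee utilization for n servers, using the formula 1/(n+1)^(1/n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 1/(n+1)^(1/n) is the same as (n+1)^(-1/n)
    return (n + 1) ** (-1.0 / n)


if __name__ == "__main__":
    # Print the knee for a few server counts (illustrative values of n)
    for n in (1, 2, 4, 8, 16):
        print(f"M/M/{n}: knee at about {knee(n):.3f} utilization")
```

    With this formula the knee moves toward higher utilization as n grows, which is the property the rest of this post relies on.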

    More of my philosophy about the network topology in multicore CPUs..

    I invite you to look at the following video:

    Ring or Mesh, or other? AMD's Future on CPU Connectivity

    https://www.youtube.com/watch?v=8teWvMXK99I&t=904s

    And i invite you to read the following article:

    Does an AMD Chiplet Have a Core Count Limit?

    Read more here:

    https://www.anandtech.com/show/16930/does-an-amd-chiplet-have-a-core-count-limit

    I think I am smart, and I say that the above video and the above article are not so smart, so I will talk about a very important thing. Read the following:

    Performance Scalability of a Multi-core Web Server

    https://www.researchgate.net/publication/221046211_Performance_scalability_of_a_multi-core_web_server

    So notice carefully that it is saying the following:

    "..we determined that performance scaling was limited by the capacity of the address bus, which became saturated on all eight cores. If this key obstacle is addressed, commercial web server and systems software are well-positioned to scale to a large
    number of cores."

    So as you notice, they were using an 8-core Intel Xeon, and the application was scalable to 8x but the hardware was not, since it scaled only to 4.8x. This was caused by the saturation of the address bus: the address bus carries requests and responses for data, called snoops, and more caches mean more sources and more destinations for snoops, which causes the poor scaling.

    So notice that a network topology of a ring bus or a bus was not sufficient to scale to 8x on an Intel Xeon with 8 cores. So I think that new architectures like the Epyc and Threadripper CPUs can use a faster bus and/or a different network topology that ensures full scalability both locally in the same node and globally between the nodes. A sophisticated mesh network topology not only reduces the number of hops inside the CPU for good latency, but is also good for reliability because of its redundancy, and it is faster than previous topologies like the ring bus or the bus, since for example the search on the address bus becomes parallelized. It looks like the internet, which uses a mesh topology with routers, so it parallelizes.

    I also think that using a more sophisticated topology like a mesh is related to queuing theory. In operational research, the mathematics says that we can make a queue like M/M/1 more efficient by making the server more powerful, but the knee of an M/M/1 queue is at around 50% utilization. By parallelizing more with a mesh topology, on the internet or inside a CPU, you can both raise the knee of the queue and increase the speed of executing the transactions. It is like using many servers in queuing theory, and it permits better scaling inside a CPU or on the internet.
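
    To illustrate the queuing argument above, here is a minimal Python sketch that compares the mean response time of an M/M/1 queue with M/M/n queues at the same per-server utilization, using the standard Erlang C formula. Treating a bus or a mesh as an idealized M/M/n queue is an assumption made only for illustration, and the function names erlang_c and mean_response_time are hypothetical.

```python
from math import factorial


def erlang_c(n: int, a: float) -> float:
    """Probability that an arrival has to wait in an M/M/n queue.

    a is the offered load lam/mu; the queue is stable only when a < n.
    """
    if not 0 <= a < n:
        raise ValueError("need 0 <= a < n for a stable queue")
    rho = a / n  # per-server utilization
    top = a ** n / factorial(n) / (1 - rho)
    bottom = sum(a ** k / factorial(k) for k in range(n)) + top
    return top / bottom


def mean_response_time(n: int, lam: float, mu: float) -> float:
    """Mean time in system (waiting + service) for an M/M/n queue."""
    wait = erlang_c(n, lam / mu) / (n * mu - lam)
    return wait + 1.0 / mu


if __name__ == "__main__":
    mu = 1.0  # service rate per server, so times are in units of one service time
    for rho in (0.5, 0.7, 0.9):
        # Scale the arrival rate with n so every server is equally utilized
        row = ", ".join(
            f"n={n}: {mean_response_time(n, rho * n * mu, mu):.2f}"
            for n in (1, 2, 4, 8)
        )
        print(f"utilization {rho:.0%} -> {row}")
```

    In this idealized model, at the same utilization the response time with many servers stays closer to one service time than with a single server, which fits the idea that parallelizing moves the knee to a higher utilization.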


    Thank you,
    Amine Moulay Ramdane.
