• Anticipating processor architectural evolution

    From Don Y@21:1/5 to All on Sat Apr 27 16:11:30 2024
    I've had to refactor my RTOS design to accommodate the likelihood of SMT
    in future architectures.

    Thinking (hoping?) these logical cores to be the "closest to the code",
    I call them "Processors" (hysterical raisins). Implicit in SMT is the
    notion that they are architecturally similar/identical.

    These are part of PHYSICAL cores -- that I appropriately call "Cores".

    These Cores are part of "Hosts" (ick; term begs for clarity!)... what
    one would casually call "chips"/CPUs. Note that a host can house dissimilar Cores (e.g., big.LITTLE).

    Two or more hosts can be present on a "Node" (the smallest unit intended to
    be added to or removed from a "System"). Again, they can be dissimilar
    (think CPU/GPU).
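
    In C terms, the containment might be sketched as follows (purely
    illustrative types; the names are a hypothetical rendering of the
    above, not an actual API):

        #include <stddef.h>

        typedef struct Processor {   /* SMT logical core; runs the code */
            unsigned id;
        } Processor;

        typedef struct Core {        /* physical core */
            Processor *processors;   /* SMT siblings: identical by definition */
            size_t nprocessors;
        } Core;

        typedef struct Host {        /* what one casually calls a "chip"/CPU */
            Core *cores;             /* may be dissimilar (big.LITTLE) */
            size_t ncores;
        } Host;

        typedef struct Node {        /* smallest unit added to/removed from a System */
            Host *hosts;             /* may be dissimilar (CPU/GPU) */
            size_t nhosts;
        } Node;

        typedef struct System {
            Node *nodes;
            size_t nnodes;
        } System;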

    I believe this covers the composition/hierarchy of any (near-)future
    system architectures, while placing the minimum constraints on them.

    Are there any other significant developments in the pipeline that
    could alter my conception of future hardware designs?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Larkin@21:1/5 to blockedofcourse@foo.invalid on Sat Apr 27 19:52:55 2024
    On Sat, 27 Apr 2024 16:11:30 -0700, Don Y
    <blockedofcourse@foo.invalid> wrote:

    [...]

    Are there any other significant developments in the pipeline that
    could alter my conception of future hardware designs?

    Why not hundreds of CPUs on a chip, each assigned to one function,
    with absolute hardware protection? They need not be identical, because
    many would be assigned to simple functions.

    The mess we have now is the legacy of thinking about a CPU as some
    precious resource.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bill Sloman@21:1/5 to John Larkin on Sun Apr 28 13:48:19 2024
    On 28/04/2024 12:52 pm, John Larkin wrote:
    [...]

    Why not hundreds of CPUs on a chip, each assigned to one function,
    with absolute hardware protection? They need not be identical, because
    many would be assigned to simple functions.

    The mess we have now is the legacy of thinking about a CPU as some
    precious resource.

    The "mess" we have now reflects the fact that we are less constrained
    than we used to be.

    As soon as you could do multi-threaded processing, life became more
    complicated, but you could do a great deal more.

    Anything complicated will look like a mess if you don't understand
    what's going on - and if you aren't directly involved why would you
    bother to do the work that would let you understand what was going on?

    It would be nice if we could find some philosophical high ground from
    which all the various forms of parallel processing could be sorted into
    a coherent taxonomy, but the field doesn't seem to have found its Carl
    Linnaeus yet.


    --
    Bill Sloman, Sydney

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From boB@21:1/5 to jjSNIPlarkin@highNONOlandtechnology on Mon Apr 29 12:19:40 2024
    On Sat, 27 Apr 2024 19:52:55 -0700, John Larkin <jjSNIPlarkin@highNONOlandtechnology.com> wrote:

    [...]

    Why not hundreds of CPUs on a chip, each assigned to one function,
    with absolute hardware protection? They need not be identical, because
    many would be assigned to simple functions.


    Isn't this what Waferscale is, kinda?

    boB



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Don Y@21:1/5 to boB on Mon Apr 29 15:03:57 2024
    On 4/29/2024 12:19 PM, boB wrote:
    Isn't this what Waferscale is, kinda?

    WSI has proven to be a dead-end (for all but specific niche markets
    and folks with deep pockets). "Here lie The Connection Machine,
    The Transputer, etc."

    Until recently, there haven't really been any mainstream uses for
    massively parallel architectures (GPUs being the first real use,
    followed by their co-option for AI and Expert Systems).

    To exploit an array of identical processors you typically need a
    problem that can be decomposed into many "roughly comparable" (in
    terms of complexity) tasks that have few interdependencies.

    Most problems are inherently serial and/or have lots of dependencies
    that limit the amount of true parallelism that can be attained.
    Or they have widely differing resource needs/complexity that make them
    ill-suited to being shoe-horned into a one-size-fits-all processor
    model. E.g., controlling a motor and recognizing faces have
    vastly different computational requirements.
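
    (The serial fraction puts a hard ceiling on this -- Amdahl's law:
    with a fraction p of the work parallelizable across N processors,

        speedup <= 1 / ((1 - p) + p/N)

    so even a 90%-parallelizable job tops out at 10x, no matter how
    many processors you throw at it.)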

    Communication is always the bottleneck in a processing application,
    whether it be CPU to memory, task to task, thread to thread, etc.
    It's also one of the ripest areas for bugs to creep into a design;
    designing good "seams" (interfaces) is the biggest predictor of
    success in any project of significance (that's why we have protection
    domains and preach small modules, well-defined interfaces, and a
    "contract" programming style).
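
    A minimal taste of that contract style in C (hypothetical routine,
    purely illustrative):

        #include <assert.h>
        #include <stddef.h>

        /* Contract: dst and src are valid, non-overlapping buffers of at
           least len bytes; returns the number of bytes copied (== len).
           Preconditions are *checked* at the seam, not assumed. */
        size_t copy_exact(char *dst, const char *src, size_t len)
        {
            assert(dst != NULL && src != NULL);           /* precondition */
            assert(dst + len <= src || src + len <= dst); /* no overlap   */

            for (size_t i = 0; i < len; i++)
                dst[i] = src[i];

            return len;                     /* postcondition: exactly len */
        }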

    Sadly, few folks are formally taught about these interrelationships
    (when was the last time you saw a Petri net?) so we have lots of
    monolithic designs that are brittle due to having broken all the
    Best Practices rules.

    The smarter way of tackling increasingly complex problems is better partitioning of hardware resources (with similarly architected
    software atop) using FIFTY YEAR OLD protection mechanisms to enforce
    the boundaries between "virtual processors".

    This allows a processor having the capabilities required by the most
    demanding "component" to be leveraged to also handle the needs of
    those of lesser complexity. It also gives you a speedy way of exchanging
    information between those processors without requiring a specialized
    fabric for that task.

    And, that SHARED mechanism is easily snooped to see who is talking to
    whom (as well as prohibiting interactions that *shouldn't* occur!)

    E.g., I effectively allow for the creation of virtual processors of
    specific capabilities and resource allocations AS IF they were discrete hardware units interconnected by <something>. This lets me dole out
    the fixed resources (memory, MIPS, time, watts) in the box to specific
    uses and have "extra" for uses that require them.

    (I can set a virtual processor to only have access to 64KB! -- or 16K
    or 16MB -- of memory, only allow it to execute a million opcode fetches
    per second, etc. and effectively have a tiny 8b CPU emulated within a
    much more capable framework. And, not be limited to moving data via
    a serial port to other such processors!)
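
    For flavor, such a virtual processor could be little more than a
    descriptor the kernel enforces at each scheduling quantum
    (illustrative types only, not the actual mechanism):

        #include <stddef.h>
        #include <stdint.h>

        typedef struct VProc {
            size_t   mem_limit;       /* e.g. 16KB, 64KB, or 16MB        */
            uint64_t fetch_budget;    /* opcode fetches allowed per sec  */
            uint64_t fetches_used;    /* consumed this accounting period */
            uint32_t watts_mw;        /* power allotment, in milliwatts  */
        } VProc;

        /* Charge a quantum's worth of fetches; 0 => over budget, preempt. */
        static int vp_charge(VProc *vp, uint64_t fetches)
        {
            vp->fetches_used += fetches;
            return vp->fetches_used <= vp->fetch_budget;
        }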

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From john larkin@21:1/5 to blockedofcourse@foo.invalid on Mon Apr 29 15:32:47 2024
    On Sat, 27 Apr 2024 16:11:30 -0700, Don Y
    <blockedofcourse@foo.invalid> wrote:

    [...]

    Vaguely related:

    https://www.theregister.com/2023/10/30/arm_intel_comment/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Larkin@21:1/5 to blockedofcourse@foo.invalid on Mon Apr 29 17:17:35 2024
    On Mon, 29 Apr 2024 15:03:57 -0700, Don Y
    <blockedofcourse@foo.invalid> wrote:

    On 4/29/2024 12:19 PM, boB wrote:
    Isn't this what Waferscale is, kinda?

    WSI has proven to be a dead-end (for all but specific niche markets
    and folks with deep pockets). "Here lie The Connection Machine,
    The Transputer, etc."

    Until recently, there haven't really been any mainstream uses for
    massively parallel architectures (GPUs being the first real use,
    followed by their co-option for AI and Expert Systems).

    To exploit an array of identical processors you typically need a
    problem that can be decomposed into many "roughly comparable" (in
    terms of complexity) tasks that have few interdependencies.

    A PC doesn't solve massively parallel computational problems.

    One CPU can be a disk file server. One, a keyboard handler. One for
    the mouse. One can be the ethernet interface. One CPU for each
    printer. One would be the "OS", managing all the rest.

    Cheap CPUs can run idle much of the time.

    We don't need to share one CPU doing everything any more. We don't
    need virtual memory. If each CPU has a bit of RAM, we barely need
    memory management.
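
    As a purely hypothetical sketch, the whole allocation could be a
    static, build-time table -- nothing to schedule, nothing to share:

        /* One cheap CPU per function, fixed at build time. */
        enum cpu_id { CPU_OS, CPU_DISK, CPU_KEYBOARD, CPU_MOUSE,
                      CPU_ETHERNET, CPU_PRINTER0, NUM_CPUS };

        static const char *assignment[NUM_CPUS] = {
            [CPU_OS]       = "supervisor: manages all the rest",
            [CPU_DISK]     = "disk file server",
            [CPU_KEYBOARD] = "keyboard handler",
            [CPU_MOUSE]    = "mouse handler",
            [CPU_ETHERNET] = "ethernet interface",
            [CPU_PRINTER0] = "printer 0",
        };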

    [...]

    E.g., I effectively allow for the creation of virtual processors of
    specific capabilities and resource allocations AS IF they were discrete
    hardware units interconnected by <something>. This lets me dole out
    the fixed resources (memory, MIPS, time, watts) in the box to specific
    uses and have "extra" for uses that require them.

    (I can set a virtual processor to only have access to 64KB! -- or 16K
    or 16MB -- of memory, only allow it to execute a million opcode fetches
    per second, etc. and effectively have a tiny 8b CPU emulated within a
    much more capable framework. And, not be limited to moving data via
    a serial port to other such processors!)

    Why virtual processors, if real ones are cheap?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bill Sloman@21:1/5 to John Larkin on Tue Apr 30 14:40:24 2024
    On 30/04/2024 10:17 am, John Larkin wrote:
    [...]

    Why virtual processors, if real ones are cheap?

    Because you can reconfigure them on the fly, which is harder with real processors.

    --
    Bill Sloman, Sydney

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)