• Linear Address Spaces

    From Ben Collver@21:1/5 to All on Mon Apr 29 01:46:08 2024
    Linear Address Spaces
    =====================
    By Poul-Henning Kamp
    Communications of the ACM, October 2022, Vol. 65 No. 10, Pages 42-44
    doi:10.1145/3561991

    I recently bought an Apple computer with the new M1 CPU to supplement
    the beastiarium known as Varnish Cache's continuous integration
    cluster. I am a bit impressed that it goes head-to-head with the
    s390x virtual machine borrowed from IBM, while never drawing more
    than 25 watts, but other than that: Meh...

    This is one disadvantage of being a systems programmer: You see up
    close how each successive generation in an architecture has been
    inflicted with yet another "extension," "accelerator," "cache,"
    "look-aside buffer," or some other kind of "marchitecture," to the
    point where the once-nice and orthogonal architecture is almost
    obscured by the "improvements" that followed. It seems almost like a
    law of nature:

    Any successful computer architecture, under immense pressure to
    "improve" while "remaining 100% compatible," will become a
    complicated mess.

    Show me somebody who calls the IBM S/360 a RISC design, and I will
    show you somebody who works with the s390 instruction set today.

    Or how about: The first rule of ARM is, "We don't talk about
    Thumb-2."

    A very special instance of this law happened when AMD created the
    x86-64 instruction set to keep the x86 architecture alive after
    Intel, the nominal owner of that architecture, had all but abandoned
    it and gone full Second Systems Syndrome with the ill-fated Itanium
    architecture.

    Fundamentally, we now have both kinds of CPUs--ARM and x64--and they
    both suffer from the same architectural problems. Take, for example,
    the translation from linear virtual to linear physical addresses. Not
    only have page table trees grown to a full handful of levels, but
    there are also multiple page sizes, each of which comes with its own
    handful of footnotes limiting usability and generality.

    Why do we even have linear physical and virtual addresses in the
    first place, when pretty much everything today is object-oriented?

    Linear virtual addresses were made to be backward-compatible with
    tiny computers with linear physical addresses but without virtual
    memory. There are still linear virtual addresses that are
    backward-compatible with computers that were backward-compatible with
    computers that were...

    Apart from the smallest microcontrollers, nobody sane uses linear
    address spaces anymore, neither physical nor virtual. The very first
    thing any real-time nucleus or operating system kernel does is
    implement an abstract object store on top of the linear space. If the
    machine has virtual memory support, it then tries to map from virtual
    to physical as best it can, given five levels of page tables and all
    that drags in with it.

    Translating from linear virtual addresses to linear physical
    addresses is slow and complicated, because 64-bit can address a lot
    of memory.

    Having a single linear map would be prohibitively expensive in terms
    of memory for the map itself, so translations use a truncated tree
    structure, but that adds a whole slew of new possible exceptions:
    What if the page entry for the page directory entry for the page
    entry for the exception handler for missing page entries is itself
    empty?
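
    To put a rough number on "prohibitively expensive," here is a
    minimal Python sketch. It assumes 4 KiB pages and one 8-byte entry
    per page, which are typical values rather than figures from the
    article; only the 57-bit address width is taken from the text.

```python
# Assumed parameters (illustrative; only the 57-bit width comes from the text).
VA_BITS = 57          # virtual address width
PAGE_BITS = 12        # assumed 4 KiB pages
ENTRY_BYTES = 8       # assumed one 64-bit entry per page


def flat_map_bytes(va_bits=VA_BITS, page_bits=PAGE_BITS, entry_bytes=ENTRY_BYTES):
    """Size of a single flat table holding one entry for every virtual page."""
    pages = 1 << (va_bits - page_bits)
    return pages * entry_bytes


size = flat_map_bytes()
print(f"{size // 2**40} TiB per address space")  # prints "256 TiB per address space"
```

    Under these assumptions, every address space would need a 256 TiB
    table before it mapped a single byte, which is why the map is
    stored as a sparse tree instead.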

    This is the land where "double fault exceptions" and "F00F
    workarounds" originate. And with five levels of page tables, in the
    ultimate worst case, it takes five memory accesses before the CPU
    even knows where the data is.
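
    The worst-case walk can be sketched in a few lines of Python. This
    is a toy model, not any real MMU: nested dicts stand in for page
    tables, and the 9 index bits per level and 4 KiB page size are
    assumptions chosen so that five levels add up to 57 bits. The names
    split, map_page, and translate are made up for the illustration.

```python
LEVELS = 5        # five levels of page tables, as in the text
INDEX_BITS = 9    # assumed: 512 entries per table
PAGE_BITS = 12    # assumed: 4 KiB pages


class PageFault(Exception):
    """Raised when an entry on the walk is missing."""


def split(vaddr):
    """Return ([top-level index, ..., leaf index], offset within the page)."""
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    vpn = vaddr >> PAGE_BITS
    mask = (1 << INDEX_BITS) - 1
    indices = [(vpn >> (INDEX_BITS * lvl)) & mask
               for lvl in reversed(range(LEVELS))]
    return indices, offset


def map_page(root, vaddr, frame):
    """Install a translation, creating intermediate tables on demand."""
    indices, _ = split(vaddr)
    table = root
    for idx in indices[:-1]:
        table = table.setdefault(idx, {})
    table[indices[-1]] = frame


def translate(root, vaddr):
    """Walk the tree; return (physical address, number of table reads)."""
    indices, offset = split(vaddr)
    node, reads = root, 0
    for idx in indices:
        reads += 1                    # one memory access per level
        if idx not in node:
            raise PageFault(hex(vaddr))
        node = node[idx]
    return (node << PAGE_BITS) | offset, reads


root = {}
map_page(root, 0x00AB_CDEF_1234_5678, frame=0x42)
paddr, reads = translate(root, 0x00AB_CDEF_1234_5678)
print(hex(paddr), reads)              # prints "0x42678 5"
```

    Real CPUs hide most of these walks behind a translation look-aside
    buffer; the five reads counted here are the uncached worst case.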

    It doesn't have to be that way.

    One of my volunteer activities for datamuseum.dk is writing a
    software emulation of a unique and obscure computer: the Rational
    R1000/s400. (It's OK; you can look it up. I'll wait, because until
    one stood on our doorstep, I had never heard of it either.)

    The R1000 has many interesting aspects, not the least of which is
    that it was created by some very brilliant and experienced people who
    were not afraid to boldly go. And they sure went: The instruction set
    is Ada primitives, it operates on bit fields of any alignment, and
    the data bus is 128 bits wide: 64 bits for the data and 64 bits for
    the data's type. They also made it a four-CPU system, with all CPUs
    operating in the same 64-bit global address space. It also needed a
    good 1,000 amperes at five volts delivered to the backplane through a
    dozen welding cables.

    The global 64-bit address space is not linear; it is an object cache
    addressed with an (object + offset) tuple, and if that page of the
    object is not cached, a microcode trap will bring it in from disk.
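
    As a rough illustration of (object + offset) addressing, the
    following Python sketch keys resident pages directly by object and
    page number and falls back to a "disk" on a miss. It is a toy model
    and not the R1000's actual design: the page size, the ObjectCache
    class, and the dict used as a backing store are all assumptions.

```python
PAGE_SIZE = 1024   # assumed page size in bytes, purely illustrative


class ObjectCache:
    """Memory addressed by (object, offset); misses are filled from 'disk'."""

    def __init__(self, disk):
        self.disk = disk      # hypothetical backing store: {(obj, page_no): bytes}
        self.pages = {}       # resident pages, keyed the same way
        self.traps = 0        # how many times the miss handler ran

    def _trap(self, key):
        """Stand-in for the trap that brings a missing page in from disk."""
        self.traps += 1
        self.pages[key] = bytearray(self.disk.get(key, bytes(PAGE_SIZE)))

    def read(self, obj, offset):
        """Read one byte; the (object, page) pair itself is the lookup key."""
        page_no, within = divmod(offset, PAGE_SIZE)
        key = (obj, page_no)
        if key not in self.pages:
            self._trap(key)
        return self.pages[key][within]


disk = {(7, 2): bytes([0xAA]) * PAGE_SIZE}
mem = ObjectCache(disk)
first = mem.read(7, 2 * PAGE_SIZE + 5)   # miss: page is brought in
again = mem.read(7, 2 * PAGE_SIZE + 6)   # hit: same page, no trap
print(first, again, mem.traps)           # prints "170 170 1"
```

    Note that nothing in the lookup converts the tuple into a linear
    address first; the object identity is part of the key.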

    In marketing materials, the object cache was sold as "memory boards,"
    but in hardware, they contained a four-way associative cache, which
    brilliantly hid the tag-RAM lookup during the row address strobe
    (RAS) part of the DRAM memory cycle, so that it is precisely as fast
    as a normal linear DRAM memory would have been.

    State-of-the-art CPUs today can still address only approximately 57
    bits of address space, using five levels of page tables, each level
    successively and slowly sorting out another nine bits of the address.

    The R1000 addresses 64 bits of address space instantly in every
    single memory access. And before you tell me this is impossible: The
    computer is in the next room, built with 74xx-TTL
    (transistor-transistor logic) chips in the late 1980s. It worked back
    then, and it still works today.

    The R1000 was a solid commercial success, or I guess I should say
    military success, because most customers were in the defense
    industry, and the price was an eye-watering $1 million a pop. It
    also came with the first fully semantic IDE, but that is a story for
    another day.

    Given Ada's reputation for strong typing, handling the type
    information in hardware rather than in the compiler may sound
    strange, but there is a catch: Ada's variant data structures can make
    it a pretty daunting task to figure out how big an object is and
    where to find things in it. Handling data + type as a unit in
    hardware makes that fast.

    Not that type-checking in hardware is a bad idea. Quite the contrary:
    The recent announcement by ARM that it has prototyped silicon with
    its Morello implementation of Cambridge University's CHERI
    architecture gives me great hope for better software quality in the
    future.

    Cut to the bone, CHERI makes pointers a different data type than
    integers in hardware and prevents conversion between the two types.
    Under CHERI, new valid pointers can be created only by derivation
    from existing valid pointers, either by restricting the permissible
    range or by restricting the permissions. If you try to create or
    modify a pointer by any other means, it will no longer be a pointer,
    because the hardware will have cleared the special bit in memory that
    marked it as a valid pointer.
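
    The derivation rule can be modelled in a few lines of Python. This
    is a simplified sketch of the idea only, not the Morello or CHERI
    instruction set: the Capability class, its fields, and the helper
    functions are invented for the illustration, and the model raises
    exceptions where real hardware may report a violation differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Capability:
    base: int
    length: int
    perms: frozenset
    tag: bool = True      # models the "valid pointer" bit

    def restrict_range(self, base, length):
        """Derive a capability covering a sub-range of this one."""
        inside = self.base <= base and base + length <= self.base + self.length
        if not (self.tag and length >= 0 and inside):
            raise ValueError("cannot widen bounds or derive from an invalid capability")
        return replace(self, base=base, length=length)

    def restrict_perms(self, perms):
        """Derive a capability with a subset of this one's permissions."""
        perms = frozenset(perms)
        if not (self.tag and perms <= self.perms):
            raise ValueError("cannot add permissions or derive from an invalid capability")
        return replace(self, perms=perms)


def overwrite_as_integer(cap, new_base):
    """Model of a non-capability write: the bits change and the tag is cleared."""
    return replace(cap, base=new_base, tag=False)


def load(cap, addr):
    """Check a one-byte read through cap; raise if it is not permitted."""
    if not cap.tag:
        raise PermissionError("tag cleared: not a valid pointer")
    if "r" not in cap.perms or not (cap.base <= addr < cap.base + cap.length):
        raise PermissionError("out of bounds or missing read permission")
    return addr


root = Capability(base=0x1000, length=0x100, perms=frozenset("rw"))
ro = root.restrict_range(0x1010, 0x10).restrict_perms("r")   # narrower, read-only
forged = overwrite_as_integer(ro, 0x2000)                     # tag is now False
```

    In this model, load(ro, 0x1010) succeeds, while load(ro, 0x1020),
    load(forged, 0x2000), and ro.restrict_perms("rw") all raise: rights
    can only shrink, and a tampered pointer stops being a pointer.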

    According to Microsoft Research, CHERI would have deterministically
    detected and prevented a full 43% of the security problems reported
    to the company in 2019. To put that number in perspective: The
    National Highway Traffic Safety Administration reports that 47% of
    the people killed in traffic accidents were not wearing seat belts.

    As with mandatory seat belts, some people argue there would be no
    need for CHERI if everyone "just used type-safe languages," or they
    claim that the extra bits for "capabilities" that CHERI pointers
    carry make their programs look fat.

    I'm not having any of it.

    The linear address space as a concept is unsafe at any speed, and it
    badly needs mandatory CHERI seat belts. But even better would be to
    get rid of linear address spaces entirely and go back to the future,
    as successfully implemented in the Rational R1000 computer 30-plus
    years ago.

    Author
    ======
    Poul-Henning Kamp spent more than a decade as one of the primary
    developers of the FreeBSD operating system before creating the
    Varnish HTTP Cache software, which around a fifth of all Web traffic
    goes through at some point. He is an independent contractor; one of
    his most recent projects was a supercomputer cluster to stop the
    stars twinkling in the mirrors of ESO's new ELT (extremely large
    telescope).

    From: <https://web.archive.org/web/20220921202132/https://cacm.acm.org/magazines/2022/10/264852-linear-address-spaces/fulltext>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Ben Collver on Mon Apr 29 02:50:36 2024
    On Mon, 29 Apr 2024 01:46:08 -0000 (UTC), Ben Collver wrote:

    Intel, the nominal owner of that architecture, had all but abandoned it
    and gone full Second Systems Syndrome ...

    Ummm ... iAPX 432, anybody? i860/960 RISC? 80286, even?

    Talk about “superficial analysis” ...
