I gather that capabilities are generally fine-grained, and capability pointers would be generated and handed out by the OS. What happens when
a pointer is incremented?
There is a project called CHERI, whose concepts have been implemented
in Arm's _Morello_ chip.
I have been in CPU design for a very long time. I did a HS-level
design (a calculator) in 1968, 3 years before the Bowmar Brain, did
a #60 design in college as a junior, and started doing professional
designs (the MC88100) in 1983.
With all this background and long term in this career, I can say
without a trace of doubt, that I am not <yet> smart enough to do
a capabilities ISA/system and get it out the door without errors.
On the other hand, My 66000 Architecture is immune to most attack
strategies now in vogue: Return-Oriented Programming, RowHammer,
Spectre, GOT overwrites, buffer overflows, ... All without having
any semblance of capabilities, and all without any performance
degradation other than typical cache and TLB effects.
... the Burroughs Large systems capability based design ...
Capabilities sound like something previously implemented in mainframe
class computers.
IBM's System/38 and follow-on AS/400 (both long obsolete) may have
had something like them. Not sure if they count as
_mainframe-class_.
On Sat, 09 Mar 2024 15:09:46 GMT, Scott Lurndal wrote:
... the Burroughs Large systems capability based design ...
As I recall, that depended on not giving users access to a compiler that could generate instructions that bypassed the protection system.
On 3/9/2024 9:09 AM, Scott Lurndal wrote:
There [are] CHERI designs on silicon in existence
https://www.morello-project.org/
It is doable, at least.
Main open question is if they can deliver enough on their claims in a
way that justifies the cost of the memory tagging (eg, where one needs
to tag whether or not each memory location holds a valid capability).
As I see it, "locking things down" would likely require turning things
like "malloc()/free()", "dlopen()/dlsym()/...", etc, into system calls
(and generally giving the kernel a much more active role in this process).
On 3/9/2024 1:58 PM, Robert Finch wrote:
<snip>
On 2024-03-09 1:56 p.m., BGB wrote:
On 3/9/2024 9:09 AM, Scott Lurndal wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
For Femtiki OS, I have a single object describing an array of values.
For instance, messages, which are small objects, are described with a
single object for an array of messages. It is too costly to use an
object descriptor for each message.
For a CHERI like approach, one would need a tag of 1 bit for every 16
bytes of RAM (to flag whether or not that RAM represents a valid
capability).
For the combination of RAM sizes and FPGAs I have access to, this is non-viable, as I would need more BRAM for the tag memory than exists in
the FPGAs.
In effect, this will mean needing another smaller cache which is bolted
onto the L2 cache or similar, whose sole purpose is to provide tag-bits
(and probably bounce requests to some other area of RAM which contains
the tag-bits memory).
I think this may not be necessary, but I have to read some more. The
capabilities have transfer rules which might make it possible to use
existing code. They have ported things over to RISC-V. It cannot be too
mountainous a task.
You can make it work, yes, but the question is less "can you make it
work, technically", but more:
Can you make it work in a way that provides both a fairly normal C experience, and *also* an unbreakable sandbox, at the same time.
My skepticism here is that, short of drastic measures like moving malloc
and libdl and similar into kernel space, it may not be possible to keep
the sandbox secure using solely capabilities.
ASLR could help, but using ASLR to maintain an image of integrity for
the capability system would be "kinda weak".
One could ask though:
How is my security model (with Keyrings) any different?
Well, the partial answer mostly is that a call that switches keyrings is effectively accomplished via context switches (with the two keyrings effectively running in separate threads).
So, like, even if the untrusted thread has a pointer to the protected thread's memory, it can't access it...
Though, a similar model could potentially be accomplished with
conventional page-tables, by making pseudo-processes which only share
parts of their address space with another process (and the protected
memory is located in the non-shared spaces, with any calls between them
via an RPC mechanism).
Had considered mechanisms which could pull this off without a context
switch, but most would fall short of "acceptably secure" (if a path
exists where a task could modify its own KRR or similar, this mechanism
is blown).
My bounds-checking scheme also worked, but with a caveat:
It only works if code does not get "overly clever" with the use of pointers.
So, it worked well enough to where I was able to enable full
bounds-checking in Doom and similar, but was not entirely transparent to
some of the runtime code. If you cast between pointers and integers, and manipulate the pointer bits, there are "gotchas".
A full capability system is going to have a similar restriction.
Either pointer<->integer casting would need to be disallowed, or (more likely), turned into a runtime call which can "bless" the address before returning it as a capability, which would exist as another potential
attack surface (unless, of course, this mechanism is itself turned into
a system call).
OTOH:
If one can't implement something like a conventional JavaScript VM, or
if it takes a significant performance hit, this would not be ideal.
Or, change the description, as being mostly a tool to eliminate things
like buffer overflow exploits and memory corruption, and as a fairly
powerful debugging feature.
But, say, note that it would not be sufficient, say, for things like
sandboxing hostile code within a shared address space with another
program that needs to be kept protected.
Granted, the strength could likely be improved (in the face of trying
to prevent hostile code from being able to steal capabilities) through
creative use of ASLR. Along with ABI features, such as "scratch
register scrubbing" (say, loading zeroes into scratch registers on
function return, such as to prevent capabilities from being leaked
via registers), marking function pointers as "Execute Only", etc.
As noted, a capability system would likely still be pretty strong
against things like buffer overflows (but if only being used to
mitigate buffer overflows, is a bit overkill; so the main
"interesting" case is if it can be used to make an "unbreakable
sandbox" for potentially hostile machine code).
*: If it is possible to perform a Load or (worse, Capability Load)
through a function pointer, this is likely to be a significant attack
vector. Need to make it so that function pointers can only be used to
call things. Protecting against normal data loads would be needed
mostly to try to prevent code from being able to gain access to a
known pointer and possibly side-step the ASLR (say, if it can figure
out that the address it wants to access is reachable from a capability
that the code has access to).
Though, on my side of things, it is possible I could revive a modified
form of the 128-bit ABI, while dropping the VAS back down to 48 bits,
and turn it into a more CHERI-like form (with explicit upper and lower
bounds and access-enable flags, rather than a shared-exponent size and
bias scheme).
Yeah, IMO explicit upper and lower bounds would be better even though it
uses more memory. The whole manipulation of the bounds is complex. I
sketched out using a 256b capability descriptor. Some of the bits can be
trimmed from the bounds if things are page aligned.
IIRC, they were using 128-bit descriptors with a bit-slicing scheme.
So, say, if I were to do similar (within my existing pointer layout):
( 27: 0): Base Address
( 47: 28): Shared Address (47:28)
( 63: 48): Type Tag Bits
( 87: 64): Lower Bound (27:4)
(111: 88): Upper Bound (27:4)
( 112): Base Adjust
( 113): Lower Bound Adjust
( 114): Upper Bound Adjust
(127:115): Access Flags / Etc
Though, this particular encoding would limit bounds-checking to a 256MB region, which is lame (or eat more tag bits, and have slightly bigger regions).
It is likely that the capability memory tagging would need to be managed
by the L2 cache. Would need some mechanism for the tag-bits memory (say,
2MB for 256MB at 1b per 16B line). Would also need to somehow work this
flag bit into the ringbus messaging.
The C experience is fairly normal, as long as you are actually
playing by the C rules. You can't arbitrarily cast integers to
pointers - if you plan to do that you need to use intptr_t so
the compiler knows to keep the data in a capability so it can
use it as a pointer later.
Tricks which store data in the upper or lower bits of pointers are
awkward.
Changes in a 6M LoC KDE desktop codebase were 0.026% of lines: https://www.capabilitieslimited.co.uk/_files/ugd/f4d681_e0f23245dace466297f20a0dbd22d371.pdf
Sandboxing involves dividing code into compartments; that involves
some decision making as to where you draw the security boundaries.
There aren't good tools to do that (they are being worked on).
CHERI offers you the tools to implement whatever compartmentalisation
stategy you wish, but it's not quite as simple as just recompiling.
... we're running on FreeBSD
BGB wrote:
On 3/9/2024 1:58 PM, Robert Finch wrote:
<snip>
On 2024-03-09 1:56 p.m., BGB wrote:
On 3/9/2024 9:09 AM, Scott Lurndal wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
For Femtiki OS, I have a single object describing an array of values.
For instance messages which are small objects, are described with a
single object for an array of messages. It is too costly to use an
object descriptor for each message.
For a CHERI like approach, one would need a tag of 1 bit for every 16
bytes of RAM (to flag whether or not that RAM represents a valid capability).
For the combination of RAM sizes and FPGAs I have access to, this is non-viable, as I would need more BRAM for the tag memory than exists in
the FPGAs.
Yes, indeed, not viable. Now imagine a page of those: now you have
to write out 4096 bytes plus 256 tag bits onto a disk with standard
sectors...
In effect, this will mean needing another smaller cache which is bolted onto the L2 cache or similar, whose sole purpose is to provide tag-bits (and probably bounce requests to some other area of RAM which contains
the tag-bits memory).
Denelcor HEP had tag-like-bits and all the crud they bring (but they were used as locks instead of tags).
I think this may not be necessary, but I have to read some more. The
capabilities have transfer rules which might make it possible to use
existing code. They have ported things over to RISC-V. It cannot be too
mountainous a task.
You can make it work, yes, but the question is less "can you make it
work, technically", but more:
Can you make it work in a way that provides both a fairly normal C experience, and *also* an unbreakable sandbox, at the same time.
And here the answer is essentially <wait for it> no.
My skepticism here is that, short of drastic measures like moving malloc and libdl and similar into kernel space, it may not be possible to keep
the sandbox secure using solely capabilities.
ASLR could help, but using ASLR to maintain an image of integrity for
the capability system would be "kinda weak".
How do you ASLR code when a latent capability on disk still points at
its defined memory area? Yes, you can ASLR at boot, but you can use
the file system to hold capabilities {which is something most capability
systems desire and promote.}
One could ask though:
How is my security model (with Keyrings) any different?
Well, the partial answer mostly is that a call that switches keyrings is effectively accomplished via context switches (with the two keyrings effectively running in separate threads).
So, like, even if the untrusted thread has a pointer to the protected thread's memory, it can't access it...
Though, a similar model could potentially be accomplished with
conventional page-tables, by making pseudo-processes which only share
parts of their address space with another process (and the protected
memory is located in the non-shared spaces, with any calls between them
via an RPC mechanism).
Capability manipulation via messages.
Had considered mechanisms which could pull this off without a context switch, but most would fall short of "acceptably secure" (if a path
exists where a task could modify its own KRR or similar, this mechanism
is blown).
My bounds-checking scheme also worked, but with a caveat:
It only works if code does not get "overly clever" with the use of pointers.
Which no-one can trust of C programs.
So, it worked well enough to where I was able to enable full bounds-checking in Doom and similar, but was not entirely transparent to some of the runtime code. If you cast between pointers and integers, and manipulate the pointer bits, there are "gotchas".
Gee, if only we had trained programmers to avoid some of the things we
are now requiring new languages to prevent.....
Either pointer<->integer casting would need to be disallowed, or (more likely), turned into a runtime call which can "bless" the address before returning it as a capability, which would exist as another potential
attack surface (unless, of course, this mechanism is itself turned into
a system call).
OTOH:
If one can't implement something like a conventional JavaScript VM, or
if it takes a significant performance hit, this would not be ideal.
Going for 2 in one post !!
Though, on my side of things, it is possible I could revive a modified
form of the 128-bit ABI, while dropping the VAS back down to 48 bits,
and turn it into a more CHERI-like form (with explicit upper and lower
bounds and access-enable flags, rather than a shared-exponent size and
bias scheme).

Yeah, IMO explicit upper and lower bounds would be better even though it
uses more memory. The whole manipulation of the bounds is complex. I
sketched out using a 256b capability descriptor. Some of the bits can be
trimmed from the bounds if things are page aligned.
MitchAlsup1 <mitchalsup@aol.com> wrote:
BGB wrote:
<snip>
You can make it work, yes, but the question is less "can you make it
work, technically", but more:
Can you make it work in a way that provides both a fairly normal C
experience, and *also* an unbreakable sandbox, at the same time.
The C experience is fairly normal, as long as you are actually playing by
the C rules. You can't arbitrarily cast integers to pointers - if you plan
to do that you need to use intptr_t so the compiler knows to keep the data
in a capability so it can use it as a pointer later.
Tricks which store data in the upper or lower bits of pointers are awkward.
Other tricks like XOR linked lists of pointers don't work. This is all
stuff that's pushing into the 'undefined behaviour' parts of C (even if
C doesn't explicitly call it out).
Why would you want to ASLR? ASLR is to prevent you guessing valid addresses for things so you can't craft pointers to them. CHERI prevents you crafting pointers to arbitrary things in the first place.
In article <Qry*XE3Ez@news.chiark.greenend.org.uk>, theom+news@chiark.greenend.org.uk (Theo Markettos) wrote:
The C experience is fairly normal, as long as you are actually
playing by the C rules. You can't arbitrarily cast integers to
pointers - if you plan to do that you need to use intptr_t so
the compiler knows to keep the data in a capability so it can
use it as a pointer later.
Makes sense, though it will require updating of older code for the rules being more thoroughly enforced. Not a bad thing.
Tricks which store data in the upper or lower bits of pointers are
awkward.
Not compatible with Aarch64 Pointer Authentication, but CHERI should be a functional replacement anyway.
Changes in a 6M LoC KDE desktop codebase were 0.026% of lines: https://www.capabilitieslimited.co.uk/_files/ugd/f4d681_e0f23245dace466297f20a0dbd22d371.pdf
1,500 or so changes. Quite a lot. Is the code backwards-compatible with a conventional C platform?
Sandboxing involves dividing code into compartments; that involves
some decision making as to where you draw the security boundaries.
There aren't good tools to do that (they are being worked on).
CHERI offers you the tools to implement whatever compartmentalisation strategy you wish, but it's not quite as simple as just recompiling.
I have a slightly odd case: the software I work on ships as a great big shared library that's used in-process by its caller. It isn't any kind of server, and doesn't use any IPC; in concept it's a huge math library that asks the caller to allocate memory for it. So it needs to share a heap
with the caller. Presumably that model is workable?
... we're running on FreeBSD
That was a point against my experimenting with Morello when we were
offered it last year; the requirement to port to FreeBSD first. Morello
Linux seems insufficiently mature at present; do you have any idea of the timescale for it to be robustly usable for porting application code by someone who isn't experienced in Linux internals?
Theo Markettos wrote:
MitchAlsup1 <mitchalsup@aol.com> wrote:
BGB wrote:
<snip>
You can make it work, yes, but the question is less "can you make it
work, technically", but more:
Can you make it work in a way that provides both a fairly normal C
experience, and *also* an unbreakable sandbox, at the same time.
The C experience is fairly normal, as long as you are actually playing by the C rules. You can't arbitrarily cast integers to pointers - if you plan to do that you need to use intptr_t so the compiler knows to keep the data in a capability so it can use it as a pointer later.
As a 'for instance' how does one take a capability and align it to a cache line boundary ?? Say in/after malloc() ?!?
MitchAlsup1 <mitchalsup@aol.com> wrote:
Theo Markettos wrote:
MitchAlsup1 <mitchalsup@aol.com> wrote:
BGB wrote:
<snip>
You can make it work, yes, but the question is less "can you make it
work, technically", but more:
Can you make it work in a way that provides both a fairly normal C
experience, and *also* an unbreakable sandbox, at the same time.
The C experience is fairly normal, as long as you are actually playing by the C rules. You can't arbitrarily cast integers to pointers - if you plan to do that you need to use intptr_t so the compiler knows to keep the data in a capability so it can use it as a pointer later.
As a 'for instance' how does one take a capability and align it to a cache line boundary ?? Say in/after malloc() ?!?
I'm not sure what you mean:
Capabilities are 128-bit fields stored aligned in memory. It's not allowed to store a capability that isn't 128-bit aligned. Those naturally align
with cache lines. Every 128 bits has a tag associated with it, stored together or apart (various schemes discussed in my previous posts).
The memory it points to can be arbitrarily aligned.
It is just a 64-bit pointer. You dereference it using 8/16/32/64/128 bit load and store instructions in the usual datapath (either explicitly using 'C load'/'C store' instructions or switching to a mode where every regular load/store implicitly dereferences a capability rather than integer pointer)
The bounds have certain representation limits, because they're packing
192+ bits of information into a 128 bit space. This boils down to an alignment granularity: eg if you allocate a (1MiB+1) byte buffer the bounds might be 1MiB+64 (or whatever, I can't remember what the rounding is at this size). malloc() should ensure it doesn't hand out that memory to somebody else; allocators typically do this anyway since they use slab allocators which round up the allocation to a certain number of slabs.
There is a trickiness if somebody wants to generate a capability to a subobject in the middle of a large object that isn't aligned: load in a 4.7GiB DVD wholesale into memory and try to generate a capability to a block of frames in the middle of it, which is potentially large and yet the base
is unaligned, which would cause a loss of bounds precision (somebody could access the frame before or after). It's possible to imagine things like that, but we've not seen software actually do it.
I'm not sure how any of these relate to cache lines?
Aside for ensuring the caches store capabilities atomically and preserve tags, any time you dereference them they work just like regular memory accesses.
If you mean you ask malloc for something you later want to align to a
cache line, you ask for something larger and increment the pointer to be
cache aligned, in the normal way:
#include <cheriintrin.h>
....
// 64-byte cache lines
char *ptr = malloc(size + 63);                           // leave extra space for rounding up
ptr = (char *)(((uintptr_t)ptr + 63) & ~(uintptr_t)63);  // round up to cache line
and then increment the base bound to match the new position of 'ptr' and set the top to be ptr+size:
ptr = cheri_bounds_set(ptr, size);
Theo
Theo Markettos wrote:
The bounds have certain representation limits, because they're packing 192+ bits of information into a 128 bit space. This boils down to an alignment granularity: eg if you allocate a (1MiB+1) byte buffer the bounds might be 1MiB+64 (or whatever, I can't remember what the rounding is at this
size). malloc() should ensure it doesn't hand out that memory to somebody else; allocators typically do this anyway since they use slab allocators which round up the allocation to a certain number of slabs.
So how do you "encode" a petaByte array ?? of megaByte structs in a capability ??
MitchAlsup1 <mitchalsup@aol.com> wrote:
Theo Markettos wrote:
The bounds have certain representation limits, because they're packing
192+ bits of information into a 128 bit space. This boils down to an
alignment granularity: eg if you allocate a (1MiB+1) byte buffer the bounds
might be 1MiB+64 (or whatever, I can't remember what the rounding is at this
size). malloc() should ensure it doesn't hand out that memory to somebody
else; allocators typically do this anyway since they use slab allocators
which round up the allocation to a certain number of slabs.
So how do you "encode" a petaByte array ?? of megaByte structs in a capability ??
You create a capability with petabyte-scale bounds. The precision of the
bounds may be limited, which means that you can't ram something else right
up against the end or beginning of the array if they aren't sufficiently
aligned. This is in practice not a problem: slab allocators will round up
your address before they allocate the next thing, and most OSes won't
populate the rounded up space with pages anyway.

When you take a pointer to an array element, then it has megabyte scale
bounds and they can be represented with more precision. If your struct
elements are of an arbitrary size and packed together at the byte level then
you either have to live with the bounds giving rights to slightly more than
a single struct element, or you decide that is unacceptable and pad the
struct size up to the next representable size (just like regular non-packed
structs enforce certain alignment), and pay a small memory overhead for
that (<0.25%). That's a security decision you can make one way or another.

Theo
On 11 Mar 2024 11:10:15 +0000 (GMT)
Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
<snip>
Your time stamp (most likely the +0000 part) confuses my Claws
Mail newsreader. I wonder if others see a similar problem.
Michael S <already5chosen@yahoo.com> writes:
<snip>
Your time stamp (most likely +0000 part) confuses my Claws
Mail newsreader. I wonder if others see similar problem.
xrn on linux is not confused (which is not surprising since
linux stores time internally as GMT anyway).
Date: 11 Mar 2024 11:10:15 +0000 (GMT)
<snip>
Though partly reverting the logic for the changes to the bus messaging
also did not fix the issue; the behavior is otherwise rather weird.
So the bug hunt is proving annoying.
Can capabilities be applied to address ranges?
Segmentation similar to the PowerPC 32-bit segmentation is being used in
the current project, where the upper address bits select a segment
register which provides more address bits. I would like to use the same
descriptors for capabilities and the address-range segmentation.
On 2024-03-13 11:53 a.m., MitchAlsup1 wrote:
Robert Finch wrote:
Can capabilities be applied to address ranges?
That is a major thing that they provide.
Segmentation similar to the PowerPC 32-bit segmentation is being used
in the current project, where the upper address bits select a segment
register which provides more address bits. I would like to use the
same descriptors for capabilities and the address-range segmentation.
How would you handle 2 billion Capabilities in a single application ??
Each of which has a range of 2 GB ??? and each containing at
least 1 M Capabilities ????
I should have been a bit more clear maybe; it has taken time to gel in
my head.
PowerPC-32 has only 16 segment registers. I think these could be
extended to capability registers in the same manner as proposed for
the FS, GS registers in x64. I wonder if there is any value in doing so
though, since the address is a constant; it should already be known
whether it would exceed a bound. The segment registers simply tack on
24 bits to the left side of the remaining address bits to generate a
52-bit virtual address. I think all a capability would do is provide a
slightly different means of calculating the address.
I have a couple of cores I can experiment with adding capabilities to.
For my current project there are 32 segment registers.
******
I am wondering if the ‘R’ term in the CHERI Concentrate expansion
calculation can be less than zero, or if it is a modulo value. It is
shown as B[13:11] – 1. I am assuming it can go negative and is not modulo.
How “open” is CHERI? Can CHERI-based code be posted?
I got the impression that with capabilities, processor modes may not be
necessary. I think the distinction between hypervisor / supervisor may
be lost. Not sure that is a good idea.
Hypervisors are absolutely necessary if you want high RAS where a
GuestOS may crash without taking the system down.
In the past, capability machines wanted to use capabilities for all
relocation and all protection. As long as this is the case, an applica-
tion has an unbounded need for capabilities.
It seems like it would have a lot of overhead, but it might be worth it
for security.
You can grant this with limited capabilities (top 4-odd bits) only when
you have a means to load a new capability into a known <capability> base
register[i]. Since this is privileged data, unless the functionality of
this instruction is precisely specified and operates with access to
GuestOS address space.....it is difficult to imagine how to add
HyperVision on top of GuestOS supervision.
{{Or do you intend to void Hypervisors?}}
I got the impression that with capabilities, processor modes may not be
necessary. I think the distinction between hypervisor / supervisor may
be lost. Not sure that is a good idea.
On Wed, 13 Mar 2024 23:18:37 +0000, MitchAlsup1 wrote:
Hypervisors are absolutely necessary if you want high RAS where a
GuestOS may crash without taking the system down.
Not really. Remember, the whole point about introducing memory protection
into multitasking, multiuser OSes in the first place was precisely so that
one program could crash without taking the system down.
On Wed, 13 Mar 2024 23:18:37 +0000, MitchAlsup1 wrote:
Hypervisors are absolutely necessary if you want high RAS where a
GuestOS may crash without taking the system down.
Not really. Remember, the whole point about introducing memory protection into multitasking, multiuser OSes in the first place was precisely so that one program could crash without taking the system down.
What happens to the non-HyperVised system when GuestOS goes down ??
The architectural features supporting virtualization are designed to
isolate guests from both the hypervisor and other guests.
I am wondering if the ‘R’ term in the CHERI Concentrate expansion
calculation can be less than zero, or if it is a modulo value. It is
shown as B[13:11] – 1. I am assuming it can go negative and is not modulo.
How “open” is CHERI? Can CHERI-based code be posted?
Presumably, in addition to the code, one needs some way for the code to
be able to access its own ".data" and ".bss" sections when called.
Some options:
PC-relative:
Unclear if valid in this case.
GOT:
Table of pointers to things, loaded somehow.
One example here being the ELF FDPIC ABI.
Reloading a Global Pointer via a lookup table accessed via itself.
This is what my ABI uses...
I couldn't seem to find any technical descriptions of the CHERI/Morello
ABI. I had made a guess that it might work similar to FDPIC, as this
could be implemented without needing to use raw addresses (and seemed
like a "best fit").
BGB <cr88192@gmail.com> wrote:
Presumably, in addition to the code, one needs some way for the code to
be able to access its own ".data" and ".bss" sections when called.
AIUI you derive a capability from PCC (the PC capability) that gives you access to your local 'captable', which then holds pointers to your other objects. The captable can be read-only but the capabilities inside it can
be writable (ie pointers allow you to write to your globals etc).
Some options:
PC-relative:
Unclear if valid in this case.
GOT:
Table of pointers to things, loaded somehow.
One example here being the ELF FDPIC ABI.
Reloading a Global Pointer via a lookup table accessed via itself.
This is what my ABI uses...
I couldn't seem to find any technical descriptions of the CHERI/Morello
ABI. I had made a guess that it might work similar to FDPIC, as this
could be implemented without needing to use raw addresses (and seemed
like a "best fit").
This is a description of the linkage model for CHERI MIPS; I'm not aware of anything having changed significantly for RISC-V or Morello, although exact usage of registers etc will be different.
https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/20190113-cheri-linkage.pdf
This also describes the OS-facing ABI on CheriBSD: https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201904-asplos-cheriabi.pdf
Theo
A capability effectively encodes 3 addresses:
An upper bound, lower bound, and a target address.
A segment descriptor generally only needs two:
A base address, and a size.
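The comparison can be sketched as plain structs. This is only the logical, uncompressed view: real 128-bit CHERI capabilities compress the bounds relative to the address, and the field widths here are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>

/* Logical (uncompressed) view of a capability versus a segment
   descriptor. Real 128-bit CHERI capabilities compress the bounds
   relative to the address; field widths here are illustrative. */
typedef struct {
    uint64_t base;    /* lower bound */
    uint64_t limit;   /* upper bound */
    uint64_t cursor;  /* the target address actually dereferenced */
    uint32_t perms;   /* read/write/execute/... permission bits */
    bool     tag;     /* validity tag, kept out-of-band in memory */
} capability;

typedef struct {
    uint64_t base;    /* base address */
    uint64_t size;    /* extent */
} segment_desc;

/* An access of nbytes through a capability checks the cursor against
   both bounds, and the tag. */
static bool cap_check(const capability *c, uint64_t nbytes) {
    return c->tag
        && c->cursor >= c->base
        && c->cursor + nbytes <= c->limit;
}
```

The extra address (the cursor) is what lets a capability behave like an ordinary C pointer while still carrying its own bounds, which a two-field segment descriptor cannot do.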
I am guessing bounds-check enforcement is more likely to have around a
30% overhead, maybe more or less, but this would likely apply only to
code that is potentially hostile. Then, one wants the security to be
strong enough that there is no practical way for code to break out of
the sandbox; though, if allowing for arbitrary machine code, there is
still the great potential Achilles heel that is the Global Pointer or
GOT.
The only sure way to avoid this is to not have any "potentially
compromising" capabilities anywhere reachable, and the main obvious way
to do this is via system calls.
If operating solely at the C level, it is a little easier: one needs to
make sure that there is no way for the code to get direct access to the
Global Pointer or GOT or similar. An ABI based on FDPIC would be bad
here, since it is within the reach of C code (under typical C behavior,
UB notwithstanding) to gain access to the GOT for an arbitrary function
pointer.
A big chunk of this would be overhead shared with the 128-bit ABI (which would have gone over entirely to 128-bit bounds-checked pointers), with
a few new/additional overheads.
On Thu, 14 Mar 2024 00:27:38 +0000, MitchAlsup1 wrote:
What happens to the non-HyperVised system when GuestOS goes down ??
Nothing. That’s what makes it a “guest”.
On Thu, 14 Mar 2024 00:11:55 GMT, Scott Lurndal wrote:
The architectural features supporting virtualization are designed to
isolate guests from both the hypervisor and other guests.
Providing an entire separate kernel for each VM is often unnecessary. If
you need separation at the level of entire subsystems, as opposed to individual processes, then that’s what containers are for.
On Wed, 13 Mar 2024 23:18:37 +0000, MitchAlsup1 wrote:
Hypervisors are absolutely necessary if you want high RAS where a
GuestOS may crash without taking the system down.
Not really. Remember, the whole point about introducing memory protection into multitasking, multiuser OSes in the first place was precisely so that one program could crash without taking the system down.
There is no-one to take over.......and deal with the GuestOS
crash.......
Lawrence D'Oliveiro wrote:
On Thu, 14 Mar 2024 00:11:55 GMT, Scott Lurndal wrote:
The architectural features supporting virtualization are designed to
isolate guests from both the hypervisor and other guests.
Providing an entire separate kernel for each VM is often unnecessary.
If you need separation at the level of entire subsystems, as opposed to
individual processes, then that’s what containers are for.
If you are running k Linuxes under a single HyperVisor, you should be
able to share all the Linux code after giving each of them their own VaS
for data.
On 3/14/2024 3:47 PM, Lawrence D'Oliveiro wrote:
On Thu, 14 Mar 2024 22:11:41 +0000, MitchAlsup1 wrote:
There is no-one to take over.......and deal with the GuestOS
crash.......
You can have a management process in the host that watches for these
sorts of events, easily enough.
Watchdog, tick tick... ;^)
On Thu, 14 Mar 2024 17:21:29 -0700, Chris M. Thomasson wrote:
On 3/14/2024 3:47 PM, Lawrence D'Oliveiro wrote:
On Thu, 14 Mar 2024 22:11:41 +0000, MitchAlsup1 wrote:
There is no-one to take over.......and deal with the GuestOS
crash.......
You can have a management process in the host that watches for these
sorts of events, easily enough.
Watchdog, tick tick... ;^)
Event-driven would be more efficient and more responsive than periodic
polling.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
Event-driven would be more efficient and more responsive than periodic
polling.
That assumes that an event can be generated, which may not be possible
with a guest os crash (if, for example, it was in an infinite loop with interrupts disabled).
Fontconfig's serialization code heavily relied on being able
to create pointers from arbitrary pointer arithmetic, and this
is not compatible with CHERI.
Should be - it's mostly making things play by the rules. Once they
play by the rules then it means they will work the same (or less
buggily) on a regular C platform.
The above link describes the changes - a number of them being replacing
'long' with intptr_t, fixing some undefined behaviour, and fixing bad
use of realloc().
Some of it was modernisation of old codebases (eg adding C11 atomics).
Do you want to compartmentalise that shared library, ie put in trust boundaries between the library and its caller?
If you just want to run the shared library as-is, you can recompile
it and get bounds checking etc.
I believe Morello Linux is able to support console-mode apps - ie
it supports context switching and the use of capabilities in userspace,
with some support in glibc. I believe there is now a dynamic linker,
but I am not sure of its status.
mitchalsup@aol.com (MitchAlsup1) writes:
Lawrence D'Oliveiro wrote:
On Thu, 14 Mar 2024 00:11:55 GMT, Scott Lurndal wrote:
The architectural features supporting virtualization are designed to
isolate guests from both the hypervisor and other guests.
Providing an entire separate kernel for each VM is often unnecessary. If
you need separation at the level of entire subsystems, as opposed to
individual processes, then that’s what containers are for.
If you are running k Linuxes under a single HyperVisor, you should be able
to share all the Linux code after giving each of them their own VaS for data.
Bad idea. Single point of failure. Impossible to update one without updating all. Linux does update code dynamically when loading and
unloading kernel modules.
Scott Lurndal wrote:
mitchalsup@aol.com (MitchAlsup1) writes:
If you are running k Linuxes under a single HyperVisor, you should be able
to share all the Linux code after giving each of them their own VaS for data.
Bad idea. Single point of failure. Impossible to update one without
updating all. Linux does update code dynamically when loading and
unloading kernel modules.
I actually have a 4-level system::
HyperVisor is the only layer that is not
allowed to crash (RISC-V calls this machine).
Progressing towards less privilege
is GuestHV, GuestOS, and Application. The Hypervisor provides only
memory, timing, and device identification services.