CHERI with a Linux on top [LWN.net]

By Jake Edge
September 24, 2025

The Capability
Hardware Enhanced RISC Instructions (CHERI) project is a rethinking of
computer architecture in order to improve system security. Carl Shaw gave
a presentation at Linux Security Summit Europe (LSS EU) about CHERI and the efforts to get
Linux running on it. He introduced capabilities,
which are a mechanism for access control, and outlined their
history, which goes back many decades at this point, then looked more
specifically at the CHERI project and what it will take to apply the
security constraints of capabilities to an operating system like Linux.

Capabilities

At its core, CHERI
is about extending instruction-set architectures (ISAs) to add support for
capabilities. A 1966 paper, “Programming
Semantics for Multiprogrammed Computations”, introduced the idea of capabilities, along with
many of the ideas that would later underlie Unix. The paper had a strong
focus on security and ensuring that computations did not interfere with
each other; it generalized some ideas from
earlier computers like Atlas, Rice
Computer, and various Burroughs
machines into what the authors called “capabilities”. “Processes need to
own capabilities to be able to do something on a system.”

A capability is a reference and a set of rights; “a capability is an
access-control object”. It was originally applied to memory, but the paper
expanded the idea to cover I/O and other system resources. For memory, which Shaw
was focusing on for the talk, the reference is to a region of memory and
the rights are permissions to read, write, and execute it.
More formally, “a capability is an unforgeable, transferable token that
authorizes the use of an object”, he said.

An object capability of that sort incorporates both a reference to the
object and access rights for that object.
The paper used a list of capabilities that a process had
access to, which was called the “C-list”. Each entry was a capability,
with a reference to a memory segment and the permissions for it. So access to memory required an
indirection through the C-list table, which turned out not to perform well.
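The cost of that indirection can be pictured with a small model. The sketch below is illustrative only: `Segment`, `Process`, and `load` are hypothetical names that do not come from the 1966 paper or the talk, and real C-list machines did this in hardware and microcode rather than in anything like Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int          # first address of the memory segment
    length: int        # segment size in bytes
    perms: frozenset   # subset of {"r", "w", "x"}


class Process:
    def __init__(self, c_list, memory):
        self.c_list = c_list   # per-process table of capabilities
        self.memory = memory   # bytearray standing in for RAM

    def load(self, index, offset):
        entry = self.c_list[index]   # extra table lookup on every access
        if "r" not in entry.perms:
            raise PermissionError("segment is not readable")
        if not 0 <= offset < entry.length:
            raise IndexError("offset outside segment")
        return self.memory[entry.base + offset]


memory = bytearray(range(64))
proc = Process([Segment(base=16, length=8, perms=frozenset("r"))], memory)
value = proc.load(0, 3)   # reads memory[16 + 3]
```

Every access names a C-list slot rather than an address, so the table lookup sits on the path of each load and store.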

He mentioned a few of the early hardware implementations of capabilities,
starting in 1970, though he said there were some slightly
earlier machines in the US. The CAP computer was from
Cambridge University; the “first ever commercial
capability-based system” was the Plessey System
250, which was not a general-purpose computer and was originally used
by the military for message routing. It did have many of the attributes of
modern computers, such as virtual memory and symmetric multiprocessing;
“it was a pretty far ahead machine for its day”.

A less-successful capability-based CPU is the Intel iAPX 432, a project that began in
1975 and shipped in 1981; it ended up only being used in niche applications. Its
performance was poor, mainly due to the indirection required to access
memory. More recently, the Arm Morello CPU in
2022 was the result of a research project between the company and the UK
government; it added CHERI on top of an Arm
Neoverse processor. It was developed on a short time scale of about a
year, so compromises inevitably had to be made, Shaw said, but “they did
a really good job on it”; it is still used for research, but newer
CHERI implementations have narrowed their focus to a smaller, more commercially
viable subset of capabilities than the Morello has.

There were a number of operating systems developed using capabilities, “some you’ve probably never
heard of”, including KeyKOS, EROS, and CapROS, which were mostly “focused around high levels of reliability”.
In modern times, seL4 uses capabilities
and, this year,
it is joined by CHERI-seL4.

But, his talk was aimed at Linux, he said. Linux already has some vestiges
of capabilities, including things like socket and file descriptors, which can
be passed around to other processes to bestow rights. Kernel
capabilities are not true capabilities in his mind, but
page-table entries are a form of capabilities: they have a reference to a memory region and
associated permissions.

CHERI

“CHERI is a new implementation of capabilities”; it is a security
technology that is designed to be scalable, so that it can be used in
everything from microcontrollers to server-class hardware. It is
deterministic; CHERI does not rely on any hashing or secrets. “It’s
very much a hardware/software co-design technology, as well,” Shaw said.

Capability-based
addressing is used by CHERI, which is a variant without C-lists, so
it does not suffer the performance penalties for indirection.
CHERI extends existing ISAs. It started by extending MIPS, then
Morello extended Arm, and now most of the work being done is for RISC-V;
there is also an initial sketch of how the x86 ISA could accommodate CHERI,
he said. CHERI takes a hybrid capability approach, so that it can work with
existing systems as they are; it accommodates memory-management units
(MMUs), hypervisors, and existing programming languages.

The CHERI instructions do not use integer address pointers; they use
capabilities for addresses instead. Existing code will still run on a
CHERI system, using addresses the way it currently does, but it will not
get the benefit of the CHERI protections.

The project was started 15 years
ago by Cambridge University and SRI International, with funding from DARPA. The CHERI Alliance is the focal point of
current research, which is being funded by both governments and companies.

The goals of CHERI are to provide memory safety for languages like C
and C++, though it will also benefit others, including Rust, while also
offering “fine-grained, efficient compartmentalization”. There is
already coarse-grained compartmentalization in today’s systems, including protection (or
privilege) rings and MMUs protecting processes from each other, but
CHERI is “designed to be very very fine-grained, down to the byte
level, if you wanted to go there”.

The intent was for existing code to run unchanged, “but it never works
like that”. For most well-written C and C++ application code, a
recompile is largely all that is needed to work on a CHERI system. For
example, KDE was ported to CHERI on Morello and only required changes to
0.02% of the code to get it working. For things like language runtimes,
JIT compilers, memory allocators, and code that does lots of pointer
manipulation, such as kernels, it gets more complicated. Beyond Linux, FreeRTOS, Zephyr, and (as mentioned) seL4
have all been ported to run on CHERI hardware; other operating systems are
in progress as well.

The instructions that CHERI adds to the ISA are for creating and modifying
capabilities; the modifications are operations that are normally expected
for pointers, such as incrementing and other arithmetic operations. The
hardware itself needs to be changed to support capabilities; registers need
to be extended to hold them, for example.

There is both a pure-capability (purecap) mode, where only capabilities can
be used for memory access, and an integer-pointer (integer) mode, which
uses regular pointers. There is actually a way to have both in a single
program, with pointers annotated based on which type they are, but it is
not recommended, Shaw said. On the CPU, there is a mode switch that is
made between the two modes, which is particularly important on RISC-V to
save space in the ISA encoding; for example, load and store
instruction encodings are shared between the modes.

On a CHERI system, all loads and stores are checked against a capability,
even when running in integer mode. There is a program-counter capability
(PCC) for both modes, and a default data capability (DDC) for integer mode,
which allows the accesses from programs in integer mode to be constrained;
“we can set where it can execute and we can set what it can see in
memory”.
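A toy model may help show how integer-mode code is still fenced in. Everything in this sketch is an assumption made for illustration: the `Cap` and `IntegerModeCore` classes, the byte-sized accesses, and the fault-as-exception behavior are not taken from the talk or from any CHERI specification. The default data capability is the one the CHERI architecture documents abbreviate as DDC.

```python
from dataclasses import dataclass


class CapabilityFault(Exception):
    """Raised when an access falls outside the governing capability."""


@dataclass(frozen=True)
class Cap:
    base: int          # lowest permitted address
    top: int           # one past the highest permitted address
    perms: frozenset   # subset of {"r", "w", "x"}

    def check(self, addr, size, perm):
        if perm not in self.perms or addr < self.base or addr + size > self.top:
            raise CapabilityFault(f"{perm!r} access at {addr:#x} denied")


class IntegerModeCore:
    """Toy core: code uses plain integer addresses, checks are implicit."""

    def __init__(self, pcc, ddc, memory):
        self.pcc, self.ddc, self.memory = pcc, ddc, memory

    def fetch(self, pc):
        self.pcc.check(pc, 1, "x")     # instruction fetch is checked against the PCC
        return self.memory[pc]

    def load(self, addr):
        self.ddc.check(addr, 1, "r")   # data access is checked against the DDC
        return self.memory[addr]


core = IntegerModeCore(
    pcc=Cap(0x00, 0x40, frozenset("rx")),   # where the program may execute
    ddc=Cap(0x40, 0x80, frozenset("rw")),   # what the program may see
    memory=bytearray(256),
)
```

In this model, `core.load(0x40)` succeeds while `core.load(0x80)` or `core.fetch(0x40)` raises `CapabilityFault`, even though the program itself never handles a capability.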

In terms of implementation, a capability is an address that has been
extended with metadata, but it is important to think of them as a single
unit. There is a bounds field, which holds upper and lower bounds for
memory addresses, and a permissions field that has the usual read, write,
and execute permissions as well as some others, including whether you can
store the capability to memory or not.
On the CHERI RISC-V, capabilities are 128 bits in size; all of the
registers and caches need to accommodate pointers of that size. There is
also an out-of-band single bit tag that is used to indicate whether the
contents of memory or a register contain a valid capability or not;
software generally does not need to interact with the tag directly, he said.
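The role of the tag can be sketched as follows. This is a simplified model, not the hardware design: `TaggedMemory`, its 16-byte slot size, and the method names are hypothetical, and the rule that an ordinary data write clears the tag is background on CHERI that the talk summary above does not spell out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Capability:
    address: int       # the pointer value itself
    base: int          # lower bound
    top: int           # upper bound (exclusive)
    perms: frozenset   # e.g. {"r", "w", "x"}
    tag: bool = True   # validity bit, kept out of band


class TaggedMemory:
    SLOT = 16  # bytes per 128-bit capability slot

    def __init__(self, size):
        self.slots = [None] * (size // self.SLOT)
        self.tags = [False] * (size // self.SLOT)

    def store_cap(self, addr, cap):
        if addr % self.SLOT:
            raise ValueError("capability stores must be 16-byte aligned")
        self.slots[addr // self.SLOT] = cap
        self.tags[addr // self.SLOT] = cap.tag

    def store_data(self, addr, nbytes):
        # An ordinary data write invalidates every slot it touches.
        first, last = addr // self.SLOT, (addr + nbytes - 1) // self.SLOT
        for i in range(first, last + 1):
            self.tags[i] = False

    def load_cap(self, addr):
        i = addr // self.SLOT
        if self.slots[i] is None:
            raise ValueError("no capability was ever stored here")
        return replace(self.slots[i], tag=self.tags[i])


mem = TaggedMemory(64)
mem.store_cap(16, Capability(0x1000, 0x1000, 0x1010, frozenset("rw")))
before = mem.load_cap(16).tag   # still a valid capability
mem.store_data(20, 4)           # overwrite part of it with plain bytes
after = mem.load_cap(16).tag    # no longer usable as a capability
```

The point of the model is that software cannot manufacture a capability by writing bytes; the tag only survives capability-aware stores.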

Originally, capabilities were 256 bits so that they could include full
64-bit upper and lower bounds. “One of the innovations of CHERI is to use a
compressed format for bounds”; the CHERI RISC-V uses a
mantissa-exponent system, which reduces the resolution but that is not much
of a problem on a virtual-memory system, he said.
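The loss of resolution can be illustrated with a deliberately simplified mantissa-exponent scheme. This is not the actual CHERI encoding: the 14-bit mantissa width and the `representable_bounds` helper are assumptions chosen only to show the effect, which is that small objects get exact bounds while large ones have their bounds rounded outward to a power-of-two alignment.

```python
MANTISSA_BITS = 14  # hypothetical width, for illustration only


def representable_bounds(base, length):
    """Return (base, top, exponent) after rounding to a representable region.

    The region length, measured in units of 2**exponent, must fit in the
    mantissa; base is rounded down and top rounded up to that alignment.
    """
    top = base + length
    exponent = 0
    while True:
        align = 1 << exponent
        rounded_base = base & ~(align - 1)
        rounded_top = (top + align - 1) & ~(align - 1)
        if (rounded_top - rounded_base) >> exponent < (1 << MANTISSA_BITS):
            return rounded_base, rounded_top, exponent
        exponent += 1


small = representable_bounds(0x1003, 100)       # exact, exponent 0
large = representable_bounds(0x1003, 1 << 20)   # bounds padded outward
```

With page-aligned mappings, as on a virtual-memory system, large regions are already aligned well enough that this padding rarely matters, which matches Shaw's remark.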

There are some rules for using capabilities in CHERI, starting with the
provenance rule: a capability can only be created using another valid
capability. The monotonicity rule says that a new capability can only have
the same or lesser rights than the capability it is created from. The
reachable capability monotonicity rule disallows increasing the reachable
capabilities for a given chunk of code
without yielding execution to
another domain. The code only has access to a limited set of
capabilities, but if it takes an interrupt, that will run in a different
domain, which could perhaps increase the capabilities available to it.
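The provenance and monotonicity rules can be expressed as checks on a single derivation function. The sketch below is a model under stated assumptions: `derive` and `CapabilityFault` are hypothetical names, and a violation is modelled as an exception, whereas real hardware may instead produce an untagged result.

```python
from dataclasses import dataclass


class CapabilityFault(Exception):
    pass


@dataclass(frozen=True)
class Capability:
    base: int
    top: int           # exclusive upper bound
    perms: frozenset   # subset of {"r", "w", "x"}
    tag: bool = True   # False means "not a valid capability"


def derive(parent, base, top, perms):
    # Provenance: a capability can only come from another valid capability.
    if not parent.tag:
        raise CapabilityFault("provenance: parent is not a valid capability")
    # Monotonicity: bounds may shrink but never grow.
    if base < parent.base or top > parent.top or base > top:
        raise CapabilityFault("monotonicity: bounds exceed the parent")
    # Monotonicity: permissions may be dropped but never added.
    if not set(perms) <= parent.perms:
        raise CapabilityFault("monotonicity: permissions exceed the parent")
    return Capability(base, top, frozenset(perms))


root = Capability(0, 1 << 32, frozenset("rwx"))
read_only = derive(root, 0x1000, 0x2000, "r")
```

Attempting `derive(read_only, 0x1000, 0x2000, "rw")` fails in this model, because write permission cannot be regained once it has been dropped.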

When the system boots, it has access to the “infinite cap” (or “root cap”),
which is all of the permissions for all of memory; it is generally stored
in the PCC. As an example, the system could then create two compartments
by creating sub-capabilities that were more restricted; each could have
non-overlapping bounds, and one region could perhaps be for code, so it
only has read and execute permissions. Then, inside the other region, a
read-only array capability could be created; anything holding that
capability can read the array, but nothing else in the enclosing region.
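That boot-time example can be walked through in a few lines. The addresses, the 64KB "all of memory", and the `Cap.restrict` method are made up for this sketch; it models only the narrowing of bounds and permissions, not how a real system hands capabilities to compartments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    base: int
    top: int           # exclusive upper bound
    perms: frozenset

    def restrict(self, base, top, perms):
        perms = frozenset(perms)
        if base < self.base or top > self.top or not perms <= self.perms:
            raise ValueError("a derived capability cannot exceed its parent")
        return Cap(base, top, perms)

    def can_read(self, addr):
        return "r" in self.perms and self.base <= addr < self.top


root = Cap(0x0000, 0x10000, frozenset("rwx"))   # everything, all permissions
code = root.restrict(0x0000, 0x4000, "rx")      # compartment for code
data = root.restrict(0x4000, 0x8000, "rw")      # non-overlapping data compartment
array = data.restrict(0x5000, 0x5100, "r")      # read-only array inside data
```

A holder of `array` can read `0x5000` through `0x50ff` and nothing else: `array.can_read(0x4000)` is false even though that address is inside the enclosing `data` region.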

Most of the “heavy lifting” for setting up the capabilities is done
by the compiler, Shaw said. For example, a static C array will have a
capability created for it by the compiler, which is how CHERI can provide
memory safety to C code. The program cannot successfully read or write
outside of the array because the capability it must use to access the array will not allow it to do so. Stacks
can be made non-executable by removing that permission from the capability
for the stack frame, for example.
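The difference this makes for a classic off-by-one overflow can be shown with a small simulation. It is only an analogy: `BoundedBuffer` is a hypothetical stand-in for the per-array capability the compiler would create, and the memory layout is invented for the example.

```python
class CapabilityFault(Exception):
    pass


class BoundedBuffer:
    """View of memory limited to [base, base + length), like an array capability."""

    def __init__(self, memory, base, length):
        self._memory, self._base, self._length = memory, base, length

    def __setitem__(self, index, value):
        if not 0 <= index < self._length:
            raise CapabilityFault(f"store at index {index} is out of bounds")
        self._memory[self._base + index] = value


memory = bytearray(16)
BUF, SECRET = 0, 8   # an 8-byte buffer at 0..7, then an adjacent byte at 8
memory[SECRET] = 0x42

# Flat integer addressing: a store to "buf[8]" silently hits the neighbour.
memory[BUF + 8] = 0xFF
flat_secret = memory[SECRET]

# Capability-style access: the same store is refused before it lands.
memory[SECRET] = 0x42
buf = BoundedBuffer(memory, BUF, 8)
try:
    buf[8] = 0xFF
    faulted = False
except CapabilityFault:
    faulted = True
```

In the flat case the neighbouring byte is corrupted without any error; in the bounded case the store never happens and the program gets a fault it can be debugged from.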

Linux

CHERI provides run-time memory safety that is hardware-enforced, which is
critical for C and C++ programs. The Linux kernel is mostly implemented in
C so getting memory safety for it requires a tool like CHERI. In addition,
CHERI allows implementing least-privilege
compartmentalization. There have been supply-chain attacks against
libraries, which CHERI could protect against by putting “a
library into a sandbox, mostly automatically, which can constrain its
access and its entry and exit points”. Within the kernel, a similar
approach can be taken by placing subsystems and drivers inside
sandboxes. An analysis
of kernel bugs in 2022 showed that 87% of the high-severity kernel CVEs
could be mitigated with either memory safety or compartmentalization;
“we see that as a pretty important thing to try and achieve”.

About two weeks before his talk in late August, CHERI Linux developers got
the 6.16 kernel running in purecap mode; “so this means that every
pointer in the kernel is now a capability”. Originally, Huawei did a
proof of concept of Linux running on CHERI, then the Morello project ported
Linux to that hardware; the Morello version used the hybrid mode, where
most of the pointers were still integers, though the system-call level used
capabilities.

His employer, Codasip, has a team that is working on Linux for CHERI on
RISC-V; it started with the hybrid Morello kernel, but then did a clean
implementation in purecap mode. “We do not claim it’s perfect, what
we’re aiming for at the moment is functionality; we want to get the basics
running, then we’re gonna go on to the more advanced security
concepts.” Some of those advanced techniques have already been proved
on FreeBSD in CheriBSD, he said,
but not on Linux yet.

Testing of the kernel has been done using the Linux Test
Project (LTP); not all of its tests pass yet, but “it’s looking
pretty reasonable”. On the user-space side, there is a “relatively
simple” purecap version; it does not yet have the GNU C library (glibc)
but is using musl libc. His team is focused on the kernel, core libraries,
and utilities, at this point, he said.

He went through a list of various kernel features, briefly reporting on
their status; many things are working already, including networking, BPF,
USB, and PCIe. There is a “rather dated X11 system working”. The
team has also started some optimization work, especially with regard to
copying memory to and from user space. In addition, the CHERI architecture
allows doing 128-bit loads and stores, which can accelerate functions like memcpy().

There is other development work going on as well, such as on the LLVM
compiler for RISC-V CHERI and on QEMU for running and testing the system.
The CHERI Alliance GitHub
repository is where all of the work is being done.

The ABI being used is the Pure Capability user-space ABI (PCuABI) defined
by the Morello project three years ago. It uses capabilities at the
system-call level, which constrains what each side of the ABI can do.
Copying to and from user-space memory is constrained by the bounds and
permissions of the capabilities, while returned capabilities, such as from
mmap(),
restrict user space.
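A rough model of those two directions of constraint follows. The functions `toy_mmap` and `toy_copy_from_user` are hypothetical Python stand-ins written for this sketch; they are not the kernel's real interfaces and do not reflect the PCuABI specification in any detail.

```python
from dataclasses import dataclass


class CapabilityFault(Exception):
    pass


@dataclass(frozen=True)
class UserCap:
    base: int
    top: int           # exclusive upper bound
    perms: frozenset


USER_MEMORY = bytearray(4096)   # stand-in for a process's address space


def toy_mmap(addr, length, perms):
    """Kernel side: hand back a capability covering exactly the new mapping."""
    return UserCap(addr, addr + length, frozenset(perms))


def toy_copy_from_user(cap, addr, n):
    """Kernel side: read user memory only as far as the passed capability allows."""
    if "r" not in cap.perms or addr < cap.base or addr + n > cap.top:
        raise CapabilityFault("user pointer does not authorize this read")
    return bytes(USER_MEMORY[addr:addr + n])


buf = toy_mmap(1024, 256, "rw")
data = toy_copy_from_user(buf, 1024, 256)   # within bounds: allowed
```

User space cannot reach past the mapping because the returned capability ends at `buf.top`, and the kernel cannot be tricked into reading past it either, since a request such as `toy_copy_from_user(buf, 1024, 257)` faults.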

There are a number of challenges for purecap CHERI in the kernel, starting
with the use of unsigned long for pointers. That type is used for
pointers all over the kernel, but the CHERI compiler needs them to be a
uintptr_t so that it can use capabilities instead. There are also
alignment and size problems that come from the larger size of capabilities;
structures in the kernel sometimes assume pointers have a specific size.
The goal is to minimize the changes that need to be made and to make them
with an eye on what can go upstream eventually.
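The size and alignment problem is easy to see with a worked layout calculation. The structure below is invented for illustration (it is not a real kernel structure), and the `layout` helper simply assumes natural alignment, where each field is aligned to its own size. A 64-bit `unsigned long` cannot hold a 128-bit capability at all, which is why such fields need a capability-sized type instead.

```python
def layout(fields, sizes):
    """Return ({field: offset}, total_size) assuming natural alignment."""
    offset, max_align, offsets = 0, 1, {}
    for name, kind in fields:
        size = sizes[kind]                    # alignment equals size here
        offset = -(-offset // size) * size    # round up to the alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, size)
    total = -(-offset // max_align) * max_align   # tail padding
    return offsets, total


# Hypothetical structure: pointer, 32-bit flags word, pointer.
FIELDS = [("next", "ptr"), ("flags", "u32"), ("data", "ptr")]

lp64_offsets, lp64_size = layout(FIELDS, {"ptr": 8, "u32": 4})
purecap_offsets, purecap_size = layout(FIELDS, {"ptr": 16, "u32": 4})

print(lp64_offsets, lp64_size)
print(purecap_offsets, purecap_size)
```

Under these assumptions the structure doubles in size and every field after the first pointer moves, so any code that hard-codes offsets or assumes an 8-byte pointer breaks.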

The next piece that his team plans to look at is loading kernel modules
into compartments. It is a tricky problem, since kernel modules “have
to have quite a lot of access within the main kernel”. Another “big
ticket item” that needs to be tackled is support for BPF in user space.
The BPF compiler has no conception of capabilities, which
needs to be addressed; there is also the question of backward compatibility
for existing BPF binaries. The work done in the CheriBSD project is useful
as a reference, he said.

An area where CHERI could help is with Linux on MMU-less systems. Those
systems lack the process isolation that is provided by the MMU, but CHERI can
provide hardware-enforced isolation. An MMU also provides translation
between virtual and physical addresses, which is not something CHERI
can do, but there is some interesting work in academia that might help.
“CHERI is sort of refreshing some ideas and getting people to look back
at these sorts of issues”, he said.

A related idea is to use CHERI for a single-address-space Linux targeting
workloads with many processes sharing the same data. CHERI would be used
for isolation, and the MMU for translation, but the shared data would be
accessible without changing translation-lookaside buffers (TLBs), so it
would reduce TLB thrashing.

Codasip has designed a CHERI CPU, the X730, “from the ground up”; part of what
the company does is to create configurable cores, where features like CHERI
can be turned on or off when the CPU is built. That makes it easier to
compare performance between the two; performance of CHERI CPUs is a question
that the project frequently gets asked. The X730
requires less than 5% more silicon area for CHERI, compared to the
A730 non-CHERI version; both can run at the same
maximum frequency. The X730 adds 3.8% overhead for the CHERI
instructions and overall has less than a 5% performance overhead for CHERI code.
The team is still working on optimizations and thinks it can reduce that
overhead further.

He wrapped up by returning to the paper from 1966, whose authors stressed
that “multiprogrammed computer systems” would need to evolve over time to
meet changing requirements. That is what the CHERI project is trying to
do, Shaw said, by evolving both hardware and software to try to improve the
security of computer systems.

Q&A

The first question was in relation to DARPA, which regularly has
initiatives toward memory safety; the most recent is the Translating
all C to Rust (TRACTOR) program, which is looking to automate
that transition. If it is successful, “what role do you see CHERI
playing in an environment where a majority, even a vast majority, of all C
code has been replaced with Rust?” Shaw said that he wonders how
successful TRACTOR will be, given that AI techniques may fall short of
being able to reliably translate C for all of the different programs needed.
Meanwhile, though, he does not see CHERI and Rust as being in conflict at
all; the two can work together and it is something the project is putting
effort into. “There will be a CHERI Rust compiler.”

While memory safety is definitely important, the compartmentalization
afforded by CHERI is more interesting to him. “Being able to get least
privilege in software is a real big step forward, I think.” None of the
current languages attack that problem, he said, so it would take “a
further evolution of language in order to support this whole concept
nicely”.

Another attendee warned Shaw about what Arnd Bergmann said in a talk earlier in the week: the
existence of MMU-less Linux is slated to end in 2028 or so. He suggested
that Shaw talk to Bergmann about those plans. Shaw said there is a niche
for MMU-less CPUs, especially for network gear, such as routers, that is
driven by trying to keep the costs as low as possible; ideally, the
manufacturers want Linux, but will presumably choose something else if they
have to.

The attendee asked about the memory overhead for CHERI, which Shaw said he
did not have any real numbers on, since the team has just started gathering
that kind of information. The tags add some overhead, but that is
typically less than 1% of the size of memory. Pointer-heavy workloads will
obviously have a larger increase in memory than computation-oriented
workloads, he said.
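The "less than 1%" figure for tags follows from simple arithmetic, assuming one tag bit is kept for every 128-bit, capability-sized slot of memory (the talk gives the capability size and the single-bit tag; the per-slot granularity is an assumption here).

```python
CAPABILITY_BITS = 128   # CHERI RISC-V capability size, per the talk
TAG_BITS = 1            # one out-of-band tag bit per capability-sized slot
INTEGER_POINTER_BITS = 64

tag_overhead = TAG_BITS / CAPABILITY_BITS                 # extra storage fraction
pointer_growth = CAPABILITY_BITS / INTEGER_POINTER_BITS   # per-pointer growth

print(f"tag overhead: {tag_overhead:.2%}")
print(f"pointer size growth: {pointer_growth:.0f}x")
```

The tag cost is fixed at roughly 0.78% of memory in this model, while the doubling of each pointer is what makes pointer-heavy workloads grow more than computation-oriented ones.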

The compiler being used is LLVM, he said in response to another question;
the work there is starting to go upstream and is the first of the CHERI
Linux work to do so. The CHERI Linux project has to adapt its strategy for
getting code upstream, depending on the target project; LLVM is a large
project so the efforts so far have been to get the infrastructure needed
for CHERI upstream. Some of that work will show up in LLVM 22, he thinks.

LSS EU organizer Elena Reshetova noted that she agreed that the
compartmentalization aspects of CHERI were the more interesting and
wondered what progress had been made on that. Shaw said that they are just
getting started on compartmentalizing for Linux; the first steps will be
for user-space libraries, which are already pretty well understood from the
CheriBSD work. Kernel-module compartmentalization is the other thing being
pursued, as he mentioned earlier.

He agreed with Reshetova that
compartmentalizing the kernel would be “very challenging”. She
followed up by asking if it even made sense to pursue for Linux;
“given that it never has been considered in the design, this is a pretty
fundamental change”. Shaw thought that it might be possible, at least
based on what the hardware-assisted
kernel compartments (HAKC) project has been doing. “We think it’s
at least to some extent achievable.”

James Morris asked about the relationship between the Morello and CHERI
Linux projects; “are you joining forces to do the upstreaming?” Shaw
said that “the CHERI community is pretty tight-knit”. His team
works closely with teams for Morello, CHERIoT, and others, including lots of
collaboration with Cambridge University people. The project mostly has
participants in the US and UK, but that is changing and there is more
commercial interest in the project everywhere, he said.

Another question was about how people could emulate CHERI hardware to try
things out. Shaw said that there was currently a 6.10 kernel in the CHERI
Alliance repository, but that the 6.16 kernel should be pushed there soon.
It will come with all of the scripts needed for building everything with
Yocto, including development tools, like the toolchain, SDK, QEMU for
CHERI, and so on. That will be a good starting point for those wanting to
check out CHERI Linux.

The slides and a YouTube video of the talk are available for interested readers.

[I would like to thank the Linux Foundation, LWN’s travel sponsor, for
supporting my trip to Amsterdam for Linux Security Summit Europe.]
