Skip to main content

MEP 41. Memory Safety: capability handles, generational references, and CISA-roadmap alignment

FieldValue
MEP41
TitleMemory Safety
AuthorMochi core
StatusDraft
TypeStandards Track
Created2026-05-18
DependsMEP-15 (Effects), MEP-16 (Null Safety), MEP-40 (vm3)

Abstract

Mochi already has most of a memory-safe runtime by accident of MEP-40's vm3 design. The 8-byte handle Cell is mathematically a Vale generational reference (verdagon.dev, 2021/2023), the 12-arena partition is structurally a kalloc_type / xzone typed-allocator (Apple MIE, Sept 2025), and the four-bit arena tag plus 12-bit generation plus 32-bit slab index encoding is an MSWasm handle ABI (POPL 2023; Iris-MSWasm OOPSLA 2024) in everything but name.

What is missing is the spec that names this property, the verifier rules that enforce it as a language-level invariant, and the surface-language vocabulary (consume / borrow / inout / weak) that lets Mochi authors opt into stricter modes when they want to. This MEP supplies all three, plus a hardening checklist for the vm3jit code page (W^X, PAC, BTI, Intel CET, Spectre index masking) so the JIT does not become the memory-safety escape hatch the rest of the runtime closed.

The framing target is the CISA Secure-by-Design Pledge memory-safety roadmap obligation (January 1 2026 deadline, ~300 signers as of May 2026). Mochi is not a pledge signer (Mochi is a language, not a vendor), but downstream organizations that adopt Mochi are signers. MEP-41 produces a one-page public statement those downstream organizations can attach to their roadmaps. The statement claims parity with the standard CISA-named memory-safe set (Rust, Go, C#, Java, Swift, Python, JavaScript) by exactly the same provenance chain those languages use: language-level safety rests on runtime correctness, runtime is implemented in a memory-safe host language (Go), host runtime correctness is taken on the same footing as the JVM for Java or CPython for Python.

This MEP is a Standards Track design document. The phased plan (Phase 0 spec freeze through Phase 7 public statement) ships incrementally alongside MEP-40 vm3 work. No new opcodes are required; the verifier changes are pure rule additions. The reference-mode surface syntax is the only user-visible change, and it is opt-in.

Motivation

Why now

Three things happened between 2024 and 2026 that make this MEP unavoidable.

CISA Secure-by-Design Pledge, January 1 2026 deadline. The pledge is voluntary, but the language is unambiguous: failing to publish a memory-safety roadmap "significantly elevates risk to national security." As of May 2026 the signer count is ~300, including Microsoft, Google, AWS, IBM, Cloudflare, GitHub, Fortinet, BeyondTrust, HPE, BitGo (CISA, May 2024 announcement plus subsequent updates). DoD SWFT (2025) rewards Secure-by-Design alignment with faster authorization for products on DoD networks. EU CRA reporting obligations begin September 2026. Any organization that wants to adopt Mochi in a regulated context needs to point at a Mochi memory-safety statement, the same way they point at a Rust or Go statement today.

Apple Memory Integrity Enforcement (MIE), September 9 2025 (iPhone 17 / A19). MIE = Enhanced MTE plus typed allocators (kalloc_type, xzone) plus Tag Confidentiality Enforcement (TCE). It is the largest single deployment of a hardware-assisted memory-safety system in history, runs always-on in synchronous mode kernel-wide and across 70+ userland processes, and Apple credits it with breaking essentially every recent mercenary-spyware exploit chain. The vm3 handle is structurally analogous to MIE's pointer tag, and the vm3 arena is structurally analogous to a typed allocator. The TCE lesson (never let user code observe a raw tag value) is directly applicable to vm3 generation numbers and must be enforced by the verifier.

Android memory-safety crossover, 2024. New Android memory-safety vulnerabilities dropped below 20% of all CVEs in 2024, driven by the shift to Rust + Kotlin for new code (Google Security Blog, September 2024). This is the empirical proof that the "stop writing new unsafe code" thesis works at scale, and it raises the bar for what a new language project should claim about memory safety at version 1.0.

What Mochi already pays for

vm3 (MEP-40) shipped with a runtime that is, line-for-line:

  • A generational reference machine in the Vale sense. The 12-bit gen field in every arena slot is Vale's currentGeneration; the 12-bit gen packed into every handle Cell at genShift=32 is Vale's rememberedGen; the per-deref slab lookup in accessors.go is Vale's assert(curr == remembered); the gen bump on slot reuse in alloc.go is Vale's free-time increment. Vale's published overhead on the BenchmarkRL terrain generator is 2 to 10.84% (verdagon.dev/blog/generational-references). vm3 inherits the same overhead curve.
  • A software capability machine in the MSWasm sense. The handle Cell is an unforgeable reference value (tag, gen, idx) with no public constructor; the verifier already refuses any instruction that synthesizes a Cell from raw bytes; access goes through runtime/vm3/accessors.go which does the equivalent of MSWasm's segment-memory bounds check.
  • A typed allocator in the kalloc_type / xzone sense. Twelve distinct arena types (vmString, vmList, vmMap, vmSet, vmStruct, vmClosure, vmBignum, vmBytes, vmPair, vmF64Arr, vmI64Arr, vmU8Arr) live in separate Go slabs. Type confusion across arenas is structurally impossible because the 4-bit arena tag in the handle decides which slab to index, and every accessor dispatches on tag before touching memory.

None of this is documented as a memory-safety property anywhere in the Mochi tree as of May 2026. MEP-40 mentions Cell layout as a performance property. MEP-41 names it as a safety property.

What we are NOT claiming

Mochi is not Rust. There is no aliasing-XOR-mutation discipline at compile time. There is no data-race-freedom proof. There is no formal Iris mechanization (yet; see §10 open questions). Concurrency is single-VM (MEP-40 §3). The runtime is implemented in Go, so the safety chain bottoms out at the Go runtime correctness exactly as Python's bottoms out at CPython.

We are claiming exactly the property Vale claims: no use-after-free, ever, on any reference, with measured ~2-10% overhead. Plus, by virtue of the 12-arena partition: no type confusion across container kinds. Plus, by virtue of MEP-16 (Null Safety): no null dereference, ever. These three together is the CISA-aligned memory-safety statement.

Scope

In scope:

  • A bytecode verifier rule set that makes handle integrity an invariant. Every runtime/vm3 opcode that produces a Cell either copies an existing Cell or invokes a constructor in runtime/vm3/alloc.go. Hand-rolled MakeHandle(tag, gen, idx) is reserved to package-internal runtime/vm3 code and is unreachable from any bytecode path.
  • A surface-language reference-mode vocabulary: consume, borrow, inout, weak. Each is purely a verifier annotation; no bytecode changes. Each maps to a specific elision opportunity (borrow elides generation check inside the borrowed scope; consume converts a heap Cell into single-owner with deterministic free) or a strictness opportunity (weak refuses gen-check elision even when escape analysis would allow it).
  • Generation-as-secret hygiene: a verifier rule that refuses any opcode whose semantics would let user bytecode observe a raw generation value or compute differences between generation values across allocations. This is Mochi's analog of Apple TCE.
  • FFI sealing: a seal(handle, key) / unseal(handle, key) pair gated on MEP-15's meta effect that swaps the arena tag to a reserved "sealed" variant during a Go-function call. Sealed handles cannot be dereferenced; only the matching unseal opcode can reopen them. This is CHERIoT's sealed-capability mechanism at the VM layer.
  • vm3jit hardening: W^X discipline on the JIT code page (MAP_JIT + pthread_jit_write_protect_np on darwin/arm64, dual-mapping or W^X switching on linux/amd64), PAC + BTI annotations on every JIT-generated function entry on darwin/arm64, Intel CET shadow-stack and IBT on amd64 where available, Spectre v1 index masking on every array load, retpoline-or-equivalent on indirect branches in untrusted-input code paths.
  • A public Mochi memory-safety statement (docs/security/memory-safety.md) suitable for downstream organizations to attach to CISA roadmaps. Phrasing target: "Mochi is designed to enable signatories of the CISA Secure-by-Design Pledge to use it as part of their memory-safety roadmap, equivalent to selecting any other named memory-safe language."
  • A threat model document that enumerates the boundaries: trusted Mochi bytecode (verified), trusted Go runtime (assumed correct, same as JVM for Java), trusted vm3jit code page (verified at lowering time, W^X enforced), untrusted FFI (sealed by default), untrusted external data (parsed via verifier-checked constructors).

Out of scope (deferred to successor MEPs):

  • Aliasing-XOR-mutation enforcement. Borrow rules in MEP-41 are advisory: they unlock optimization (gen-check elision) but their violation is a runtime panic, not a compile-time error. Static borrow checking is MEP-50+ territory and depends on whether Mochi gets a real ownership system at all.
  • Data-race-freedom. Mochi is single-threaded per VM in MEP-40. Concurrent Mochi is a separate MEP that will need Verona-style regions or Pony-style reference capabilities; whichever, it does not block this MEP.
  • Hardware CHERI / Morello back end. vm3 is structurally compatible with a CHERI target (its handle Cell is a soft CHERI fat pointer), but generating CHERI-LLVM is a separate code-generation MEP and not required to claim language-layer memory safety.
  • Formal Iris / RustBelt mechanization of vm3. Iris-MSWasm and Vale's recent OOPSLA / SPLASH work show the path; this is MEP-60+ territory. We claim operational memory safety in this MEP, not mechanized proof.
  • Capability-based effects (Austral / OPAW). MEP-15 effects are labelled, not capability-passed. Promoting effects to capabilities is a separate MEP.
  • A garbage collector with stop-the-world or concurrent-marking strategy beyond what MEP-40 Phase 6 ships. MEP-41 inherits whatever collector MEP-40 produces and adds quarantine / delayed-reuse on top.

Background: the 2026 memory-safety landscape

The material below summarizes the 57-file research substrate in ~/notes/Spec/5400/ produced for this MEP. Each [[citation]] points at a deep-dive in that tree; URLs and provenance are in the citation files.

1. Hardware capability machines, software capability ABIs

  • CHERI / Morello / CHERIoT 1.0 (Nov 3 2025) / Codasip X730 (April 2025, first commercial silicon path). Unforgeable 128-bit fat pointers (address + bounds + permissions + 1 tag bit out-of-band). Object-granularity spatial safety, deterministic UAF, lightweight compartments. Codasip + EnSilica announced the first mass-distribution CHERI-RISC-V part on Nov 19 2025. Morello has no commercial roadmap from Arm. CHERIoT targets sub-256 KiB SRAM embedded devices.
  • MSWasm (POPL 2023; Iris-MSWasm OOPSLA 2024). Wasm extended with a segment memory accessible only via unforgeable handle values (base, offset, bound, valid, id). Tagged-byte storage prevents handle forgery from numeric writes. Software overhead 22% spatial-only to 198% full; hardware MTE / CHERI lowers this near zero. Iris-MSWasm is a full Coq mechanization with a separation-logic proof of robust capability safety.
  • V8 Sandbox / Heap Sandbox. Intra-process software isolation: pointers within the V8 heap replaced by 32-bit offsets (4 GiB cage); JIT metadata in Trusted Space behind a second indirection; code pointers indexed through a Code Pointer Sandbox table. Default in Chrome 2024; ~1% overhead; exploitation now requires a sandbox-escape primitive on top of the original UAF or OOB.

vm3's handle Cell is in this family. The 4-bit arena tag is the "type" half of a kalloc_type / xzone allocator. The 12-bit generation is an MTE-style tag. The 32-bit slab index is an MSWasm offset. The 8-byte total fits the JSC NaN-box / Hermes / WasmGC i31ref envelope.

2. Memory tagging in production: MTE and Apple MIE

  • Arm MTE (Armv8.5). 4-bit hardware tag per 16-byte granule; pointer-side 4-bit key in top byte; SYNC / ASYNC / asymmetric modes. Pixel 8 was first user-toggleable; Android 13+ supports.
  • Apple MIE on iPhone 17 / A19 (Sept 9 2025). Enhanced MTE (EMTE) + typed allocators (kalloc_type from iOS 15, xzone malloc from iOS 17) + Tag Confidentiality Enforcement (TCE, an explicit defense against Spectre-style tag leakage). Always-on synchronous mode kernel-wide and across 70+ userland processes. Credited by Apple with breaking essentially every recent mercenary-spyware exploit chain. The largest single deployment of hardware-assisted memory safety in history.

The MIE lesson worth stealing: never let user code observe a raw generation number, because generation churn becomes a side channel. Treat generations as opaque, attacker-unobservable secrets enforced by the verifier.

3. Ownership and reference modes

  • Rust + Polonius (alpha, Rust 2025h2 goal, stabilization 2026). "Origins as sets of loans" replaces NLL's "lifetimes as sets of points." 10-20% compile-time cost budget, closes NLL problem case #3, enables lending iterators.
  • Vale generational references. The intellectual ancestor of vm3. Per-allocation currentGeneration: u64; per-reference rememberedGen: u64; assert(curr == remembered) on every deref. Measured 2-10.84% overhead on BenchmarkRL. Vale itself is paused (Ovadia spent 18 months at Modular on Mojo, now exploring group borrowing). The idea travelled.
  • Hylo (formerly Val). Mutable value semantics. Subscripts (a[i] yields a projection rather than returning) with variants let / inout / sink / set.
  • Mojo (1.0 targeted late summer 2026). Argument conventions read / mut / var with ^ transfer. Lifetimes as compiler-internal origins (ImmutOrigin, MutOrigin, MutExternalOrigin, MutAnyOrigin). ASAP destruction.
  • Austral. Pure linear types (not affine) + capability-based effects. Linear checker is ~600 lines of OCaml.
  • OxCaml (Jane Street, open-sourced July 2025). Modal type system with orthogonal axes: locality (production at JS), uniqueness (recently merged), linearity, yielding, planned sync. OxCaml's modes-as-orthogonal-axes is the cleanest way to layer optional restrictions on a permissive default.
  • Swift ~Copyable, borrowing/consuming/inout. SE-0377 + SE-0390 (5.9, late 2023); SE-0432 added pattern matching on noncopyable enums (WWDC24).
  • Pony six-capability matrix. iso/trn/ref/val/box/tag paired local/global guarantees. Only iso/val/tag are sendable. Data-race freedom falls out of the matrix.
  • Project Verona regions + cowns + behaviours. OOPSLA 2024 "Reference Capabilities for Flexible Memory Management" plus "Behaviour-Oriented Concurrency." PLDI 2025 "Dynamic Region Ownership for Concurrency Safety" co-authored with CPython free-threading core contributors.
  • Inko single-owned references. Lightweight runtime RC, no GC, no static borrow checker; stale reference => deterministic panic.

MEP-41 borrows Swift's vocabulary (borrow / consume / inout) plus a fourth keyword weak that explicitly forces non-elision (the user telling the verifier "I expect this to dangle, keep the check").

4. Runtime techniques

  • Perceus + Roc + Lobster. Precise compile-time-inserted RC with reuse analysis: pattern-matching deconstruction reuses the freed cell for the next allocation. Koka v3.1.3 (July 2025). ICFP'25 distinguished paper "First-order Laziness" (Lorenzen et al.) extends FIP to lazy thunks.
  • Generational ZGC (JDK 21+, GA in 21 LTS). Sub-millisecond pause times at multi-TB heap sizes with a generational extension.
  • MMTk LXR. Modular memory toolkit research; arena-with-policy beats monolithic generational on bytecode-VM workloads.
  • JSC Riptide. WebKit JS engine collector; constraint-based marking with weak-handle support.
  • V8 Oilpan / Orinoco. Cross-component tracing with Blink integration.
  • WasmGC (Wasm 3.0, 2024). Typed struct, array, i31ref standardize the handle-based reference into a managed heap.
  • MarkUs (IEEE S&P 2020). Allocator-side UAF defense by quarantining freed objects until no reference remains pointing into them. Production-suitable overhead.
  • MiraclePtr / BackupRefPtr (Chrome 2022). UAF mitigation via per-allocation atomic ref count attached to PartitionAlloc; non-aliased pointers stay raw, suspect pointers (those crossing trust boundaries) are checked.
  • Scudo (LLVM hardened allocator). Per-class freelists, quarantine, randomized metadata; ships in Android since 11; basis for GWP-ASan sampling fault injection.

vm3 inherits Vale's generation discipline by construction. MEP-41 adds quarantine on free: a slot whose generation has wrapped past 4096 reuses is held out of its free-list for one full GC cycle, eliminating the ABA window in long-running programs (REPL, language server).

5. JIT memory-safety hardening

  • W^X discipline. macOS arm64 requires MAP_JIT + pthread_jit_write_protect_np() toggling; Linux amd64 uses dual-mapping or per-thread W^X switching.
  • PAC / BTI (Arm). Pointer authentication on return addresses and indirect calls; branch target identification refuses any indirect branch to a non-bti j/c instruction.
  • Intel CET. Shadow stack (ROP defense) + Indirect Branch Tracking (JOP defense). Tiger Lake (2020) and later.
  • Spectre v1 index masking. Every JIT-generated array index is masked against the array length before the bounds check, blocking speculative out-of-bounds reads.
  • V8 Site Isolation + cross-origin read blocking. Per-origin renderer processes break Spectre cross-origin reads.

vm3jit must ship with W^X by default, PAC + BTI on darwin/arm64, CET where available, and index masking on every typed-array opcode lowering. This is the §7 hardening plan.

6. Industry / policy

  • Microsoft "70% of CVEs are memory-safety" (BlueHat IL 2019). The canonical statistic that motivated the policy wave. Re-confirmed by Google's analysis of Android (50-70% pre-Rust adoption).
  • Google / Android Rust (2024 crossover). New Android memory-safety bugs dropped below 20% of total CVEs in 2024 as new code shifted to Rust + Kotlin.
  • CISA Secure-by-Design Pledge (May 2024; January 1 2026 memory-safety roadmap deadline). ~300 signers as of May 2026. Voluntary, no enforcement; political signal unambiguous.
  • NSA CSI "Software Memory Safety" (Nov 2022, updated June 2025). Recommends C#, Go, Java, Ruby, Rust, Swift. Mochi's positioning piggy-backs on this list.
  • ONCD "Back to the Building Blocks: A Path Toward Secure and Measurable Software" (Feb 2024). White House urging memory-safe languages.
  • DoD SWFT (Software Fast Track, 2025). Rewards Secure-by-Design alignment with faster security authorization.
  • EU Cyber Resilience Act (reporting begins Sept 2026). Manufacturer obligations for in-scope products.

Mochi's position in the taxonomy

The provenance chain for Mochi memory safety is exactly the chain Python, Java, and JavaScript use:

Mochi source ─► statically type-checked (MEP-4, 5, 6)
─► compiler3 (MEP-40 §7) emits verified bytecode
─► runtime/vm3 (MEP-40) executes via typed-arena
handles with per-deref generation check
─► runtime is Go (CISA-named memory-safe)
─► Go runtime is taken on the same footing as
JVM for Java or CPython for Python

The structural property:

  • No use-after-free. Generation mismatch traps at the deref site (runtime/vm3/accessors.go). Equivalent to Vale's guarantee.
  • No type confusion across container kinds. 12 arenas, 4-bit tag, dispatch happens before memory touch. Equivalent to MIE's kalloc_type / xzone.
  • No null dereference. MEP-16 Option<T> and ?. / ?? discipline; force-unwrap is not a Mochi operator.
  • No out-of-bounds container access. Handles encode arena tag + index; index is bounds-checked at arena[idx] lookup; container payloads (e.g. vmList.cells) are length-checked at the lookup step. Equivalent to MSWasm segment-memory bounds.
  • No code injection in the JIT. §7 hardening: W^X, PAC, BTI, CET, index masking.
  • No FFI escape. §6.7 sealed handles; FFI receives sealed wrapper, cannot deref without unseal opcode gated on MEP-15 meta effect.

What we do not claim, and the alternatives that do:

  • We do not claim data-race freedom (Rust does; Pony does; Verona does). Single-VM execution sidesteps the question.
  • We do not claim compile-time aliasing-XOR-mutation (Rust does). Borrow modes in §6.9 are advisory: violation is a runtime panic, not a compile-time error.
  • We do not claim formal mechanized proof (Iris-MSWasm has it for MSWasm; RustBelt has it for a Rust subset). §10 open question.

§6 Architecture

6.1 Threat model

The bytecode is trusted once it passes the verifier. The verifier is trusted code in runtime/vm3/verify.go. The Go runtime is trusted on the same footing as CPython for Python. Everything else is untrusted:

  • Untrusted source program. Parsed and type-checked. If it parses and type-checks, it is "syntactically well-formed Mochi" but is not yet trusted.
  • Untrusted bytecode (e.g. loaded from disk). Must pass the verifier before any opcode executes. Verifier rules in §6.2.
  • Untrusted external data. Constructed only through verifier-checked constructors. Numeric parsing returns Option<T>; container construction returns a fresh handle into the appropriate arena.
  • Untrusted FFI. Receives sealed handles only; cannot deref. §6.7.
  • Untrusted JIT input. Lowered from verified bytecode only. JIT emit refuses any IR node not derived from a verifier-accepted opcode. §7.

The boundary between trusted and untrusted is the verifier. The verifier is the single point of memory-safety policy.

6.2 Verifier rules

Five rule classes. Each is a static check over the compiler3 IR or the emitted bytecode; no rule depends on runtime state.

Class A: Handle origin. Every Cell produced by any opcode is either (a) a copy of an existing Cell, (b) the result of an alloc.go constructor (AllocString, AllocList, ...), or (c) an inline value (small int, float, bool, null, deopt sentinel). No opcode may compute a handle from arbitrary bits. The MakeHandle(tag, gen, idx) function in runtime/vm3/cell.go is package-internal and unreachable from any compiler3-emitted opcode.

Class B: Tag stability. No opcode rewrites the arena tag of a Cell in place. Sealing is the only operation that produces a Cell with a different tag from its input; it is implemented as a fresh-Cell constructor, not an in-place tag flip. This makes "tag is stable for the lifetime of a value" an invariant the JIT can rely on.

Class C: Generation opacity. No opcode exposes the generation field of a Cell to user bytecode. There is no gen_of(handle): int operator; no cell_to_bits(handle): u64; no arithmetic on Cell values. The only thing user code can do with a handle is dereference it (which traps on stale gen and yields the payload) or copy it (which preserves the gen). This is Mochi's analog of Apple TCE.

Class D: Arena tag dispatch. Every opcode that dereferences a handle dispatches on the arena tag before touching the slab. OpListGet traps if its handle is not an arenaList tag, before doing any index arithmetic. The 12-arena partition becomes a memory-safety property, not just a performance one.

Class E: Reference-mode discipline. The optional borrow / consume / inout / weak annotations carry verifier obligations (§6.9). Violation is a runtime panic (the runtime trips because the elided gen-check would have caught it), not a compile-time error. The verifier records elision sites; the runtime can selectively re-enable checks (debug builds, attack triage) by ignoring elision metadata.

These rules are pure additions. No existing MEP-40 opcode needs to change. The verifier becomes a new pass in compiler3 between SSA lowering and emit.

6.3 Generation as TCE-protected secret

Tag Confidentiality Enforcement in Apple MIE is the mitigation against Spectre-style attacks that leak the per-granule tag value through speculative execution side channels. The analog in vm3:

  • The verifier refuses any opcode that lets user bytecode observe gen. There is no weakref_gen(w): u64.
  • The verifier refuses any opcode that lets user bytecode compute differences between generations. There is no operator whose semantics depends on "the gen of handle A versus the gen of handle B."
  • The JIT (§7) does not load generation into a register that can be read by user-visible computation. The gen check is emitted as a single compare-and-branch into a trap stub; the loaded value never reaches a user-named SSA value.

Without these three rules, an attacker who can run untrusted Mochi bytecode could in principle observe generation churn (e.g., by timing how soon a stale handle traps after a sibling allocation in the same arena) and use it to mount an ABA attack against generation wrap. With the rules, generation is an opaque runtime secret.

6.4 Typed arenas as a memory-safety property

MEP-40 §6.2 partitions memory into 12 typed arenas for performance reasons (Go GC can scan each slab's pointers without ambiguity; allocator class is decided by static type). MEP-41 names this partition as a memory-safety property:

  • Type confusion across arena kinds is structurally impossible. OpStructFieldGet on a Cell whose arena tag is arenaList traps before any field offset is computed.
  • ABA on generation wrap is bounded by quarantine (§6.5). Even after generation wrap, a slot cannot be reused as a different arena kind.
  • The arena tag in the handle is a capability in the CHERI sense: it grants the right to invoke that arena's accessors and nothing else. Sealing (§6.7) is the mechanism that removes this capability for the duration of a foreign call.

This is the same property kalloc_type / xzone gives Apple kernel memory: even on a use-after-free, the reused slot is the same type as the freed slot, so the attacker cannot pivot the UAF into a type-confusion primitive.

6.5 Quarantine and delayed reuse

MEP-40 §9.2 specifies free-list reuse on alloc: a swept slot is pushed to its arena's free-list and its gen is bumped. MEP-41 adds quarantine on top:

  • A slot whose gen has wrapped past 4096 (one full 12-bit rotation) is held in a per-arena quarantine list for one full GC cycle before re-entering the free-list. This bounds the ABA window. Inspired by MarkUs (IEEE S&P 2020) and Scudo's quarantine.
  • The quarantine threshold is tunable. Long-running programs (REPL, language server) get full quarantine; short benchmark runs (single Run() call) can skip it.
  • Quarantine adds a per-arena bookkeeping cost proportional to the wrap rate. In practice the wrap rate is very low for any single arena (Vale's measured 2-10% overhead is at much higher allocation pressure than vm3 typically sees on bytecode workloads).

6.6 Guard slabs (optional, Phase 3)

For high-assurance deployments, MEP-41 specifies an optional guard-slab mode: every allocation is in its own page, with unmapped pages on either side. Spatial overflow within a list / map / string traps immediately as a Go runtime fault (page protection violation), independent of the verifier's bounds check.

This is the same idea as Electric Fence and GWP-ASan. The cost is one VM page per object (typically 16 KiB on darwin/arm64), which is too expensive for production but right-sized for fuzzing and security regression testing. Hidden behind a -vm3-guard-slabs flag.

6.7 Sealed handles for FFI

CHERIoT introduced sealed capabilities: a capability tagged with a key, which can only be unsealed by the matching key. Sealed capabilities cannot be dereferenced; they are opaque tokens.

vm3 inherits the construction directly. Two opcodes:

  • OpSeal(handle, key: u32) -> sealed_handle. Produces a fresh Cell with arena tag = arenaSealed, payload = the original handle bits XORed with the key (or, more conservatively, indexed into a per-VM sealing table). The original handle cannot be recovered from the sealed wrapper without the key.
  • OpUnseal(sealed_handle, key: u32) -> handle. Inverts OpSeal. Traps if the key does not match.

FFI receives sealed handles only. Mochi user code can pass OpSeal(h, k) to a Go FFI function; the Go function can store it, return it, pass it around, but cannot dereference it. When the FFI function returns and Mochi code calls OpUnseal(s, k), the handle is recovered.

OpUnseal is gated on MEP-15's meta effect. Functions that unseal a foreign-returned handle are flagged as effectful and cannot run in a pure context (query predicates, compile-time constant folding). This mirrors Austral's capability-passing effect discipline and prevents accidental laundering of capabilities through pure-looking interfaces.

6.8 Trusted Space for JIT and bytecode metadata

V8's Trusted Space holds bytecode containers, JIT-compiled code metadata, and other data structures whose corruption would break the integrity of the sandbox itself. They are reached via indirection from inside the sandboxed heap but live outside it.

The vm3 analog:

  • The 12 arena slabs are the sandbox.
  • The JIT code page, the bytecode constant pool, the function-table dispatch array, and the import-table for foreign symbols are Trusted Space.
  • The JIT code page is W^X (§7). The bytecode constant pool is read-only after load; the verifier rejects any opcode that writes into the constant pool.
  • The function-table dispatch array is indexed only by verifier-checked function IDs. Indirect calls go through this table; raw code pointers never appear in a Cell.

Trusted Space in vm3 is structurally separate from arena memory because it is in different Go allocation lifetimes. The verifier and the JIT emit pass are responsible for keeping this separation tight.

6.9 Reference modes: consume, borrow, inout, weak

Optional mode annotations on bindings and function parameters. Each is a verifier check, not a bytecode change. Each maps to an optimization opportunity or a strictness opportunity.

consume x: T (Swift consuming, Mojo var ... ^, Hylo sink). The binding owns its value and the function may destroy it. If the value is heap-allocated, consume enables an immediate gc.kill at the end of the binding's last use (deterministic free), bumping the generation and pushing the slot to the free-list without waiting for the next GC cycle. Verifier obligation: the consumed binding is not used after its last consume-point.

borrow x: T (Swift borrowing, Mojo read, Hylo let). The binding has read-only access to its value for the scope of the borrow. Within the borrow scope, the verifier may elide the generation check on every deref of x, because the borrow scope's owner is structurally pinning the value. Verifier obligation: no opcode in the borrow scope can cause the borrowed value to be freed (no GC point that could see it as unreachable; no escape into a destination that could outlive the borrow).

inout x: T (Swift inout, Mojo mut). The binding has read-write access for the scope, and the modified value is written back at scope exit. Same gen-check elision as borrow. Same verifier obligation, plus exclusivity (no other live alias may exist).

weak x: T (Vale-inspired). The binding is forced non-elided. The verifier records that even if escape analysis would let it elide the gen check on x, it must not. This is the user telling the verifier "I expect this to dangle, keep the safety check." Useful for observer patterns (Vale's canonical example).

Reference modes are an opt-in layer. Default-mode Mochi has no annotations and pays the full gen-check cost (the 2-10% Vale overhead). Annotated Mochi can match Rust-like zero-overhead access in the borrow scope.

Reference modes are not enforced as type errors. A borrow that escapes its scope produces a runtime trap (the gen check that wasn't elided in a non-borrow caller catches it), not a compile-time error. Static borrow checking is deferred to a future MEP.

6.10 Effect and option interop

MEP-15 (Effects) and MEP-16 (Null Safety) are co-load-bearing with MEP-41.

MEP-15 hooks.

  • OpUnseal is gated on the meta effect.
  • gc.kill(x) (force-free a handle, deterministic destruction) is gated on the meta effect.
  • A "may-trap-on-stale-handle" implicit effect can be added; purity contexts (query predicates, compile-time constant folding) treat this implicit effect the way they treat io today. This is a generalization MEP-15 left open and MEP-41 motivates.

MEP-16 hooks.

  • try_deref(h: Handle<T>): Option<T> surfaces a gen-check failure as none instead of a trap. This is the same no-force-unwrap discipline of MEP-16 extended one layer down to the runtime.
  • ?. and ?? operators flow through try_deref cleanly. A chained h?.field is "if h is live, return its field; otherwise none."

These three operations (gc.kill, sealed handles, try_deref) are the entire user-visible surface MEP-41 adds. Everything else is verifier rules and runtime invariants.

§7 JIT memory-safety hardening

vm3jit (MEP-40 Phase 5) lowers verified bytecode to native code. The JIT code page is the most attractive target in the runtime: corrupting it means escaping every memory-safety property the rest of MEP-41 enforces. MEP-41 specifies the hardening plan.

7.1 W^X on the JIT code page

The JIT code page is never simultaneously writable and executable.

  • darwin/arm64. Use mmap with MAP_JIT to obtain a single mapping that can toggle between W and X via pthread_jit_write_protect_np(int enabled). The toggle is per-thread; vm3jit code-emit calls pthread_jit_write_protect_np(0) before writes and pthread_jit_write_protect_np(1) immediately after, then sys_icache_invalidate to flush the I-cache for the patched range.
  • linux/amd64. Use dual mapping: one writable mapping and one executable mapping over the same memory backing (via memfd_create + two mmap calls). vm3jit writes to the writable mapping; execution uses the executable mapping. Alternative: per-thread mprotect toggling, which is cheaper but blocks concurrent emit.
  • Verification. A debug-mode assert in the JIT emit path checks that no write goes to a page currently mapped executable in the calling thread.

7.2 PAC and BTI on arm64

  • PAC. Pointer Authentication Code. Every JIT-emitted function entry signs the return address with paciasp; every return uses retaa (autia + ret). This forecloses JOP attacks that corrupt return addresses.
  • BTI. Branch Target Identification. Every indirect-branch target in JIT-emitted code starts with a bti j (jump) or bti c (call) instruction. The CPU refuses any indirect branch to a non-bti instruction. This forecloses JOP gadget-chaining.
  • Code page permissions. The JIT code page is marked PROT_BTI (Linux) or relies on the macOS default of BTI-aware execution (Apple Silicon enables BTI for MAP_JIT pages by default).

7.3 Intel CET on amd64

  • Shadow stack (SHSTK). Hardware-maintained second stack of return addresses. Every ret is checked against the shadow-stack top. ROP gadgets that overwrite the return address on the user stack fail at the next return. Enabled per-thread via arch_prctl(ARCH_SHSTK_ENABLE).
  • Indirect Branch Tracking (IBT). Every indirect-branch target must begin with endbr64; the CPU traps on indirect branches to non-endbr targets. vm3jit emits endbr64 at every dispatch entry. Enabled per-thread via arch_prctl(ARCH_CET_ENABLE_IBT).
  • Fallback. On CPUs without CET, the W^X discipline of §7.1 remains.

7.4 Spectre v1 index masking

Every typed-array opcode lowering (OpListGetI64, OpListGetF64, OpStringIndex, OpBytesIndex, ...) emits an index mask before the bounds check:

masked_idx = idx & (length_pow2 - 1) // when length is a power of two
masked_idx = mux(idx >= length, 0, idx) // general case
load value from base + masked_idx * elem_size

The mask forecloses speculative loads at out-of-bounds offsets, which would otherwise leak through cache-timing side channels. This is the standard V8 / JSC mitigation since 2018.

7.5 Indirect-branch hardening

  • Retpolines (amd64, where CET IBT is unavailable). Indirect jumps lowered as call __retpoline_thunk; lfence. Mitigates Spectre v2 (BTI poisoning).
  • AAarch64. Speculative-branch predictor barrier (SB, Armv8.5) before high-risk indirect branches.
  • Verification. All vm3jit indirect branches in the dispatch loop are either CET-IBT-checked, BTI-checked, or retpoline-wrapped. The lowering pass refuses to emit a bare indirect-jump on a JIT code path that handles untrusted user data.

7.6 Guard pages around the code page

The JIT code page is bracketed by unmapped guard pages on both sides. Off-by-one writes during emit fault immediately rather than silently corrupting an adjacent allocation. This is cheap (two pages of address space per code chunk) and is standard practice in V8 and JSC.

§8 Phased plan with gates

Each phase has a deliverable, a gate, and an exit criterion. Phases are sized to ship one PR at a time. Most phases are independent of MEP-40 phases; the few dependencies are noted.

Phase 0: Spec freeze and threat-model document (week 1)

Deliverables:

  • This MEP merged.
  • docs/security/threat-model.md enumerating trust boundaries (§6.1).
  • docs/security/memory-safety.md skeleton (publicly visible page; final wording shipped in Phase 7).

Gate: spec merged; threat model passes peer review by at least one external Mochi contributor.

Exit: spec frozen, threat-model document checked in.

Phase 1: Verifier rules (weeks 2-4)

Deliverables:

  • runtime/vm3/verify.go: implements rule classes A through D from §6.2. Rule class E is Phase 4.
  • compiler3 invokes the verifier between SSA lowering and emit; verification failure is a compile-time error.
  • Fuzz harness that generates random opcode sequences and checks the verifier rejects ill-formed ones.

Gate: every existing test in runtime/vm3/ and compiler3/ passes with the verifier wired in. Fuzz harness runs 1M opcode sequences without crashes or unexpected acceptances.

Exit: verifier is mandatory in compiler3; bytecode that doesn't pass cannot reach the VM.

Phase 2: Generation opacity (weeks 5-6)

Deliverables:

  • Verifier rule class C wired (no opcode exposes gen to user bytecode).
  • Audit of every existing vm3 opcode for compliance.
  • Code review of runtime/vm3/cell.go and accessors.go to confirm gen never reaches a user-visible SSA value.
  • JIT lowering audit (Phase 5 of MEP-40 deliverable): the gen-check sequence does not produce a register that user-named SSA values can read.

Gate: no opcode in the MEP-40 corpus exposes gen. JIT-lowered code holds the gen value only in compiler-internal temporaries that are never spilled to a named slot.

Exit: generation is structurally opaque end-to-end.

Phase 3: Quarantine, guard slabs, sealed handles (weeks 7-10)

Deliverables:

  • runtime/vm3/gc.go (extending MEP-40 Phase 6): quarantine list per arena; wrap-detection logic; configurable threshold.
  • -vm3-guard-slabs build flag wiring page-per-object allocation; fuzzing target uses this mode.
  • OpSeal / OpUnseal opcodes in runtime/vm3/op.go; per-VM sealing table; verifier rule that OpUnseal requires MEP-15 meta effect.
  • Test: round-trip OpSeal / OpUnseal with matching key returns same handle; mismatched key traps; sealed handle cannot be dereferenced.

Gate: long-running test (1M REPL evaluations) under guard-slab mode runs without spurious traps; quarantine bounds memory under repeated allocation cycles; sealed-handle FFI round-trip works against a representative Go stdlib call (e.g. os.Open returning a sealed file-handle wrapper).

Exit: ABA window bounded; FFI safe by default.

Phase 4: Reference modes (weeks 11-14)

Deliverables:

  • Grammar additions: consume / borrow / inout / weak as binding and parameter modifiers.
  • Type-checker support: modes propagate through SSA; verifier rule class E records elision sites.
  • compiler3 elision pass: in borrow and inout scopes, generation checks on the borrowed value are elided.
  • gc.kill(x) builtin in MEP-15 meta effect.
  • Surface tests: a borrow-annotated tight loop measures faster than its unannotated counterpart by at least the Vale overhead (2-10%).

Gate: corpus runs unchanged under default mode; opted-in borrow examples measure within 5% of equivalent Rust-style zero-overhead access.

Exit: reference modes shipped as an optional, opt-in surface.

Phase 5: vm3jit hardening (weeks 15-18, runs alongside MEP-40 Phase 5)

Deliverables:

  • W^X on the JIT code page for darwin/arm64 (MAP_JIT + pthread_jit_write_protect_np) and linux/amd64 (dual mapping).
  • PAC + BTI on every JIT function entry (darwin/arm64).
  • CET shadow-stack + IBT enablement on amd64 where the kernel exposes it.
  • Spectre v1 index masking on every typed-array opcode lowering.
  • Indirect-branch hardening: CET IBT, BTI, or retpoline at every JIT indirect branch.
  • Debug-mode self-test: a hand-crafted JIT-side ROP gadget chain attempt is blocked by PAC / shadow stack.

Gate: vm3jit benchmarks regress by less than 2% versus the unhardened baseline; the self-test ROP attempt fails on hardware that supports the relevant feature.

Exit: vm3jit code-page integrity matches V8 / JSC hardening posture.

Phase 6: Audit, fuzz, and documentation (weeks 19-21)

Deliverables:

  • Internal audit: walk every runtime/vm3 and compiler3 file, cross-reference with the §6 architecture, document gaps.
  • External fuzz target: a go-fuzz or oss-fuzz harness against compiler3 + runtime/vm3 with the verifier wired in.
  • 24-hour fuzz run with zero new crashes.
  • Update docs/security/memory-safety.md with measured numbers (per-arena allocation costs, gen-check overhead, JIT-hardening overhead).

Gate: zero new crashes in 24-hour fuzz; documentation is complete and reviewer-approved.

Exit: ready to publish.

Phase 7: Public statement and CISA-roadmap alignment (week 22)

Deliverables:

  • docs/security/memory-safety.md (final wording). Single page, references MEP-41 §6, NSA CSI, CISA Pledge documentation, ONCD memo, Apple MIE documentation.
  • Blog post on mochi-lang.org announcing the statement.
  • SECURITY.md at repo root pointing to the statement.
  • A reference template that downstream organizations (CISA pledge signers) can paste into their internal roadmap documents to cite Mochi adoption as roadmap-aligned.

Gate: statement reviewed by at least two contributors and one external reviewer; blog post published.

Exit: Mochi has a public, defensible memory-safety position.

§9 Performance model

vm3 already pays for the runtime cost of memory safety. MEP-41 quantifies it.

9.1 Generation check cost (baseline, no elision)

Per deref in the interpreter: one slab-load + one compare + one branch. Estimated 2-3 ns on Apple M4 at 4.4 GHz, dominated by L1 access. Vale measured 2-10.84% overhead on a terrain-generation workload; vm3 is a closer match for the low end of that range because its handles are typically held in typed register banks (Frame.regsCell, MEP-40 §6.3) where the slab-load benefits from spatial locality.

9.2 Reference-mode elision savings

A borrow-annotated tight loop elides the gen check for every read of the borrowed value. On a list-iteration micro-benchmark (10M borrow l: list[i64]; for x in l { sum += x } iterations), the predicted speedup is 5-8%. Matches Vale's measured region-borrow speedup on the same shape of code.

9.3 JIT-hardening overhead

  • W^X toggle: ~50-100 ns per JIT emit batch on darwin/arm64; amortized over many opcodes per batch.
  • PAC sign/auth per function entry/exit: ~1 ns each on Apple M4.
  • BTI per indirect branch: zero (hardware-checked at fetch).
  • CET shadow-stack per return: ~1 ns on Tiger Lake and later.
  • Spectre v1 index masking: 1-2 cycles per array access; net effect on a hot fillsum loop is under 2%.

Total predicted JIT-hardening overhead on the BG benchmark suite (MEP-40 §8.2): under 2% on every benchmark.

9.4 Quarantine memory overhead

Quarantine holds a slot's metadata (16 bytes per slot: index, arena tag, gen counter, sweep-cycle stamp) for one GC cycle after wrap. For an arena with N slots and wrap rate W per cycle, quarantine costs O(N + W) metadata. On any realistic workload N dominates and W is negligible.

§10 Open questions

  1. Should generations be widened from 12 to 16 or 24 bits? MEP-40 §10 already raises this. From a memory-safety standpoint, 16-bit gen with quarantine is overwhelmingly sufficient. 24-bit would foreclose ABA entirely without quarantine but tightens the Cell layout. Recommendation: 16-bit with mandatory quarantine, decision deferred to MEP-40 Phase 6 measurement.

  2. Should weak<T> be a first-class type, or only a binding mode? Vale exposes weak references as a type. Mochi could ship Weak<T> parallel to Option<T>. Pros: explicit observer pattern, clean interop with MEP-16. Cons: more surface area; the binding-mode approach already covers the use cases.

  3. Should MEP-41 commit to a future Iris mechanization? Iris-MSWasm exists (Coq, full mechanization of MSWasm 1.0 with separation-logic proof of robust capability safety, OOPSLA 2024). The same approach is applicable to vm3 because the structural property is the same. Cost: a multi-year academic collaboration. Benefit: the only memory-safe VM with a mechanized proof. Recommendation: open the conversation with one of the Iris-MSWasm authors; defer commitment until after Phase 7.

  4. How does MEP-41 evolve when Mochi gets concurrency? Single-VM execution is the load-bearing assumption that lets us punt on data races. Adding concurrent Mochi (multi-actor, multi-thread) will need either Pony's six-cap matrix or Verona's region-cown-behaviour model. Neither retrofits cleanly onto vm3's handle ABI; the handle would need a third axis (which actor owns this?) or the runtime would need per-region arenas. Defer to the concurrency MEP.

  5. Should MEP-41 reserve a clause for US-federal deployment scenarios? CISA / NSA / OMB guidance evolves. Should MEP-41 commit that Mochi will track post-January-2026 updates? A loose commitment is cheap and defensible; a tight one risks promising to chase regulations whose final shape isn't visible yet. Recommendation: loose commitment in docs/security/memory-safety.md, no binding language in this MEP.

  6. Should sealed handles be the default across the FFI boundary, with raw passing as an opt-in? Pros: safer by default. Cons: every existing FFI call site needs an unseal wrapper. Recommendation: yes, sealed by default; ship a --unsafe-ffi per-call opt-out for migration.

  7. Should gc.kill(x) exist at all? Pros: deterministic destruction for resources that hold finalizers (file handles, network sockets). Cons: a foot-gun (kill a handle still in use, get a deterministic panic but a panic nonetheless). Recommendation: yes, gated on MEP-15 meta effect, with a strong documentation note that the verifier does not statically prove the killed handle is the last reference.

  8. What is the right wording for the public statement? Draft: "Mochi is designed to enable signatories of the CISA Secure-by-Design Pledge to use it as part of their memory-safety roadmap, equivalent to selecting any other named memory-safe language. Mochi's runtime (vm3) implements per-handle generation checks (no use-after-free), typed-arena allocation (no cross-type confusion), null-safe option types (no null dereference, see MEP-16), and JIT-side W^X plus PAC/BTI/CET hardening (no code injection). The runtime is implemented in Go, a CISA-named memory-safe language."

§11 Risks

11.1 Reference modes are advisory, not enforced

A borrow-annotated value that actually escapes its scope produces a runtime panic, not a compile-time error. Pros: keeps the verifier simple, mirrors Vale and Inko. Cons: users who annotate borrow and assume Rust-style static safety will be surprised. Mitigation: documentation is explicit about the runtime-checked nature of modes; opt-in mode means default-mode programs are unaffected.

11.2 Sealed handles add a verifier obligation around FFI

Every FFI call site needs an unseal wrapper. Pros: safe by default. Cons: every existing call site needs updating. Mitigation: ship --unsafe-ffi opt-out (see §10 open question); migrate in a follow-up PR per call site.

11.3 JIT hardening may regress benchmarks

§7 hardening adds overhead. Predicted under 2% on BG; if measurement shows more, the gate at Phase 5 will fail. Mitigation: each hardening axis is independently toggleable, so we can isolate which one regresses and revisit.

11.4 Generation opacity may conflict with debugger / profiler tooling

A debugger that wants to inspect a stale handle and report "this handle was freed at cycle N, current gen is M" needs access to the gen value the verifier hides from user bytecode. Mitigation: debug-mode verifier rule relaxation (rule class C is suppressed in debug builds); production-mode keeps it strict.

11.5 Quarantine adds per-arena bookkeeping cost

A long-running workload with high wrap rate (REPL with allocation-heavy expressions) pays for quarantine metadata. Mitigation: quarantine threshold is tunable; bench harness uses zero quarantine; production-mode uses full quarantine.

11.6 Public statement may overclaim

Saying "memory-safe equivalent to Rust" is wrong. Saying "memory-safe equivalent to Java" is defensible. The statement language must be careful. Mitigation: §10.8 draft is conservative; review by external Mochi contributors and at least one security researcher before Phase 7 publishes.

11.7 Apple TCE analog is incomplete

We hide gen from user bytecode and from JIT-named SSA values. We do not mitigate against Spectre side-channel attacks that observe gen-check timing externally (e.g., from a sibling process via shared cache). Mitigation: document the gap; treat MIE-style hardware tag confidentiality as the future target when Mochi runs on MIE-enabled hardware.

§12 References

The full research substrate lives in ~/notes/Spec/5400/ (57 deep-dive files plus one overview). Each file carries a §1 Provenance section with canonical URLs. The most load-bearing citations for this MEP are:

Capability machines and software handle ABIs

Memory tagging in production

Generational references and ownership

Runtime techniques and UAF defenses

JIT hardening

Industry / policy

Mochi cross-references

  • MEP-15 (Effects): provides meta effect gating for gc.kill and OpUnseal.
  • MEP-16 (Null Safety): provides Option<T> and the no-force-unwrap discipline that try_deref extends.
  • MEP-40 (vm3 + compiler3): provides the handle Cell, the 12 typed arenas, the generation field, and the verifier hook point that MEP-41 fills.

§13 Workflow note (for implementers)

The MEP-39 standing rule applies: every win must be a generic VM improvement, not a single-purpose super-op. Verifier rules in MEP-41 are generic by construction; the same rule applies to every program. Reference modes are generic syntax; they apply wherever the user writes them. JIT hardening is generic; it applies to every JIT-emitted instruction sequence.

Every phase deliverable is one PR (or a small number of PRs) gated by the named criterion. No phase ships until its gate is green. The MEP file is updated with measured results at each phase boundary (same discipline as MEP-37 / MEP-38 / MEP-39 / MEP-40).

The MEP and the code ship in the same PR. A change to the verifier without a corresponding spec update is rejected by review; a spec change without test coverage in the same PR is rejected by review. This is the MEP-spec-in-sync rule.

The public statement in Phase 7 is the single user-visible artifact of this MEP. Its wording matters more than any internal architecture choice. Treat its review as load-bearing; rotate at least one external reviewer through it before publication.