May 2026 (v0.11.0)
This is the v0.11.0 cut. It is the largest release Mochi has made to date. Three things shipped at the same time and each one would have been the headline on its own: a new runtime and JIT (MEP-40) that catches Go on most kernels and beats every interpreter peer, a null safety pass (MEP-16) that retypes options end-to-end across the type checker, and a soundness closeout (MEPs 5, 7, 10, 11, 12, 13, 14, 15) that turns long-standing draft work into shipped behavior with fixtures and error codes. Three new draft MEPs (41, 42, 43) open the next round.
Source compatibility is preserved except for one rename: the literal
null is now spelled none. There is a one-paragraph migration note
at the bottom.
1. Performance: MEP-40 vm3 + compiler3 + vm3jit
The vm2 + compiler2 stack is replaced by a new triple: runtime/vm3,
compiler3, and runtime/jit/vm3jit. The redesign was done from first
principles. It rests on three changes:
- 8-byte handle Cell. vm2 used a 16-byte split Cell with a `Bits uint64` plus an `Obj unsafe.Pointer` field. vm3 collapses that to a single `uint64` where the bottom bits are a typed handle into a per-type Go-allocated arena slab. Heap residency is half on every workload that touches lists, maps, or strings. Go's GC still owns reclamation because handles are integer indices into slabs, not pointers.
- Three typed register banks. Each frame carries `regsI64`, `regsF64`, and `regsCell` instead of one homogeneous `[]Cell`. The compiler picks the bank at SSA construction time. The interpreter never reads a runtime tag in the hot path, and the JIT never emits a type test.
- Static-type-driven dispatch. compiler3 lowers Mochi's static type system end-to-end so every bytecode op has its operand banks baked into the opcode. There are no `OpAdd` polymorphic dispatches. `OpAddI64`, `OpAddF64`, `OpAddF64Array`, and so on are separate opcodes, and the JIT lowers each to a single arch instruction.
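To make the bank and opcode split concrete, here is a minimal Go sketch of an interpreter loop with one register bank per static type and one opcode per operand type. It is an illustration under assumptions, not the vm3 source: the `Instr` layout, the `OpI64ToF64` conversion opcode, and the `Run` function are hypothetical, and the `regsCell` bank is omitted for brevity.

```go
package main

import "fmt"

// Op is a bytecode opcode. The operand bank is part of the opcode itself,
// so the loop below never inspects a runtime type tag.
type Op uint8

const (
	OpAddI64   Op = iota // regsI64[A] = regsI64[B] + regsI64[C]
	OpAddF64             // regsF64[A] = regsF64[B] + regsF64[C]
	OpI64ToF64           // regsF64[A] = float64(regsI64[B]) (hypothetical opcode)
)

// Instr is a hypothetical three-address instruction.
type Instr struct {
	Op      Op
	A, B, C int
}

// Frame carries one register bank per static type. The bank names follow
// the release notes; the layout is illustrative only.
type Frame struct {
	regsI64 []int64
	regsF64 []float64
}

// Run executes code against the frame. Each case touches exactly the
// banks that its opcode names.
func Run(f *Frame, code []Instr) {
	for _, in := range code {
		switch in.Op {
		case OpAddI64:
			f.regsI64[in.A] = f.regsI64[in.B] + f.regsI64[in.C]
		case OpAddF64:
			f.regsF64[in.A] = f.regsF64[in.B] + f.regsF64[in.C]
		case OpI64ToF64:
			f.regsF64[in.A] = float64(f.regsI64[in.B])
		}
	}
}

func main() {
	f := &Frame{regsI64: []int64{0, 2, 3}, regsF64: []float64{0, 1.5, 0}}
	Run(f, []Instr{
		{OpAddI64, 0, 1, 2},   // i0 = 2 + 3
		{OpI64ToF64, 2, 0, 0}, // f2 = float64(i0)
		{OpAddF64, 0, 1, 2},   // f0 = 1.5 + 5.0
	})
	fmt.Println(f.regsI64[0], f.regsF64[0]) // prints: 5 6.5
}
```

Because the compiler has already chosen the bank, a mixed int and float expression needs an explicit conversion instruction rather than a tagged add.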
vm3jit is a method JIT with two backends: AArch64 (Apple Silicon and Linux ARM) and AMD64 (Linux and macOS Intel). The two backends share the same lowering passes and frame ABI. Phase 6 work over the last several weeks brought the AArch64 backend to closure on the BG kernel suite and matched the AMD64 backend op-for-op.
1.1 Benchmark headline
Final measurement against compiler3/corpus on an Apple M4, Go 1.26.3,
CPython 3.14.5, PyPy 7.3.17 (3.10), Lua 5.5, LuaJIT 2.1, repeat=5
medians:
| Program | N | vm3 (µs) | vm2 (µs) | CPython (µs) | PyPy (µs) | Lua (µs) | LuaJIT (µs) | Go (µs) | vm3 / Go |
|---|---|---|---|---|---|---|---|---|---|
| bg/binary_trees | 10 | 22485 | 31038 | 152893 | 29110 | 187258 | 54504 | 18919 | 1.19x |
| bg/fannkuch_redux | 10000 | 132 | 3938 | 7466 | 3974 | 2085 | 348 | 146 | 0.90x |
| bg/fasta | 100000 | 915 | 2507 | 23205 | 4688 | 3583 | 1727 | 1619 | 0.57x |
| bg/k_nucleotide | 100000 | 1456 | 29769 | 27762 | 7309 | 4916 | 1734 | 2819 | 0.52x |
| bg/mandelbrot | 200 | 998 | 28036 | 56992 | 5857 | 20771 | 1718 | 1634 | 0.61x |
| bg/n_body | 5000 | 364 | 16420 | 40338 | 9248 | 6482 | 521 | 251 | 1.45x |
| bg/nsieve | 10000 | 6310 | 48717 | 25896 | 2943 | 10919 | 1880 | 914 | 6.90x |
| bg/reverse_complement | 16384 | 66 | 29 | 3382 | 1642 | 721 | 307 | 40 | 1.65x |
| bg/spectral_norm | 200 | 49 | 33852 | 60458 | 5188 | 32475 | 1145 | 914 | 0.05x |
| lists/fill_sum | 100 | 101 | 3717 | 2726 | 1911 | 1885 | 676 | 93 | 1.09x |
| maps/fill_sum | 100 | 426 | 8636 | 3708 | 3167 | 1575 | 566 | 1381 | 0.31x |
| math/prime_count | 100 | 35 | 1728 | 2293 | 3400 | 702 | 260 | 105 | 0.33x |
| math/sum_loop | 10000 | 2614 | 82729 | 143822 | 5779 | 30808 | 2556 | 3380 | 0.77x |
| strings/concat_loop | 30 | 918 | 1019 | 586 | 1025 | 829 | 234 | 908 | 1.01x |
Reading the columns:
- vs vm2. Geometric mean speedup is about 14x across the rows above. The largest wins are dispatch-bound math (`prime_count` 49x, `sum_loop` 32x) and f64-heavy BG (`spectral_norm` 690x, `mandelbrot` 28x, `n_body` 45x). The only regression is `reverse_complement`, which carries the MEP-39 hand-rolled super-op shape that we deliberately did not port to vm3. The generic bytes-bank lowering planned for MEP-40 §3.6 closes that gap.
- vs CPython. vm3 wins every row except `strings/concat_loop` (918 µs against 586 µs), by margins from about 4x (`nsieve`) to about 1200x (`spectral_norm`). PyPy is the faster peer on long-running BG kernels but vm3 still beats it everywhere except `nsieve` N=10000, which is a workload PyPy's tracing JIT specializes for very well.
- vs Lua and LuaJIT. Lua trails vm3 on every row except `strings/concat_loop`. LuaJIT is the closest peer: in the table above it beats vm3 on `nsieve`, a tight integer loop that suits LuaJIT's trace recorder almost perfectly, and on `strings/concat_loop`, and it ties on `math/sum_loop`. On every other kernel vm3 wins by about 1.2x to 23x.
- vs Go. Nine of the fourteen rows are at or below about 1.0x of Go (vm3 is as fast or faster; this counts `strings/concat_loop` at 1.01x). Two more are within 1.2x. `n_body` (1.45x) and `reverse_complement` (1.65x) trail moderately, and only `nsieve` is meaningfully slower at 6.9x, which is a known JIT gap on `OpListPush` that closes in v0.11.1. The `prime_count` 3x lead over Go comes from vm3 emitting `cmp`/`b.cond` directly while Go's SSA backend emits stack frame and bounds-check stubs on the inner loop. The `spectral_norm` 19x lead comes from vm3 hoisting the loop-invariant f64 constants and using `movaps` instead of `movsd` to break the upper-half false dependency (see MEP-40 §6.3.4.n.11-n.13). The `maps/fill_sum` 3.2x lead comes from vm3's i64-keyed map being a 32-byte slab entry with no interface boxing, while Go's `map[int64]int64` pays the per-entry header tax.
Full sweep results are checked in at bench/out/v0.11.0/crosslang-bg.md
and bench/out/v0.11.0/crosslang-math.md so anyone can reproduce them
by running go run ./bench/crosslang -repeat 5.
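The per-row vm2 speedups are ratios of the vm2 and vm3 columns, and the summary figure is their geometric mean. The following Go sketch shows that arithmetic with the medians copied from the table; the `row` type and the `GeoMeanSpeedup` helper are illustrative names and are not part of the bench harness.

```go
package main

import (
	"fmt"
	"math"
)

// row holds the vm3 and vm2 medians in microseconds, copied from the table.
type row struct {
	name     string
	vm3, vm2 float64
}

var rows = []row{
	{"bg/binary_trees", 22485, 31038},
	{"bg/fannkuch_redux", 132, 3938},
	{"bg/fasta", 915, 2507},
	{"bg/k_nucleotide", 1456, 29769},
	{"bg/mandelbrot", 998, 28036},
	{"bg/n_body", 364, 16420},
	{"bg/nsieve", 6310, 48717},
	{"bg/reverse_complement", 66, 29},
	{"bg/spectral_norm", 49, 33852},
	{"lists/fill_sum", 101, 3717},
	{"maps/fill_sum", 426, 8636},
	{"math/prime_count", 35, 1728},
	{"math/sum_loop", 2614, 82729},
	{"strings/concat_loop", 918, 1019},
}

// GeoMeanSpeedup returns the geometric mean of vm2/vm3 over the rows,
// computed in log space to avoid overflow on long inputs.
func GeoMeanSpeedup(rs []row) float64 {
	sum := 0.0
	for _, r := range rs {
		sum += math.Log(r.vm2 / r.vm3)
	}
	return math.Exp(sum / float64(len(rs)))
}

func main() {
	for _, r := range rows {
		fmt.Printf("%-24s %8.2fx\n", r.name, r.vm2/r.vm3)
	}
	fmt.Printf("geometric mean: %.1fx\n", GeoMeanSpeedup(rows))
}
```

A geometric mean is the usual choice for ratios like these because a single extreme row such as `spectral_norm` would dominate an arithmetic mean.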
1.2 What the JIT compiles
After Phase 6.3.4 closeout, vm3jit covers:
- All i64 arithmetic, comparison, branch, and call opcodes on both backends, including signed magic-multiply for `OpDivI64K`/`OpModI64K` and pow2 shortcuts.
- All f64 arithmetic, including FMA fusion (`MulF64` + `Add/SubF64` collapses to `FMADD`/`FMSUB` on ARM64 and `VFMADD`/`VFNMADD` on AMD64). F64 constant cache pre-loads loop-invariant constants into xmm scratch in the prologue.
- `F64Array` and `I64Array` typed-array opcodes including bounds-checked get/set, with data-pointer hoisting for constant-cell base indices.
- Cell-bank ops: `OpListGetI64K`, `OpListSetI64`, `OpListPushI64`, `OpMapGetI64I64`, `OpMapSetI64I64`, `OpNewMap` with capacity hint, `OpNewPair`, `OpPairFst`, `OpPairSnd`, `OpLookupI64KW`.
- Mixed-bank calls: `OpCallMixed`, `OpTailCallMixed` with self-call fast path. Self-recursive kernels (`fact_rec`, `fib_rec`, `binary_trees`) stay inside one trampoline call.
- 256-entry switch-lookup table for `switch_lookup8_table`-shaped dispatch.
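As an illustration of the pow2 shortcut mentioned in the first bullet, the sketch below shows the standard shift-based form of signed division by a constant power of two. It assumes Mochi integer division truncates toward zero as Go's does, which these notes do not state; `DivPow2` and `ModPow2` are hypothetical names, not vm3jit functions.

```go
package main

import "fmt"

// DivPow2 returns x / (1 << k) rounded toward zero, for 0 < k < 63.
// A plain arithmetic shift rounds toward negative infinity, so negative
// inputs first get a bias of 2^k - 1 added.
func DivPow2(x int64, k uint) int64 {
	bias := (x >> 63) & (int64(1)<<k - 1) // 2^k - 1 if x < 0, else 0
	return (x + bias) >> k
}

// ModPow2 returns the matching remainder, which carries the sign of x.
func ModPow2(x int64, k uint) int64 {
	return x - DivPow2(x, k)<<k
}

func main() {
	for _, x := range []int64{7, -7, 8, -8, -1} {
		fmt.Println(x, DivPow2(x, 2), ModPow2(x, 2))
	}
}
```

The same bias-then-shift shape is what a compiler emits in place of a hardware divide when the divisor is a known power of two; non-power-of-two constants need the magic-multiply form instead.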
The interpreter handles everything the JIT cannot compile. If a
function exceeds the JIT's register cap or contains an unlowered op,
CompileProgram silently leaves its JITCode nil and the interpreter
runs it at full interp speed. There is no warning, no fallback noise:
the JIT is a transparent accelerator.
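The fallback rule can be sketched in a few lines of Go. This is a minimal model under assumptions: the `Function` fields, the `maxJITRegs` cap value, the `Call` helper, and this `CompileProgram` signature are all hypothetical stand-ins, and "native code" is simulated by reusing the interpreter closure.

```go
package main

import "fmt"

// Function is a hypothetical compiled unit.
type Function struct {
	Name    string
	NumRegs int
	Interp  func(n int64) int64 // stand-in for the bytecode interpreter
	JITCode func(n int64) int64 // nil when the JIT declined the function
}

const maxJITRegs = 16 // hypothetical register cap

// CompileProgram attaches native code where it can and silently leaves
// JITCode nil where it cannot.
func CompileProgram(fns []*Function) {
	for _, f := range fns {
		if f.NumRegs > maxJITRegs {
			continue // over the cap: no warning, JITCode stays nil
		}
		f.JITCode = f.Interp // stand-in for emitted machine code
	}
}

// Call runs native code when present and the interpreter otherwise.
// The second result reports which path ran, for demonstration only.
func Call(f *Function, n int64) (result int64, native bool) {
	if f.JITCode != nil {
		return f.JITCode(n), true
	}
	return f.Interp(n), false
}

func main() {
	double := func(n int64) int64 { return 2 * n }
	small := &Function{Name: "small", NumRegs: 4, Interp: double}
	wide := &Function{Name: "wide", NumRegs: 40, Interp: double}
	CompileProgram([]*Function{small, wide})
	for _, f := range []*Function{small, wide} {
		v, native := Call(f, 21)
		fmt.Println(f.Name, v, native)
	}
}
```

Both functions return the same value; only the execution path differs, which is what makes the accelerator transparent to callers.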
1.3 Arena memory model
runtime/vm3 holds eleven typed arena tags: int, float, string, list,
map, set, struct, pair, closure, bytes, and any. Each arena is a slab
of fixed-size entries with a 12-bit generation field per slot. Handles
are 8 bytes: 4 bits arena tag, 12 bits generation, 48 bits slot index.
Generation bumps on slot reuse so a stale handle reads back as a
typed error rather than aliasing fresh data. Arenas.Reset() returns
every slab to its free list in one pass.
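A minimal Go sketch of the handle layout and the generation check follows. The field widths (4, 12, 48) come from the text above; the bit order (tag in the top bits, slot index in the low bits), the `I64Arena` type, the `tagInt` value, and all method names are assumptions made for illustration. In this sketch the generation is bumped when a slot is released for reuse.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	slotBits = 48
	genBits  = 12
	slotMask = (uint64(1) << slotBits) - 1
	genMask  = (uint64(1) << genBits) - 1
	tagInt   = 1 // hypothetical tag value for the int arena
)

// Handle packs tag, generation, and slot index into one 8-byte word.
type Handle uint64

func Pack(tag uint8, gen uint16, slot uint64) Handle {
	return Handle(uint64(tag&0xF)<<(slotBits+genBits) |
		(uint64(gen)&genMask)<<slotBits |
		slot&slotMask)
}

func (h Handle) Tag() uint8   { return uint8(uint64(h) >> (slotBits + genBits)) }
func (h Handle) Gen() uint16  { return uint16((uint64(h) >> slotBits) & genMask) }
func (h Handle) Slot() uint64 { return uint64(h) & slotMask }

var ErrStale = errors.New("stale handle")

// I64Arena is a hypothetical single-type slab with per-slot generations.
type I64Arena struct {
	vals []int64
	gens []uint16
	free []uint64
}

func (a *I64Arena) Alloc(v int64) Handle {
	if n := len(a.free); n > 0 {
		slot := a.free[n-1]
		a.free = a.free[:n-1]
		a.vals[slot] = v
		return Pack(tagInt, a.gens[slot], slot)
	}
	a.vals = append(a.vals, v)
	a.gens = append(a.gens, 0)
	return Pack(tagInt, 0, uint64(len(a.vals)-1))
}

// Free bumps the slot's generation so outstanding handles stop matching.
// The 12-bit counter wraps after 4096 reuses of the same slot.
func (a *I64Arena) Free(h Handle) {
	slot := h.Slot()
	a.gens[slot] = (a.gens[slot] + 1) & uint16(genMask)
	a.free = append(a.free, slot)
}

// Get returns ErrStale instead of aliasing whatever now lives in the slot.
func (a *I64Arena) Get(h Handle) (int64, error) {
	slot := h.Slot()
	if slot >= uint64(len(a.vals)) || a.gens[slot] != h.Gen() {
		return 0, ErrStale
	}
	return a.vals[slot], nil
}

func main() {
	var a I64Arena
	h1 := a.Alloc(41)
	a.Free(h1)
	h2 := a.Alloc(42) // reuses slot 0 under generation 1
	_, err := a.Get(h1)
	v, _ := a.Get(h2)
	fmt.Println(err, v) // prints: stale handle 42
}
```

Because a handle is an integer rather than a pointer, the Go garbage collector never traces through it; liveness of the data is decided by the slab, not by the handle.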
The bench harness can call Reset between invocations. Long-running
processes (REPL, language server) currently leak slabs until the
process exits. A reuse policy lands in v0.11.1; see MEP-40 §11.2.
2. Null safety: MEP-16 ships
Mochi now has option types with first-class language support. This work spans the parser, type checker, and runtime, and changes how nullable values are written and reasoned about.
2.1 null is now spelled none
var x: int? = none // was: var x: int? = null
if y != none { print(y) } // was: if y != null { print(y) }
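null is removed from the keyword set. The migration is a one-line
sed, shown here in GNU sed syntax (BSD sed on macOS needs `sed -i ''`). Review the diff afterwards, because the pattern also rewrites the word inside string literals and comments: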
find . -name '*.mochi' | xargs sed -i 's/\bnull\b/none/g'
This is the one breaking change in v0.11.0. Source that uses null
fails to parse with a clear error pointing at the new keyword. We
considered keeping both spellings for one release; the parser cost is
small but the type-checker cost is not, and Mochi is small enough that
a one-time migration is cheaper than carrying both names.
2.2 New operators
Three new operators:
- `?.` safe call. `xs?.first` returns `none` if `xs` is `none`, otherwise calls `first` and wraps the result in `T?`. Chains like `a?.b?.c` short-circuit on the first `none`.
- `?[ ]` safe index. `m?["k"]` returns `none` if `m` is `none`, otherwise evaluates `m["k"]`. Index of a map returns `V?` directly, so `m["k"]?.length` is the common shape.
- `??` coalesce. `x ?? 0` returns `x` if `x` is non-`none`, otherwise the right-hand value. The right-hand value is lazy: it is not evaluated if the left is non-`none`.
fun length_of(s: string?) -> int {
  return s?.length ?? 0
}
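For readers who think in Go, the semantics of `?.` and `??` can be modeled with a small generic option type. This is a sketch of the described behavior, not of how the Mochi runtime represents options: `Option`, `Some`, `None`, `SafeCall`, and `Coalesce` are hypothetical names. The right-hand side of the coalesce is passed as a function so that it only runs when the left side is none, matching the laziness rule above.

```go
package main

import "fmt"

// Option models Mochi's T? for illustration; ok == false plays the role
// of none.
type Option[T any] struct {
	val T
	ok  bool
}

func Some[T any](v T) Option[T] { return Option[T]{val: v, ok: true} }
func None[T any]() Option[T]    { return Option[T]{} }

// SafeCall models x?.f: none stays none, otherwise the result of f is
// wrapped back into an option.
func SafeCall[T, U any](x Option[T], f func(T) U) Option[U] {
	if !x.ok {
		return None[U]()
	}
	return Some(f(x.val))
}

// Coalesce models x ?? rhs. rhs is a thunk, so it is evaluated only when
// x is none.
func Coalesce[T any](x Option[T], rhs func() T) T {
	if x.ok {
		return x.val
	}
	return rhs()
}

// LengthOf mirrors length_of from the notes: s?.length ?? 0.
// len counts bytes here, which is enough for the illustration.
func LengthOf(s Option[string]) int {
	n := SafeCall(s, func(v string) int { return len(v) })
	return Coalesce(n, func() int { return 0 })
}

func main() {
	fmt.Println(LengthOf(Some("mochi")), LengthOf(None[string]())) // prints: 5 0
}
```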
2.3 Option-aware indexing and aggregates
The standard library is retyped:
- `map<K, V>` indexing returns `V?`, not `V`. Code that was `xs["k"] + 1` becomes `(xs["k"] ?? 0) + 1`.
- `first(list<T>)` returns `T?`.
- `last(list<T>)` returns `T?`.
- `find(list<T>, fun(T) -> bool)` returns `T?`.
The compiler can rewrite the old shape to the new one mechanically where the surrounding code can show the value is always present, but the safe default is to coalesce explicitly.
2.4 Option narrowing
The type checker narrows option bindings into their T form along
each path where the option can be proven non-none:
fun greet(name: string?) {
  if name != none {
    print(name.length)   // narrowed: name : string here
  }
  if name == none { return }
  print(name.length)     // narrowed after early-return
  let n = name ?? "anon"
  print(n.length)        // n : string from the coalesce
}
Narrowing applies inside:
- `if` conditions (`if x != none`, `if x == none`)
- `&&` and `||` chains (`if x != none && x.length > 0`)
- `match` arms on option scrutinees
- `let` bindings after `??` coalesce
- The two arms of a join with option-typed keys (`left join`, `right join`, `outer join`)
Narrowing is invalidated when:
- The variable is reassigned (`x = something`)
- An impure call is made on the same path (`mutate(); x.length` is no longer narrowed)
- A closure captures the variable (capture entry resets the narrow)
The "impure call invalidates narrowing" rule depends on the effect system landing in Mochi (see §3). Functions marked pure can be called without invalidating narrows on captured options.
2.5 Option vs non-option comparisons
x == y where x: int? and y: int now fires error T059. The
intended forms are x == none, x == some(y), or (x ?? default) == y.
This catches a class of bugs where a missing value was being compared
to a sentinel and silently succeeding because both sides were boxed.
2.6 New error codes
| Code | Meaning |
|---|---|
| T057 | sort by / distinct on unordered or non-hashable type |
| T058 | option narrowed but used as non-option after invalidation |
| T059 | option vs non-option comparison |
3. Effects: MEP-15 ships stages 1 through 3d
Mochi grows a lightweight effect system. The goal is not a full algebraic-effects calculus. It is two narrower jobs: tell the checker when a call might mutate or do I/O, and let users annotate their own functions to be explicit.
3.1 The ! annotation
fun read_config() -> Config !io {
  let f = open("config.yaml")
  return parse(f.read())
}

fun pure_add(a: int, b: int) -> int {
  return a + b
}
Effects appear after the return type, separated by !. The built-in
labels are io, mut, panic, and the wildcard any. Users can
declare custom labels.
The checker walks function bodies and infers effects bottom-up:
- A call to a `!io` function makes the caller `!io` unless the caller is annotated otherwise.
- A higher-order call inherits its callback argument's effect set (this is Stage 3d: a `fun(x) -> y !mut` passed to `map` makes the `map` call produce `!mut`).
- Untyped functions are inferred from their body.
- Pure functions (no annotation, no effect inference) can be called inside a narrowed-option scope without invalidating the narrow.
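The bottom-up rule in the list above amounts to propagating effect sets from callees to callers until nothing changes. A minimal Go sketch of that idea follows; it is not the Mochi checker. The `Func` record, the bit-set encoding, and `Infer` are hypothetical, callees that are not in the program are treated as pure, and a callback passed to a higher-order function can be modeled as one more entry in `Calls`.

```go
package main

import "fmt"

// Effect is a bit set over the built-in labels named in the notes.
type Effect uint8

const (
	IO Effect = 1 << iota
	Mut
	Panic
)

// Func is a hypothetical checker record: the effects the body produces
// directly, plus the names of the functions it calls.
type Func struct {
	Own   Effect
	Calls []string
}

// Infer propagates effects from callees to callers until a fixed point
// is reached, which also covers recursive and mutually recursive calls.
func Infer(prog map[string]Func) map[string]Effect {
	eff := make(map[string]Effect, len(prog))
	for name, f := range prog {
		eff[name] = f.Own
	}
	for changed := true; changed; {
		changed = false
		for name, f := range prog {
			for _, callee := range f.Calls {
				if merged := eff[name] | eff[callee]; merged != eff[name] {
					eff[name] = merged
					changed = true
				}
			}
		}
	}
	return eff
}

func main() {
	prog := map[string]Func{
		"open":        {Own: IO},
		"parse":       {},
		"read_config": {Calls: []string{"open", "parse"}},
		"pure_add":    {},
		"main":        {Calls: []string{"read_config", "pure_add"}},
	}
	eff := Infer(prog)
	fmt.Println(eff["read_config"]&IO != 0, eff["main"]&IO != 0, eff["pure_add"] == 0)
	// prints: true true true
}
```

The loop terminates because effect sets only ever grow and the label set is finite.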
3.2 New error codes
| Code | Meaning |
|---|---|
| T044 | effect label not declared (widened in stage 3a to name the label) |
| T064 | function body has effect not declared in the signature |
| T065 | callsite expects pure but callee has effects |
| T066 | reserved for pure-position checks (stage 3c) |
4. Generics and subtyping: MEPs 11 and 12 closed out
Both MEPs are now Active in the index with a delivery-status section
in each file.
- MEP-11 (Subtyping and Variance) ships covariance for `list<T>` and `set<T>` reads, contravariance for `fun(T) -> U` arguments, and invariance everywhere else. The cast surface (`expr as T`) is pinned to the runtime behavior in MEP-11 §7.
- MEP-12 (Parametric Polymorphism) ships generic functions with fresh type variables per call. `concat`, `collect`, `push`, `append`, and `reverse` are all retyped as parametric and routed through the unifier. The T048 message strips the `TypeVar` prefix so error output matches user-visible names.
5. ADTs and pattern matching: MEP-13 partial
Four pieces of MEP-13 landed:
- Recursive ADT support (a variant constructor can refer to its own type).
- Match irredundancy check (a `match` arm that cannot be reached fires an error).
- Struct literal completeness (every field must be assigned, or the literal must use `..rest` spread syntax).
- Multi-missing variant exhaustiveness (a `match` over an ADT must cover every constructor, and the error names all missing ones).
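The multi-missing exhaustiveness check reduces to a set difference between the constructors of the ADT and the constructors named by the arms. A minimal Go sketch, under stated assumptions: `MissingVariants` is a hypothetical helper, patterns are flat constructor names with no nesting, and `_` is assumed to be a catch-all arm, which these notes do not confirm for the Mochi grammar.

```go
package main

import (
	"fmt"
	"sort"
)

// MissingVariants returns every constructor not covered by an arm, in
// sorted order, so one error can name all of them at once.
func MissingVariants(ctors, arms []string) []string {
	covered := make(map[string]bool, len(arms))
	for _, a := range arms {
		if a == "_" {
			return nil // assumed catch-all arm covers everything
		}
		covered[a] = true
	}
	var missing []string
	for _, c := range ctors {
		if !covered[c] {
			missing = append(missing, c)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	ctors := []string{"Leaf", "Node", "Empty"}
	fmt.Println(MissingVariants(ctors, []string{"Leaf"}))      // prints: [Empty Node]
	fmt.Println(MissingVariants(ctors, []string{"Leaf", "_"})) // prints: []
}
```

Reporting the full list in one pass saves the fix-one-recompile loop that a first-missing-only error forces.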
The MEP is still Draft because the user-facing match grammar is
under active discussion; the surface that shipped in v0.11.0 is the
non-controversial core.
6. Query algebra: MEP-14 hardened
Several silent-success holes were closed:
- `sort by` on an unordered type now errors (T056).
- `select distinct` on a non-hashable type now errors (T057).
- `where` and `having` clauses with non-bool predicates fire T033 / T042 directly, instead of inferring `bool` from `any`.
- `skip` and `take` with non-int arguments error.
- `select` with a grouped aggregate returns `[T]`, not `T`.
- `left`, `right`, and `outer` join cardinality is pinned, and the outer-side projection is option-retyped (per MEP-16 R10).
7. Soundness: MEP-10 holes B3, B6 closed
MEP-10 is the running list of soundness gaps. The B-series gaps closed in v0.11.0:
- B3b (call-arg invariance under aliasing): an argument that is aliased to a mutable field can no longer be passed where the callee's parameter is invariant.
- B3c (index/field aliasing widen hole): `xs[0]` cannot be widened to a supertype when `xs` is invariantly typed.
- B3d (LHS index/field aliasing): same fix on the assignment LHS.
- B3e (literal element aliasing widen): list literals stop widening their element type past the declared bound.
- B6 (query select fallback): query select returns the precise element type after MEP-12.4 (collect generic) instead of falling back to `any`.
8. New MEP drafts
Three new drafts opened on main:
- MEP-41 (Memory Safety) scopes capability handles and aligns the arena model with the CISA memory-safety roadmap.
- MEP-42 (Native Code Emission) generalizes vm3jit's copy-and-patch backend into a portable C-as-target AOT and Wasm-first cross-platform emission path.
- MEP-43 (Zero-Boilerplate Go Transpiler and Go FFI) designs a Go transpiler and FFI built on `go/types` introspection. No per-package hand-written shim, no runtime reflection registry. The legacy `Call(name, args...)` shape is foreclosed.
All three are Draft and their full text is at
mochi-lang.dev/docs/mep.
9. Tooling and harness
- `bench/vm3runner` is the new crosslang subprocess for the vm3 stack. It mirrors `bench/vm2runner` and routes through `compiler3.corpus + runtime/vm3 + runtime/jit/vm3jit`.
- `bench/crosslang` grows a `vm3` column. The default `-langs` value is `vm3,vm2,py,pypy,lua,luajit,go`, so a bare invocation produces the full sweep.
- The markdown renderer baselines ratios on the vm3 column when vm3 is run, falling back to vm2 otherwise. The output match check reports the offending lang label when peers disagree.
- The legacy interpreter package was archived and tree-sitter dependencies were removed from `go.mod`. The Mochi grammar is now hand-written in `parser/`, and the IDE plugins use Prism (web) and TextMate grammars (editors). This removes a 12 MB native dependency from every build and drops the static binary by about 8 MB.
10. Compatibility
- Breaking: `null` is renamed to `none`. See §2.1 for the sed one-liner.
- The default VM is `vm3`. Pass `-vm=vm2` to `mochi run` to use the prior interpreter. The vm2 flag stays available for the v0.11.x series and is removed in v0.12.0.
- New operators `?.`, `?[ ]`, and `??` are additions; existing source is unaffected.
- The retype of `map<K, V>` index to `V?`, of `first`/`last`/`find` to option-returning, and the various T057/T058/T059 errors are potentially breaking for code that was silently relying on missing-key behavior. The compiler error message points at the call site and suggests the `??` coalesce.
11. Upgrade
curl -fsSL https://get.mochi-lang.dev | sh
mochi --version # 0.11.0
Or, with Docker:
docker pull ghcr.io/mochilang/mochi:0.11.0
Or, from source:
git pull && make build
12. Acknowledgements
This release stitches together work from MEPs 5, 7, 10, 11, 12, 13, 14, 15, 16, 21, 23, 38, 39, 40, 41, 42, and 43. The phased plan in MEP-40 §10 ran for six months from the first vm3 cell layout PR to the closing AMD64 wide-K k_nucleotide commit. The MEP-39 §6.16 diagnostic apparatus was the load-bearing tool throughout: every phase update used the same residual-breakdown shape so progress was legible across PRs.
The next release, v0.11.1, picks up the nsieve Cell-bank gap, the
n_body and spectral_norm output-fold correctness items called out
in MEP-40 §15.4, and the bytes-bank lowering that closes
reverse_complement. The v0.12.0 cut starts the tracing JIT work
sketched at MEP-40 §11.6.