MEP-46 research note 05, Codegen design: choosing an IR layer for Mochi → BEAM

Author: research pass for MEP-46 (Mochi → Erlang/BEAM transpiler). Date: 2026-05-22 (GMT+7). Target runtime: Erlang/OTP 27 LTS and OTP 28, with forward compatibility to OTP 29 (May 2026).

1. Why the choice of IR matters

The BEAM is unusual among managed runtimes in that it exposes not one but six plausible "front-doors" for a code generator. We can emit Erlang source text and shell out to erlc, we can hand the compiler an Erlang abstract format parse tree, we can drop one level down and emit Core Erlang, we can drop further to Kernel Erlang, or we can go all the way to BEAM assembly (.S) or a raw .beam binary. Each layer is real, each is reachable from outside the compiler, and each trades source-level fidelity against control over the generated bytecode.

The decision is load-bearing for a transpiler in three different ways. First, every optimisation pass below the chosen layer still runs for free; every pass above it must be reproduced by us. Second, the developer tools that ship with OTP (Dialyzer, the debugger, cover, observer, dbg, recon) attach themselves at particular layers, and emitting below those layers silently turns the tools off. Third, hot code loading and the JIT (BeamAsm, since OTP 24) interact with how the loader sees a module; if Mochi emits anything that the loader does not recognise as well-formed, we lose both hot reload and the JIT.

This document surveys all seven candidates (the six layers above plus a custom NIF runtime) and lands on a recommendation.

2. The candidate IR layers

The full pipeline used inside OTP 27/28 (lib/compiler/src/) is, in order:

  1. Source text (.erl)
  2. Erlang abstract format (erl_parse parse tree, also called "abstract code")
  3. Core Erlang (cerl module, .core listing, v3_core pass produces it)
  4. Kernel Erlang (v3_kernel pass, no public file format)
  5. BEAM SSA (since the OTP 22 SSA rewrite: beam_kernel_to_ssa → beam_ssa_opt → beam_ssa_pre_codegen → beam_ssa_codegen)
  6. BEAM assembly (.S listing)
  7. BEAM binary (.beam file, loaded by code:load_binary/3)
  8. Native machine code (BeamAsm JIT, x86-64 and aarch64, at load time)

compile:forms/2 accepts either layer 2 or layer 3 as input: the abstract format is the default, and from_core selects Core Erlang. (from_abstr is the compile:file/2 counterpart, for reading abstract forms from a .abstr file.) from_asm reads layer 6 from a file. There is no public entry point for Kernel Erlang or for the SSA layer: those are deliberately internal. The candidate IRs from the design brief therefore map as follows:

| Candidate              | OTP layer | Public entry point                |
|------------------------|-----------|-----------------------------------|
| Erlang source text     | 1         | compile:file/2                    |
| Erlang abstract format | 2         | compile:forms/2                   |
| Core Erlang            | 3         | compile:forms/2 with from_core    |
| Kernel Erlang          | 4         | none (internal)                   |
| BEAM assembly (.S)     | 6         | compile:file/2 with from_asm      |
| BEAM bytecode          | 7         | beam_asm, code:load_binary/3      |
| Custom NIF             | n/a       | C / Rust runtime via erl_nif      |

3. Layer-by-layer analysis

3.1 Erlang source text (.erl)

This is what Elixir and Gleam emit. Pretty-print the program as Erlang text, write it to a temp directory, and shell out to erlc (or invoke compile:file/2 in-process). Documentation: the Erlang reference manual at erlang.org/doc/system/reference_manual.html.

Stability: the source language is the most stable thing on the BEAM. The status of maybe as a reserved word, the 0.0 / -0.0 distinction, and triple-quoted strings (EEP 64) all changed between OTP 26 and OTP 27, but those changes are tiny relative to anything below. Cross-version stability OTP 26 → 27 → 28 → 29 is essentially perfect for a code generator that controls its own output (we never write maybe as an unquoted atom, we never compare floats with =:= against 0.0).

Optimisations preserved: all of them. Every compiler pass runs. Inlining, type inference, binary matching specialisation, the SSA optimiser, tuple-element specialisation in BeamAsm, everything.

Tooling: full. Dialyzer reads the abstract code chunk (debug_info) produced by the same compile, the debugger sets breakpoints on line numbers, cover instruments by line, observer shows module names, dbg traces by Module:Function/Arity. Hot code loading: native, no caveats. JIT: full participation.

Downsides for Mochi specifically:

  • String generation is a serialisation tax. We build an AST, print it as text, then erl_parse re-parses it. The round-trip can be 30 to 40 percent of total compile time for a small module (Gleam reports the Erlang back-end takes the bulk of compile time on its target, because the Erlang parser then has to be invoked).
  • Pretty-printing requires re-quoting. Mochi identifiers like for' or those containing dots have to be escaped to legal Erlang atoms (single-quoted), and we have to keep track of which identifiers we mangled.
  • Source positions get lost or fabricated. Every Erlang line number we emit is synthetic, which means Dialyzer warnings, stack traces, and cover reports point into a file the user never wrote.
  • No way to express Core-Erlang-only constructs. letrec, apply to a function value, primop, and inter-module fully-qualified calls are all available in Core Erlang but only indirectly in Erlang surface syntax (fun M:F/N and similar). Some Mochi constructs (mutually-recursive local closures, primops for our LINQ runtime) translate awkwardly through the surface layer.

Real-world systems at this layer: Elixir (after macro expansion, Elixir does emit through compile:forms/2 with the abstract format, not source text, but conceptually it is the same layer), Gleam (true source text), Caramel (Erlang source), Erlog/Lux (Erlang source). The pattern is widespread for one reason: it is the lowest-effort path.

3.2 Erlang abstract format

The abstract format is the term-tree that erl_parse:parse_form/1 returns, documented at erlang.org/doc/apps/erts/absform.html. A function head looks like {function, ANNO, Name, Arity, Clauses}; a case looks like {'case', ANNO, Expr, Clauses}; atoms are {atom, ANNO, foo}; and so on. compile:forms/2 accepts a list of these. Elixir's elixir_erl.erl builds exactly this representation and hands it to compile:forms/2.
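
To make the tuple shapes above concrete, here is a minimal sketch that hand-builds the abstract format for a one-function module and passes it to compile:forms/2. The module names absform_sketch and mochi_abs_demo, the function double/1, and the file name given to code:load_binary/3 are all hypothetical, chosen for illustration only; a real generator would also carry column information in ANNO.

```erlang
-module(absform_sketch).
-export([run/0]).

%% Abstract format for:
%%   -module(mochi_abs_demo).
%%   -export([double/1]).
%%   double(X) -> 2 * X.
%% The integer in each tuple is the ANNO field (here a plain line number).
forms() ->
    [{attribute, 1, module, mochi_abs_demo},
     {attribute, 2, export, [{double, 1}]},
     {function, 3, double, 1,
      [{clause, 3, [{var, 3, 'X'}], [],
        [{op, 3, '*', {integer, 3, 2}, {var, 3, 'X'}}]}]}].

%% Compile in memory, load the result, and call the generated function.
run() ->
    {ok, Mod, Bin} = compile:forms(forms(), [binary, return_errors]),
    {module, Mod} = code:load_binary(Mod, "mochi_abs_demo.mochi", Bin),
    Mod:double(21).
```

No text is printed or parsed at any point; the list of tuples is the whole compiler input.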

Stability: the format is informally stable but explicitly versioned. The abstract_code chunk has been tagged {raw_abstract_v1, AbstractCode} since OTP R9C; the version tag exists precisely because Ericsson reserves the right to change it. The abstract format docs say "the format of such terms can change between releases", and they have: maps, binary comprehensions, the maybe expression, type annotations, and -spec syntax have all extended the format over the last decade. None of those changes broke existing tag patterns, but they did add new ones.

Optimisations preserved: all of them. The compiler's first real pass is v3_core which lowers abstract format to Core Erlang, so feeding it abstract format is identical to feeding it source (the parser is skipped, nothing else).

Tooling: full. The abstract code chunk is what Dialyzer wants. The debugger ultimately reads abstract code. Source locations are first-class (ANNO carries line and column), and we can put any annotation we like on a node.

Hot code reload: full. JIT: full.

This layer is the canonical "I am a BEAM language frontend" choice. Elixir uses it, and Elixir is the BEAM's largest non-Erlang language.

Downsides for Mochi:

  • The format is Erlang-shaped. Multi-value returns, fully-qualified primops, and explicit letrec semantics are not in the surface, so we still pre-lower them ourselves into ordinary case/fun/apply expressions before printing them as abstract terms.
  • Pattern matching is still surface-syntax patterns. All exhaustiveness work happens later, in v3_kernel. That is fine, but if we want to surface Mochi's exhaustiveness diagnostics ourselves, we will be running our own decision tree compiler anyway and there is little benefit to feeding the compiler patterns it must re-analyse.
  • Some constructs require parse-transform-like ceremony. For example, try ... catch ... after in abstract form is a chunky nested tuple; macros and records are gone by the time the abstract reaches the compiler but record_info/2 references need special handling.

3.3 Core Erlang

Core Erlang is the documented, stable intermediate language between v3_core and v3_kernel. Specification: Carlsson, Gustavsson, Johansson, Lindgren, Nystrom, Pettersson, Virding, Core Erlang 1.0.3 Language Specification, November 2004, hosted at https://www.it.uu.se/research/group/hipe/cerl/doc/core_erlang-1.0.3.pdf. The OTP team also publishes a tutorial blog post, Core Erlang by Example at https://www.erlang.org/blog/core-erlang-by-example/, and the cerl module reference at https://www.erlang.org/doc/apps/compiler/cerl.html documents the constructor API.

Stability. Core Erlang is explicitly described as the layer "less complicated than Erlang, more suited than the abstract format for code analyzing tools (such as Dialyzer) and optimizers." The 1.0.3 spec has been the same since 2004. New Erlang features (maps, binary comprehensions, the maybe expression) get lowered into existing Core Erlang constructs by v3_core; the Core layer itself almost never changes. The to_core listing option has worked identically since at least OTP 17. Cross-version stability OTP 26 → 27 → 28 → 29: excellent; nothing about how compile:forms/2 consumes Core Erlang has shifted.

The cerl constructor API. This is the public, documented Erlang module for building Core Erlang trees programmatically. The constructor surface (from erlang.org/doc/apps/compiler/cerl.html):

Module-level:

  • cerl:c_module(Name, Exports, Definitions) and cerl:c_module(Name, Exports, Attributes, Definitions) build a module. Name is c_atom(foo), exports are a list of c_fname/2 results, definitions are {c_fname, c_fun} pairs.
  • cerl:c_fname(Name, Arity) — abstract function-name variable, syntactic sugar for c_var({Name, Arity}).

Bindings and lambdas:

  • cerl:c_var(Name) — a variable (Name is an atom, integer, or {atom, integer}).
  • cerl:c_fun(Vars, Body) builds fun (V1, ..., Vn) -> Body.
  • cerl:c_let(Vars, Arg, Body) builds let <V1, ..., Vn> = Arg in Body. Note Core Erlang's let is multi-value, which is exactly the right primitive for Mochi's destructuring let.
  • cerl:c_letrec(Defs, Body) — recursive let with mutually-recursive function definitions. There is no surface analogue; this is one of the reasons to emit at this layer.
  • cerl:c_seq(Arg, Body) — sequencing (evaluate Arg for side effect, return Body).
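
As an illustration of c_letrec, the sketch below builds two mutually recursive local functions (even/1 and odd/1) inside a single exported function and compiles the result through from_core. All names (letrec_sketch, mochi_letrec_demo, is_even/1) are hypothetical, and the example assumes non-negative integer input; it is meant to show the constructor shapes, not Mochi's actual lowering.

```erlang
-module(letrec_sketch).
-export([build/0, run/1]).

%% Shorthand for a fully qualified call into the erlang module.
bif(Name, Args) ->
    cerl:c_call(cerl:c_atom(erlang), cerl:c_atom(Name), Args).

%% fun (N) -> case N of 0 -> OnZero; Other -> apply Next(N - 1) end
step(OnZero, Next) ->
    N = cerl:c_var('N'),
    Other = cerl:c_var('Other'),
    Zero = cerl:c_clause([cerl:c_int(0)], cerl:c_atom(OnZero)),
    Rest = cerl:c_clause([Other],
                         cerl:c_apply(Next, [bif('-', [N, cerl:c_int(1)])])),
    cerl:c_fun([N], cerl:c_case(N, [Zero, Rest])).

%% is_even(X) -> letrec even/1 = ..., odd/1 = ... in apply even/1(X)
build() ->
    Even = cerl:c_fname(even, 1),
    Odd = cerl:c_fname(odd, 1),
    X = cerl:c_var('X'),
    Defs = [{Even, step(true, Odd)}, {Odd, step(false, Even)}],
    Body = cerl:c_letrec(Defs, cerl:c_apply(Even, [X])),
    Entry = cerl:c_fname(is_even, 1),
    cerl:c_module(cerl:c_atom(mochi_letrec_demo), [Entry],
                  [{Entry, cerl:c_fun([X], Body)}]).

%% Compile the Core tree, load it, and call the exported function.
run(N) when is_integer(N), N >= 0 ->
    {ok, Mod, Bin} = compile:forms(build(), [from_core, binary, return_errors]),
    {module, Mod} = code:load_binary(Mod, "mochi_letrec_demo.mochi", Bin),
    Mod:is_even(N).
```

The two letrec-bound names are visible only inside is_even/1; the compiler is left to decide how to lift them into local functions.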

Control flow:

  • cerl:c_case(Arg, Clauses) — case expression.
  • cerl:c_clause(Patterns, Body) and cerl:c_clause(Patterns, Guard, Body) — a single clause.
  • cerl:c_alias(Var, Pattern) — the Var = Pattern aliasing form.
  • cerl:c_receive(Clauses) and cerl:c_receive(Clauses, Timeout, Action) — receive expression.
  • cerl:c_try(Arg, Vars, Body, ExcVars, Handler) — try/catch.
  • cerl:c_catch(Body) — the older catch expression.

Calls:

  • cerl:c_call(Module, Name, Args) — a fully qualified inter-module call. In Core Erlang every BIF is just a fully qualified call: arithmetic is call 'erlang':'+'(X, Y), length/1 is call 'erlang':'length'(L). This is the principal mechanism for invoking BIFs from generated code.
  • cerl:c_apply(Op, Args) — intra-module application (Op is a c_fname or c_var).
  • cerl:c_primop(Name, Args): compiler primop. c_primop(c_atom('match_fail'), [...]) is how v3_kernel builds the exception that a non-matching case raises; c_primop(c_atom('raise'), [Class, Reason]) is how try re-raises. Most Mochi runtime errors will go through match_fail and a small set of custom primops.

Data:

  • cerl:c_tuple(Es), cerl:c_cons(H, T), cerl:c_nil(), cerl:c_atom(A), cerl:c_int(N), cerl:c_float(F), cerl:c_binary(Segments), cerl:c_bitstr(Val, Size, Unit, Type, Flags), cerl:c_values(Es) (a multi-value tuple, used for the right-hand side of let).
  • cerl:make_list/1 and cerl:make_list/2 build a list skeleton from a list of element nodes; cerl:abstract/1 builds a Core Erlang literal from an arbitrary Erlang term.

Maps are handled by cerl:c_map/1 with cerl:c_map_pair/2 and cerl:c_map_pair_exact/2 (and c_map_pattern/1 for the pattern side). The cerl_trees:map/2 and cerl_trees:fold/3 traversals plus cerl_clauses give us first-class clause manipulation.

For Mochi's ADTs the natural lowering is: each constructor becomes a tagged tuple {c_tag, F1, ..., Fn} constructed with cerl:c_tuple([c_atom(c_tag), F1, ..., Fn]), and exhaustive pattern matching becomes a c_case whose final clause primops a match_fail (we generate exhaustiveness ourselves so we know the final clause is dead, but emitting it preserves Dialyzer's "no_match" silence). Closures become c_funs captured by c_let. LINQ pipelines become a chain of c_let bindings to runtime helper calls in our mochi_query module.
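
A minimal sketch of that ADT lowering, under stated assumptions: a hypothetical Mochi type with constructors circle(r) and rect(w, h) lowered to the tagged tuples {circle, R} and {rect, W, H}, an area/1 function that matches on them, and a final catch-all clause that calls the match_fail primop with a {case_clause, Value} reason. The module names adt_sketch and mochi_shape_demo are illustrative, and integer arithmetic is used only to keep the example exact.

```erlang
-module(adt_sketch).
-export([build/0, load/0]).

%% BIFs are ordinary fully qualified calls into the erlang module.
mul(A, B) ->
    cerl:c_call(cerl:c_atom(erlang), cerl:c_atom('*'), [A, B]).

%% area(Shape) ->
%%   case Shape of
%%     {circle, R}  -> 3 * (R * R);
%%     {rect, W, H} -> W * H;
%%     Other        -> primop match_fail({case_clause, Other})
%%   end
build() ->
    Shape = cerl:c_var('Shape'),
    R = cerl:c_var('R'),
    W = cerl:c_var('W'),
    H = cerl:c_var('H'),
    Other = cerl:c_var('Other'),
    Circle = cerl:c_clause([cerl:c_tuple([cerl:c_atom(circle), R])],
                           mul(cerl:c_int(3), mul(R, R))),
    Rect = cerl:c_clause([cerl:c_tuple([cerl:c_atom(rect), W, H])],
                         mul(W, H)),
    %% Dead when the Mochi exhaustiveness check has passed; kept so the
    %% case is complete from the compiler's point of view.
    Reason = cerl:c_tuple([cerl:c_atom(case_clause), Other]),
    Fail = cerl:c_clause([Other],
                         cerl:c_primop(cerl:c_atom(match_fail), [Reason])),
    Area = cerl:c_fname(area, 1),
    Fun = cerl:c_fun([Shape], cerl:c_case(Shape, [Circle, Rect, Fail])),
    cerl:c_module(cerl:c_atom(mochi_shape_demo), [Area], [{Area, Fun}]).

%% Compile the Core tree in memory and load the resulting module.
load() ->
    {ok, Mod, Bin} = compile:forms(build(), [from_core, binary, return_errors]),
    {module, Mod} = code:load_binary(Mod, "mochi_shape_demo.mochi", Bin),
    Mod.
```

After adt_sketch:load(), a call such as mochi_shape_demo:area({rect, 2, 3}) goes through the decision tree that v3_kernel built from the three clauses.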

compile:forms/2 entry shape for Core Erlang. With from_core in the options list, compile:forms/2 accepts a single c_module() value (not wrapped in a list, despite the name; this is a quirk of forms/2 for the core path). Useful options:

  • from_core — input is Core Erlang.
  • binary — return a binary instead of writing a file (implicit for forms/2).
  • debug_info — store the abstract code chunk. With Core Erlang input, the abstract chunk we get back is the Core tree, tagged appropriately, which Dialyzer accepts.
  • no_lint — skip the lint pass. We rely on this because lint runs on abstract format only.
  • no_spawn_compiler_process — compile in-process (cheaper for batch compilation).
  • return_errors, return_warnings — structured diagnostics.
  • to_kernel, to_asm, S, E, P — listing options for debugging the generated module.
  • inline, inline_list_funcs, no_copt, etc. — ordinary optimisation flags work unchanged.
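
One way to explore the from_core entry point is a round trip: compile ordinary Erlang forms with to_core and binary so that the compiler returns the Core tree, print it with core_pp:format/1, then feed the same tree back through from_core. The sketch below does this for a hypothetical module mochi_rt_demo; it is a debugging aid for comparing our generated Core with what the compiler itself produces, not part of the proposed pipeline.

```erlang
-module(core_roundtrip_sketch).
-export([run/0]).

%% Parse one Erlang form from a string into abstract format.
form(Src) ->
    {ok, Tokens, _End} = erl_scan:string(Src),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

run() ->
    Forms = [form("-module(mochi_rt_demo)."),
             form("-export([inc/1])."),
             form("inc(X) -> X + 1.")],
    %% Stop at the Core Erlang stage; with 'binary' the Core tree is
    %% returned as the third element instead of being written to a file.
    {ok, Mod, Core} = compile:forms(Forms, [to_core, binary, return_errors]),
    %% Human-readable Core Erlang text for inspection.
    Listing = lists:flatten(core_pp:format(Core)),
    %% Feed the same tree back in through the from_core entry point.
    {ok, Mod, Bin} = compile:forms(Core, [from_core, binary, return_errors]),
    {module, Mod} = code:load_binary(Mod, "mochi_rt_demo.erl", Bin),
    {Mod:inc(41), Listing}.
```

Diffing Listing against the pretty-printed form of a Mochi-generated module is a cheap check that our output stays close to the shapes the optimiser already sees.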

Optimisations preserved when emitting at Core Erlang. Everything that runs after v3_core:

  • sys_core_fold: constant folding, case simplification, dead-clause elimination, beta reduction.
  • sys_core_alias: alias analysis.
  • sys_core_bsm: binary matching reshape (lifts repeated binary matches into a single match context).
  • core_transforms: user parse-transforms expressed at the Core level.
  • v3_kernel: pattern matching compilation to decision trees, variable scoping flattening, generation of explicit match_fail exits.
  • beam_kernel_to_ssa: lowering to SSA.
  • beam_ssa_opt: the heavy SSA optimiser (type-based dead test elimination, inter-procedural-on-local-calls type narrowing, tail-call detection, copy propagation, dead store elimination, alias-based binary specialisation).
  • beam_ssa_bool: short-circuit boolean optimisation.
  • beam_ssa_bsm: binary matching specialisation.
  • beam_ssa_funs: fun-creation lifting.
  • beam_ssa_pre_codegen: linear-scan register allocation.
  • beam_ssa_codegen: SSA → BEAM assembly.
  • beam_a, beam_block, beam_jump, beam_clean, beam_validator: final peephole and validation.

What is lost by emitting at Core Erlang versus the abstract format. Only v3_core itself. v3_core does:

  • Lower records to tuples (irrelevant; Mochi has no records, only its own struct lowering).
  • Lower list/binary comprehensions to recursive funs (we can target the lowered form directly).
  • Lower if, guards-with-orelse, and the maybe ... else end expression to nested case (we can target the lowered form).
  • Source-location-driven exception attribution (we set Core annotations manually).
  • Lint-time error messages (we run our own type checker so this is fine).

That is a small loss.

Tooling at Core Erlang.

  • Dialyzer: works. Dialyzer actually reads the abstract code chunk and then lowers it to Core itself for analysis. When debug_info is on and we generate from Core, the abstract code chunk is stored as {debug_info, {Backend, Data}} where Backend is core_erlang style. Dialyzer follows this transparently; this is the mechanism described in the OTP docs ("The format of debug information that is stored in BEAM files has been changed... to better support other BEAM-based languages such as Elixir or LFE."). If we want richer specs, we can store a custom backend.
  • Debugger: line-number-driven. We must thread file/line annotations through every cerl node we build (the ann_c_* variants). With annotations, the debugger can set breakpoints in the original Mochi source; without them, the debugger refuses to attach.
  • Cover: depends on parse-transform applied at the abstract level. Coverage at the Core level requires emitting our own coverage-style instrumentation or using cover_compile with a custom backend. This is a real cost.
  • Observer, dbg, recon: all module-level and function-level; they work fine regardless of source IR.
  • Stack traces: driven by file and line annotations on call sites. The Core Erlang call form carries annotations; we propagate them.

Hot code reload at Core Erlang. Identical to source. Hot code reload sees only the loaded module image and the two-version-per-module invariant; the IR we emitted from is invisible at this point. For funs, the OTP requirement that funs captured across upgrades use fun Module:Function/Arity form rather than anonymous funs translates to: we must lower anonymous Mochi closures that escape across gen_server boundaries to named c_fname values in a c_letrec, not to anonymous c_funs captured by closure. This is a Mochi code-gen rule independent of IR layer, but Core Erlang gives us the cleanest way to enforce it.

JIT at Core Erlang. BeamAsm runs at load time and consumes BEAM bytecode produced by beam_ssa_codegen. Emitting at Core Erlang means we feed the SSA optimiser the same input shape it sees from Erlang source, and BeamAsm therefore sees the same kinds of patterns it has been tuned for (tuple element fetches, immediate-tagged integer arithmetic, BIF call sequences). BeamAsm does almost no cross-instruction optimisation by design (erlang.org/doc/apps/erts/beamasm.html: "BeamAsm does hardly any cross instruction optimizations"); the heavy lifting happens in beam_ssa_opt. So the question "does emitting at a higher IR lose JIT opportunities" reduces to "does emitting at Core Erlang produce SSA that the optimiser likes". The answer, based on Elixir's and LFE's experience, is yes, provided we keep our generated Core close to what v3_core itself would produce.

Real-world systems emitting at Core Erlang: LFE (Lisp-Flavoured Erlang, by Robert Virding, one of Erlang's co-creators and a Core Erlang co-author) targets Core Erlang directly through its 3-pass front-end (macro expansion, lint, code generation). Elchemy historically targeted Core Erlang. The harp-project Core Erlang formalisation at https://github.com/harp-project/Core-Erlang-Formalization treats it as a stable target. Mochi's profile (statically typed, ADTs, closures, optionals) maps to Core Erlang the same way LFE's profile does; the precedent is direct.

3.4 Kernel Erlang

The v3_kernel pass takes Core Erlang and produces Kernel Erlang, which is "a flat version of Core Erlang with a few differences. For example, each variable is unique and the scope is a whole function. Pattern matching is compiled to more primitive operations" (BEAM Book, Stenman). This is also where the Maranget-style decision-tree pattern matching compiler lives: v3_kernel.erl implements an Augustsson-style match compiler with adjustments for Erlang's send/receive and guards. The exact lineage is closer to Simon Peyton Jones's Implementation of Functional Programming Languages than to Maranget's 2008 paper, but the output (a decision tree that tests each subterm at most once) is the same shape.

Stability: explicitly none. The BEAM Book notes "the kernel representation does not have a well defined file format". There is no from_kernel option in compile:forms/2. The Kernel Erlang record definitions live in lib/compiler/src/v3_kernel.hrl inside OTP and have changed multiple times in the SSA era.

Documentation: source code only. Tooling: none consumes Kernel Erlang as input. Hot reload / JIT: irrelevant because we cannot enter at this layer.

Verdict: not a candidate. Targeting it would mean shipping a fork of lib/compiler with every Mochi release.

3.5 BEAM assembly (.S)

The to_asm listing produces a .S file: a sequence of {Op, Arg, ...} tuples in Erlang term syntax. from_asm reads it back. The format is real but the compiler docs say plainly: "the format of assembler files is not documented, and can change between releases".

Stability: low. The set of opcodes and their operand shapes has changed in essentially every major OTP release: bs_init2 was split, make_fun3 replaced make_fun2, the swap instruction was added in OTP 25, update_record in OTP 26, several bs_* opcodes were renamed in OTP 27, and OTP 28 added executable_line for finer-grained tracing. Each of these is a quiet breaking change for a code generator that emits at this level.

Documentation: the BEAM Book (Stenman, The BEAM Book, blog.stenmans.org/theBeamBook, 1.0 released 2025) covers the instruction set in detail. The BEAM Wisdoms wiki (Lytovchenko, beam-wisdoms.clau.se) is the secondary reference. Neither is normative; both lag the current OTP by a release or two.

Optimisations preserved: only the final-stage peepholes (beam_block, beam_jump, beam_clean). All SSA-level optimisations have already happened by the time .S is produced; if we synthesise .S ourselves we re-implement them or do without them. In particular, we are responsible for our own register allocation, tail-call recognition (call_only vs call_last vs call), and frame size minimisation.

Tail calls at this layer: this is where the question "when does the BEAM emit call_only vs call?" finally bites. call_only is the tail call when no Y-register stack frame exists; call_last N L U is the tail call when a frame of U Y-register slots must be deallocated before the jump; call is the non-tail call that pushes a continuation pointer. beam_ssa_codegen chooses between them based on whether the call is in tail position in the SSA graph and on the live Y-register set at that point. If we emit .S ourselves we must reproduce this analysis; if we emit Core Erlang, beam_ssa_codegen does it for us. Mochi has explicit tail-call requirements (LINQ pipeline tails, agent loops) and we want the compiler doing this.

Tooling: partial. beam_validator will run and reject malformed .S. Dialyzer cannot; it has no abstract code chunk to read unless we synthesise one. The debugger cannot attach. cover cannot instrument.

Verdict: high cost, narrow benefit. Pass.

3.6 BEAM bytecode (.beam) directly via beam_asm

The beam_asm module assembles .S-shaped instructions into a .beam binary. The codec-beam Haskell library (hackage.haskell.org/package/codec-beam) demonstrates the same thing from outside OTP. We can produce .beam files entirely without erlc.

Everything that was bad about .S is worse at .beam. We additionally take responsibility for chunk layout, the Atom/AtU8 chunk for atoms, LitT/Lit for literals, the FunT chunk for funs, the ExpT and LocT chunks for exports and locals, the Line chunk for source locations, the Dbgi/Abst chunks for debug info, the StrT chunk for string literals in binaries, and the Attr chunk for module attributes. Each chunk's binary layout is documented in the BEAM Book but is not normative OTP API.

Real-world precedent: Lumen attempted this with WebAssembly as the target. Lumen is now archived; the maintenance cost was a contributing factor. codec-beam exists but is research-grade.

Verdict: pass. The only reason to emit .beam directly would be sub-millisecond compile times, which Mochi does not need.

3.7 Custom NIF runtime

Write the entire Mochi runtime in C or Rust, expose a small surface to Erlang as a single big NIF, and translate Mochi source to a tiny stub that calls into the NIF. This is what some experimental BEAM-targeted languages have flirted with.

The argument against is fundamental: NIFs do not yield to the BEAM scheduler. A computation that takes longer than ~1 ms inside a NIF blocks the scheduler thread, degrading the soft-real-time guarantees that are the BEAM's reason for existence. Dirty NIFs (introduced in OTP 17) help but cap the number of concurrent long NIFs by the number of dirty scheduler threads, which defeats Mochi's concurrency model. Hot code loading also does not work cleanly across NIF library reloads. Dialyzer cannot see into NIFs. Tracing only sees the entry point.

NIFs are correct for FFI (Mochi calling C libraries, e.g. SIMD numerics) but wrong as the primary code generator. Verdict: pass for the main path; revisit as an FFI mechanism only.

4. Pattern matching, BIFs, tail calls, and the JIT in detail

4.1 Pattern matching: at what layer is the decision tree built?

The Maranget-style decision tree is built in v3_kernel. v3_core keeps patterns roughly as they appeared in source. v3_kernel takes the Core Erlang c_case plus its list of c_clauses and produces a k_match tree where every test happens at most once on each subterm. This means:

  • Emitting at Erlang source or abstract format: the kernel pass compiles our patterns. We get good decision trees for free.
  • Emitting at Core Erlang: same. v3_kernel is downstream.
  • Emitting at Kernel Erlang or lower: we have to build the decision tree ourselves.

Mochi has its own exhaustiveness checker (running on Mochi ADTs before lowering). The exhaustiveness analysis is independent of the BEAM decision-tree compiler. We do not need to do match compilation ourselves; we just need to feed v3_kernel clauses in a shape it can compile well. Core Erlang is exactly that shape.

4.2 BIFs

In Core Erlang every BIF invocation is a c_call(c_atom(erlang), c_atom(Name), Args). The compiler recognises certain BIFs (arithmetic, type tests, length/1, element/2, tuple_size/1, is_* family, setelement/3, the comparison operators) and emits specialised instructions like gc_bif1/gc_bif2 or bif1/bif2 with direct C-level calls in the emulator and direct native dispatches under BeamAsm. The recognition table lives in beam_makeops and beam_ssa_opt. We invoke BIFs from Mochi by emitting fully qualified calls to the erlang module; the compiler picks up the rest. The same is true for lists:reverse/1, maps:get/2, etc., which are normal calls but get inlined or specialised by beam_ssa_opt when arguments have known types.

4.3 Tail calls

The BEAM emits call_only N L when:

  1. The call is syntactically in tail position.
  2. The current function has no stack frame (no Y registers in use after the call).

It emits call_last N L U when:

  1. The call is in tail position.
  2. A frame of U slots exists and must be deallocated.

It emits call N L otherwise.

The analysis happens in beam_ssa_codegen. Any IR that is to preserve tail position must let the optimiser still see that a call is the last expression. Core Erlang's c_apply and c_call preserve this trivially: any call that sits in the body of a c_let (rather than in its argument) and is the entire result of the enclosing function is in tail position. The SSA optimiser is conservative; if we emit something obscure (e.g. wrap a tail call in a useless c_seq) we lose tail-call status. Mochi's lowering rule: when a function body ends in a match whose arms each end in a recursive call, those calls must be emitted as the tail expression of the corresponding c_clause body.
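
The choice of call opcode can be observed from outside the compiler. The sketch below compiles two tiny functions with to_asm and binary, which we assume returns the assembly term {Module, Exports, Attrs, Functions, LabelCount} as the third element, and reports which of call, call_only and call_last appear in each function. The assembly format is undocumented, so this is an inspection aid that may need adjusting between OTP releases; the module names are hypothetical.

```erlang
-module(tailcall_sketch).
-export([opcodes/0]).

%% Parse one Erlang form from a string into abstract format.
form(Src) ->
    {ok, Tokens, _End} = erl_scan:string(Src),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.

%% Compile to BEAM assembly and list the call opcodes used per function.
opcodes() ->
    Forms = [form("-module(mochi_tail_demo)."),
             form("-export([loop/1, len/1])."),
             %% The self-call is the whole result: tail position.
             form("loop(0) -> done; loop(N) -> loop(N - 1)."),
             %% The self-call feeds '+': not in tail position.
             form("len([]) -> 0; len([_ | T]) -> 1 + len(T).")],
    {ok, _Mod, Asm} = compile:forms(Forms, [to_asm, binary, return_errors]),
    %% ASSUMPTION: undocumented 5-tuple shape of the assembly term.
    {_Module, _Exports, _Attrs, Functions, _LabelCount} = Asm,
    [{Name, Arity, call_ops(Is)}
     || {function, Name, Arity, _Entry, Is} <- Functions,
        lists:member(Name, [loop, len])].

%% Collect the distinct call-family opcodes in an instruction list.
call_ops(Is) ->
    Wanted = [call, call_only, call_last],
    lists:usort([element(1, I) || I <- Is,
                                  is_tuple(I),
                                  lists:member(element(1, I), Wanted)]).
```

On the OTP releases targeted here we would expect loop/1 to report call_only and len/1 to report call, matching the rules listed above.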

4.4 JIT (BeamAsm)

BeamAsm shipped in OTP 24, with major refinements in OTP 26 (much-improved tuple/atom test fusion) and OTP 27 (aarch64 production support, improved frame pointer handling for perf integration), and continued JIT work in OTP 28 (lower-overhead tracing, better perf integration via dwarf symbols).

BeamAsm consumes BEAM bytecode at module load time and emits native code via the asmjit library. It does not reach back into Core Erlang; everything it knows about a module comes from the loaded bytecode. The implication: emitting at a higher IR (Core Erlang) loses nothing relative to emitting at a lower IR, as long as the SSA pipeline produces good bytecode. It produces good bytecode for any reasonable Core input. Emitting at .S or .beam directly can in principle hand-tune the bytecode further, but we would be racing the OTP compiler team who tunes the SSA optimiser specifically against BeamAsm patterns. We will not win that race.

4.5 Hot code reload

Two-version invariant is the rule (erlang.org/doc/system/code_loading.html): a module has at most two co-existing versions, "current" and "old". A third load purges the old version (and the processes still executing in it). For Mochi this means:

  • Every Mochi module compiles to exactly one BEAM module.
  • Fully qualified calls (mochi_mod:func/N) always go to "current"; this is what we generate from Core Erlang c_call.
  • Funs that survive an upgrade must be named, i.e. fun Module:Function/Arity. This is c_call-shaped, not c_fun-shaped. Mochi closures that get stored in a process state need to be lowered to named top-level functions plus a {Module, Function} capture, not anonymous funs. This is a lowering policy independent of IR choice, but Core Erlang gives us the named primitive (c_fname referenced through c_call) cleanly.
  • The -on_load/1 directive (handy for Mochi's runtime initialisation, e.g. ETS table setup) is an ordinary Erlang module attribute and appears in Core Erlang as a module attribute on c_module.
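
The two-version rule can be exercised with a short experiment. The sketch below compiles two versions of a hypothetical module mochi_reload_demo in memory, loads them one after the other with code:load_binary/3, and checks that a fully qualified call reaches the newest version while erlang:check_old_code/1 reports that the previous one is retained as "old". It uses the abstract format only to keep the example short; the loader behaves the same for a module compiled from Core Erlang.

```erlang
-module(hot_reload_sketch).
-export([run/0]).

%% Compile a one-function module whose version/0 returns V.
compile_version(V) when is_integer(V) ->
    Forms = [{attribute, 1, module, mochi_reload_demo},
             {attribute, 2, export, [{version, 0}]},
             {function, 3, version, 0,
              [{clause, 3, [], [], [{integer, 3, V}]}]}],
    {ok, Mod, Bin} = compile:forms(Forms, [binary, return_errors]),
    {Mod, Bin}.

run() ->
    {Mod, Bin1} = compile_version(1),
    {module, Mod} = code:load_binary(Mod, "mochi_reload_demo.mochi", Bin1),
    First = Mod:version(),
    %% Loading again makes version 1 "old" and version 2 "current".
    {Mod, Bin2} = compile_version(2),
    {module, Mod} = code:load_binary(Mod, "mochi_reload_demo.mochi", Bin2),
    Second = Mod:version(),          % fully qualified call: goes to "current"
    HasOld = erlang:check_old_code(Mod),
    {First, Second, HasOld}.
```

A process still executing version 1 would keep running it until it makes a fully qualified call, which is why escaping closures need the named form described above.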

5. Versioning notes (OTP 26 → 27 → 28 → 29)

| Layer           | OTP 26 → 27 | OTP 27 → 28 | OTP 28 → 29 (May 2026) |
|-----------------|-------------|-------------|------------------------|
| Source .erl     | maybe becomes reserved; triple-quoted strings; 0.0 not =:= -0.0 | escripts compile-only; re switches library | new feature gates per EEP process |
| Abstract format | new tags for maybe and triple-quoted strings; raw_abstract_v1 unchanged | tag stable | tag stable |
| Core Erlang     | unchanged | unchanged | unchanged |
| Kernel Erlang   | record layout changed | record layout changed | record layout changed |
| BEAM .S         | update_record added | new executable_line, bs_* renames | further bs_* reshape expected |
| BEAM .beam      | chunk layout stable; opcode table churns | opcode table churns | opcode table churns |

Core Erlang's flat line of "unchanged" through three OTP majors is the practical reason to pick it.

6. Name mangling, deterministic ordering, and the #line analogue

Three concerns are universal regardless of IR choice but particularly load-bearing for Core Erlang, because the constructed AST has no implicit "source order" the way source text does.

6.1 Deterministic mangling

Mangled identifier form (see note 06 §4 for the full table):

mochi_{pkg_underscored}__{name}[__{instArgsHash6}]

where pkg_underscored is the source package path with / replaced by _, name is the source identifier, and instArgsHash6 is the first 6 hex digits of a BLAKE3 hash over the canonical printing of generic instantiation arguments (omitted for non-generic symbols). Two emitted identifiers never collide across packages or generic instantiations.
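
A minimal sketch of the mangling scheme, with one loudly stated substitution: OTP does not ship BLAKE3, so erlang:md5/1 stands in for it here purely so the example runs without a dependency. The module name mangle_sketch and the function names are hypothetical, and the canonical printing of instantiation arguments is taken as an already-computed string.

```erlang
-module(mangle_sketch).
-export([mangle/2, mangle/3]).

%% Non-generic symbol: mochi_{pkg_underscored}__{name}
mangle(Pkg, Name) ->
    list_to_atom(base(Pkg, Name)).

%% Generic instantiation: append __{instArgsHash6}.
%% InstArgs is the canonical printing of the instantiation arguments.
mangle(Pkg, Name, InstArgs) ->
    list_to_atom(base(Pkg, Name) ++ "__" ++ hash6(InstArgs)).

base(Pkg, Name) ->
    "mochi_" ++ [underscore(C) || C <- Pkg] ++ "__" ++ Name.

%% Replace the package path separator with an underscore.
underscore($/) -> $_;
underscore(C) -> C.

%% First 6 hex digits (24 bits) of the digest.
%% ASSUMPTION: erlang:md5/1 is a stand-in for BLAKE3, not the real scheme.
hash6(Canonical) ->
    <<Prefix:24, _/binary>> = erlang:md5(unicode:characters_to_binary(Canonical)),
    lists:flatten(io_lib:format("~6.16.0b", [Prefix])).
```

For example, mangle_sketch:mangle("std/list", "map") yields the atom mochi_std_list__map; the three-argument form adds a six-digit suffix that depends only on the canonical argument string.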

6.2 Definition ordering

cerl:c_module/3 takes a list of definitions. The order in which we list them determines the BEAM file's function table order, which determines the .beam chunk layout, which determines the SHA-256 of the output file. For reproducibility (see note 07 §7) we sort definitions by canonical identifier printing.
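
A small sketch of the ordering step, assuming that sorting by the {Name, Arity} pair of each c_fname is an acceptable stand-in for "canonical identifier printing". The helper name sorted_module/3 is hypothetical.

```erlang
-module(def_order_sketch).
-export([sorted_module/3]).

%% Sort exports and {c_fname, c_fun} definition pairs by {Name, Arity}
%% before building the module, so that the emitted order does not depend
%% on the traversal order of earlier compiler stages.
sorted_module(Name, Exports, Defs) ->
    cerl:c_module(cerl:c_atom(Name),
                  lists:sort(fun(A, B) -> key(A) =< key(B) end, Exports),
                  lists:sort(fun({A, _}, {B, _}) -> key(A) =< key(B) end, Defs)).

%% Ordering key for a function-name variable built with cerl:c_fname/2.
key(FName) ->
    {cerl:fname_id(FName), cerl:fname_arity(FName)}.
```

Two calls that pass the same definitions in different orders then produce identical c_module() values, which is the property the reproducible-build check relies on.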

6.3 The #line analogue

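C has #line to map emitted positions back to source. Core Erlang has the annotation system: every cerl node accepts annotations via cerl:set_ann(Node, AnnList). We attach a file annotation {file, "src/foo.mochi"} and a line annotation to every emitted node. (In the Core that v3_core itself produces, the line appears as a bare integer, or a {Line, Column} pair, at the head of the annotation list rather than as a {line, N} tuple; we should mirror that shape.) These propagate to: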

  • The Line chunk of the .beam file (consumed by stack traces and cover).
  • The Dbgi/Abst chunk (consumed by Dialyzer and the debugger).
  • The stack trace attached to an exception (the Stacktrace variable in try ... catch Class:Reason:Stacktrace; erlang:get_stacktrace/0 was removed in OTP 24).

Without these annotations, Dialyzer warnings point to invented line 0 and stack traces are useless. With them, the user sees Mochi source positions throughout.
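
A minimal sketch of two annotation helpers, assuming the annotation shape that the compiler's own Core output uses: a bare integer line followed by {file, Path}. The module and function names (ann_sketch, at/3, call_at/5) are hypothetical.

```erlang
-module(ann_sketch).
-export([at/3, call_at/5]).

%% Attach a Mochi source position to an already-built cerl node.
%% ASSUMPTION: annotation shape [Line, {file, Path}].
at(Node, File, Line) when is_integer(Line), Line > 0 ->
    cerl:set_ann(Node, [Line, {file, File}]).

%% Build an annotated inter-module call in one step, using the ann_c_*
%% variant of the constructor.
call_at(File, Line, Mod, Fun, Args) when is_integer(Line), Line > 0 ->
    cerl:ann_c_call([Line, {file, File}],
                    cerl:c_atom(Mod), cerl:c_atom(Fun), Args).
```

Routing every constructor call in the code generator through helpers like these makes it hard to emit a node without a position by accident.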

7. Recommendation for MEP-46

Mochi should emit Core Erlang via compile:forms/2 with from_core and debug_info.

The justification is layered:

  1. It is documented and stable. The Core Erlang 1.0.3 specification (Carlsson et al., November 2004) is the published reference, hosted at the HiPE group's Uppsala URL. The cerl module is the official Erlang API for building Core Erlang trees, documented at erlang.org/doc/apps/compiler/cerl.html. The OTP team publishes a tutorial at erlang.org/blog/core-erlang-by-example/. Three OTP majors (26, 27, 28) have shipped without changing the layer.

  2. It preserves source-level structure where we want it. Pattern matching is still by clauses on a c_case; v3_kernel builds the decision tree for us. Tail calls remain visible to beam_ssa_codegen, which picks the right call_only / call_last / call form. BIFs are plain c_call(c_atom(erlang), c_atom(Name), Args) and the compiler specialises them. Closures are c_fun plus c_let, with mutual recursion via c_letrec. None of this is reproducible from BEAM assembly or bytecode without re-implementing 60-80% of lib/compiler.

  3. It lets the kernel pass and the JIT do their work. Everything downstream of v3_core runs: sys_core_fold, sys_core_bsm, v3_kernel, the SSA optimiser, register allocation, BeamAsm at load time. The cited Elixir and LFE comparisons show this is enough to be within a few percent of hand-written Erlang on the benchmarks BeamAsm was tuned for.

  4. It does not require us to do our own register allocation. beam_ssa_pre_codegen runs linear-scan register allocation over the SSA form. Emitting at .S or .beam would require us to reimplement this; emitting at Core Erlang inherits it.

  5. Dialyzer works. With debug_info on, the abstract code chunk is stored as {debug_info, {Backend, Data}} and Dialyzer accepts our backend tag. Our types lower to -spec annotations on the Core module's attributes, which Dialyzer reads through the same mechanism.

  6. Hot code reload and the two-version model are transparent. Core Erlang has no involvement at load time; the loader sees only the final .beam and applies the standard rules. As long as our lowering rules for closures-that-escape produce named functions (via c_fname references), code upgrades just work.

  7. Tooling integration is at least as good as Elixir's. The debugger reads our line annotations (we set them via cerl:set_ann/2 on every constructed node). Coverage requires more work than Erlang source because cover instruments Erlang abstract code, not Core; we will likely implement Mochi-side coverage by emitting our own instrumented Core, similar to how Elixir handles mix test --cover.

  8. Precedent. LFE, written by one of Core Erlang's co-authors, has emitted Core Erlang in production since 2008. The HARP project formalises the same layer in Coq/Isabelle. There is a community that maintains the layer specifically because non-Erlang languages depend on it.
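Point 2 can be illustrated with a small sketch: the generator lists clauses on a c_case and leaves the decision tree to the compiler. The module names mochi_case_demo and mochi_case_out and the function classify/1 are hypothetical, chosen only for this example.

```erlang
-module(mochi_case_demo).
-export([build/0, run/0]).

%% Core tree for:
%%   classify(X) -> case X of 0 -> zero; _ -> nonzero end.
build() ->
    X = cerl:c_var('X'),
    Clauses = [cerl:c_clause([cerl:c_int(0)], cerl:c_atom(zero)),
               %% Catch-all clause: a fresh variable matches anything.
               cerl:c_clause([cerl:c_var('Other')], cerl:c_atom(nonzero))],
    FName = cerl:c_fname(classify, 1),
    Fun = cerl:c_fun([X], cerl:c_case(X, Clauses)),
    cerl:c_module(cerl:c_atom(mochi_case_out), [FName], [{FName, Fun}]).

%% Compile the tree from Core, load the result, and call it.
run() ->
    {ok, mochi_case_out, Beam} =
        compile:forms(build(), [from_core, binary, return_errors]),
    {module, mochi_case_out} =
        code:load_binary(mochi_case_out, "mochi_case_out.beam", Beam),
    {mochi_case_out:classify(0), mochi_case_out:classify(7)}.
```

The sketch never says how to test X; clause selection is compiled by the passes downstream of Core Erlang.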

Implementation plan in three sub-phases:

  • 6.1: Build a typed Mochi-IR → cerl translator. The Mochi compiler is in Go, which leaves two architectural choices. Option A: serialise cerl trees as Erlang external term format from Go and ship them to an embedded erlc worker via erl_call. Option B: write the translator in Erlang as a parse-transform-style backend, invoked from Go via a port. Recommended: Option B. The cerl constructor API is only ergonomic from Erlang itself, and the port boundary is the smaller surface.
  • 6.2: Wire compile:forms/2 with from_core, debug_info, return_errors, return_warnings, no_lint, binary. Emit one BEAM file per Mochi module.
  • 6.3: Annotate every cerl node with the Mochi line (a bare integer MochiLine, the shape v3_core uses) and {file, MochiSource} so that the debugger, cover, Dialyzer, and stack traces all attribute back to Mochi source.
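The second and third sub-phases can be sketched together as follows. This is a minimal illustration, not the planned backend: the module mochi_emit, the output module mochi_demo, and the function answer/0 are hypothetical. The option list is reduced to from_core, binary, return_errors, and return_warnings; debug_info and no_lint are left out so that the sketch stays limited to options whose effect on a from_core compile is easy to check.

```erlang
-module(mochi_emit).
-export([emit/2, demo/1]).

%% Compile one Core Erlang module and write one .beam file for it.
emit(Core, OutDir) ->
    Opts = [from_core, binary, return_errors, return_warnings],
    case compile:forms(Core, Opts) of
        {ok, Mod, Beam, Warnings} ->
            Path = filename:join(OutDir, atom_to_list(Mod) ++ ".beam"),
            ok = file:write_file(Path, Beam),
            {ok, Mod, Path, Warnings};
        {error, Errors, Warnings} ->
            {error, Errors, Warnings}
    end.

%% Build a one-function module, answer() -> 42, with every node annotated.
%% ASSUMPTION: annotations use the [Line, {file, File}] shape.
demo(OutDir) ->
    File = "src/demo.mochi",
    Ann = fun(Node, Line) -> cerl:set_ann(Node, [Line, {file, File}]) end,
    FName = cerl:c_fname(answer, 0),
    Fun = Ann(cerl:c_fun([], Ann(cerl:c_int(42), 2)), 1),
    Core = cerl:c_module(cerl:c_atom(mochi_demo), [FName], [{FName, Fun}]),
    emit(Core, OutDir).
```

Returning the error and warning lists unchanged lets the Go side of the port decide how to present diagnostics.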

What we explicitly do not do:

  • Do not emit Erlang source text. The pretty-print round-trip costs 30-40% of compile time, fabricates line numbers, and forces us through identifier escaping we do not control.
  • Do not emit .S or .beam. We would inherit the burden of register allocation, tail-call detection, opcode-table version tracking, and chunk-layout maintenance, in exchange for negligible runtime gain.
  • Do not write a custom NIF for the Mochi runtime. NIFs are right for FFI and wrong for control flow; they break scheduler fairness and tracing.
  • Do not target Kernel Erlang. It has no public entry point, no stable format, and no documentation.

Backstop. If a future OTP release (29.1, 30) changes Core Erlang in a way that affects Mochi (very unlikely given the 22-year track record but not impossible), the fallback is to emit at the abstract format layer via the same compile:forms/2 entry point. This is a one-week migration and Elixir's working precedent shows it is sustainable. We carry one fallback path in the codegen design from the start; we ship Core Erlang as the default.
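For comparison, the fallback path through the same compile:forms/2 entry point might look like the sketch below, which builds Erlang abstract format by hand instead of a cerl tree. The module names mochi_fallback and mochi_demo_abs are hypothetical, and the file attribute is one assumed way to carry the Mochi source name at this layer.

```erlang
-module(mochi_fallback).
-export([demo/0]).

%% Abstract format for:
%%   -module(mochi_demo_abs).
%%   -export([answer/0]).
%%   answer() -> 42.
demo() ->
    Forms = [{attribute, 1, file, {"src/demo.mochi", 1}},
             {attribute, 1, module, mochi_demo_abs},
             {attribute, 2, export, [{answer, 0}]},
             {function, 3, answer, 0,
              [{clause, 3, [], [], [{integer, 3, 42}]}]}],
    %% No from_core here: the forms enter the compiler one layer higher.
    {ok, mochi_demo_abs, Beam} = compile:forms(Forms, [binary, return_errors]),
    {module, mochi_demo_abs} =
        code:load_binary(mochi_demo_abs, "mochi_demo_abs.beam", Beam),
    mochi_demo_abs:answer().
```

Only the tree constructors and the absence of from_core differ from the Core Erlang path; the compile and load calls are the same.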

Sources