
02. Design philosophy

Why a bidirectional bridge

The bridge is bidirectional for the same reason MEP-73 (Rust) and MEP-76 (Ruby) are bidirectional: a language bridge that flows in only one direction leaves the ecosystem fragmented. Mochi users who write a library that is genuinely useful to Erlang developers should be able to publish it to Hex.pm without leaving the Mochi toolchain. Mochi users who need to call into OTP infrastructure (cowboy, ranch, telemetry, etc.) should be able to do so without writing Erlang boilerplate. The bidirectional design is also self-validating: the publish direction (TargetErlangPort) exercises the same ETF encoding/decoding stack as the consume direction, so both directions are tested end-to-end against the Hex.pm ecosystem.

Why BEAM abstract code

The primary type source for the bridge is the BEAM abstract code chunk embedded in each .beam file. The alternatives were Dialyzer PLT and EDoc XML.

BEAM abstract code (Dbgi chunk, OTP 20+; Abst chunk, OTP 17-19) contains the full abstract syntax tree of the Erlang module, stored as compressed ETF. When the module was compiled with debug_info, every -spec, -type, -opaque, and -record definition the module author wrote is present verbatim. Decoding the chunk requires only an ETF parser in Go: no Erlang runtime and no tool invocation. The beam-ingest-sha256 hash in mochi.lock anchors the ingest result to the exact bytes of the chunk, making the lock reproducible across environments and OTP version upgrades.
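As a rough illustration of the "decode bytes, no runtime" claim, the sketch below walks the IFF-style container of a .beam file (a FOR1 header, a BEAM form type, then chunks made of a 4-byte ID, a 4-byte big-endian size, and data padded to 4 bytes) and pulls out the Dbgi or Abst chunk. The function names, the buildContainer demo helper, and the choice to hash the raw chunk bytes with SHA-256 are assumptions made for illustration; the bridge's actual ingest code and the exact input to beam-ingest-sha256 may differ. ETF decoding of the chunk payload is left out.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// findChunk returns the raw bytes of the first chunk whose 4-byte ID equals id.
func findChunk(beam []byte, id string) ([]byte, error) {
	if len(beam) < 12 || string(beam[0:4]) != "FOR1" || string(beam[8:12]) != "BEAM" {
		return nil, errors.New("not a BEAM file")
	}
	pos := 12
	for pos+8 <= len(beam) {
		name := string(beam[pos : pos+4])
		size := int(binary.BigEndian.Uint32(beam[pos+4 : pos+8]))
		start := pos + 8
		end := start + size
		if size < 0 || end < start || end > len(beam) {
			return nil, fmt.Errorf("chunk %q overruns the file", name)
		}
		if name == id {
			return beam[start:end], nil
		}
		pos = start + (size+3)/4*4 // chunk data is padded to a 4-byte boundary
	}
	return nil, fmt.Errorf("chunk %q not found", id)
}

// abstractCode prefers the newer Dbgi chunk and falls back to Abst.
func abstractCode(beam []byte) (string, []byte, error) {
	for _, id := range []string{"Dbgi", "Abst"} {
		if data, err := findChunk(beam, id); err == nil {
			return id, data, nil
		}
	}
	return "", nil, errors.New("no abstract code chunk found")
}

// chunkDigest hashes the raw chunk bytes (assumed input for the lock hash).
func chunkDigest(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// chunk and buildContainer exist only to make the demo self-contained.
type chunk struct {
	id   string
	data []byte
}

func buildContainer(chunks ...chunk) []byte {
	var body bytes.Buffer
	body.WriteString("BEAM")
	var sz [4]byte
	for _, c := range chunks {
		body.WriteString(c.id)
		binary.BigEndian.PutUint32(sz[:], uint32(len(c.data)))
		body.Write(sz[:])
		body.Write(c.data)
		for pad := (4 - len(c.data)%4) % 4; pad > 0; pad-- {
			body.WriteByte(0)
		}
	}
	var out bytes.Buffer
	out.WriteString("FOR1")
	binary.BigEndian.PutUint32(sz[:], uint32(body.Len()))
	out.Write(sz[:])
	out.Write(body.Bytes())
	return out.Bytes()
}

func main() {
	// Synthetic container; a real run would read the bytes of a .beam file.
	beam := buildContainer(chunk{"AtU8", []byte{1, 2, 3}}, chunk{"Dbgi", []byte("payload")})
	id, data, err := abstractCode(beam)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id, len(data), chunkDigest(data))
}
```

In a real ingest step the bytes would come from the .beam files inside the fetched Hex package, and the chunk payload would then be handed to the ETF parser.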

Dialyzer PLT (persistent lookup table) is a binary file produced by running dialyzer --build_plt. It contains Dialyzer's inferred types for every function it analysed, including functions without author-declared -spec annotations. The coverage is broader than author-declared specs, but the type is an inferred approximation, not an authoritative declaration. More critically, the PLT is produced by running the Erlang VM: mochi pkg lock would need to spawn dialyzer as a subprocess, which requires Erlang to be installed, takes seconds to minutes per package, and produces output that changes across OTP versions. The BEAM abstract code approach is lighter (decode bytes, no runtime required), more reproducible (the chunk is part of the published artifact), and more authoritative (author-declared over inferred).

EDoc XML is generated by rebar3 edoc or edoc:run/2. It parses @spec tags from Erlang source file comments and emits XML. EDoc coverage exists for packages that pre-date the -spec typespec system (OTP's -spec syntax was standardised in OTP-R13B04, 2009; many older packages used @spec comment tags instead). The XML is readable in Go without an Erlang runtime. However, EDoc's type annotation syntax is a free-text human-readable format, not a formal grammar: @spec function(integer(), binary()) -> {ok, pid()} | {error, atom()} is a string that EDoc parses with heuristics, not a machine-readable AST. The coverage is lower, the parsing is noisier, and the information per entry is less precise than BEAM abstract code. EDoc XML is therefore used as a fallback (phase 3) only for packages that ship neither -spec annotations nor Dbgi/Abst chunks with parseable type information.

Why OTP Port over NIF or C-node

Erlang offers three mechanisms for integrating non-BEAM code: Ports, NIFs, and C-nodes.

OTP Ports (open_port/2) spawn an external OS process and communicate via stdin/stdout with a 4-byte length prefix ({packet, 4} mode). The external process runs in its own OS process, isolated from the BEAM VM's memory and scheduler. A crash in the Port process is reported to the BEAM via an {exit_status, Code} message; the OTP supervisor tree handles it like any process exit. The BEAM VM itself is unaffected. Ports are the officially sanctioned, documentation-first integration path for non-BEAM code in OTP.

NIFs (Native Implemented Functions) load a shared library (*.so / *.dylib) directly into the BEAM VM's address space. NIF functions run in the BEAM scheduler threads: they block the scheduler, and a crash (segfault, stack overflow, double-free) kills the entire BEAM OS process. NIFs are appropriate for short-duration, latency-critical operations (hashing, encoding, numeric computation) where the <1ms overhead of Port IPC is unacceptable. They are not appropriate for a general-purpose bridge to an arbitrary library: a bug in any bridged function would crash the user's entire BEAM node.

C-nodes implement the Erlang distribution protocol (OTP's erl_interface library) from a C program, making the external program appear as a named Erlang node in the cluster. C-nodes can receive and send Erlang messages, register named processes, and participate in OTP supervision. They are more complex to set up (require EPMD, a cookie, a node name), have higher per-message overhead than a local Port (messages go through TCP), and require the dist network capability. The C-node path is appropriate for phase 13 (distributed Erlang bridge), not for the core consume direction.

The decision: Port is the default. Every phase 0-12 function uses Port IPC. Phase 13 (distributed bridge) uses C-node via the erl_interface library in Go (package3/erlang/cnode/). A future sub-phase (MEP-66 N.1) may offer a NIF opt-in for [erlang.nif = true] annotated imports, but this is not part of the core spec.

Why ok/error maps to result

The {ok, T} | {error, Reason} return pattern is not merely common in Erlang; it is the de facto contract for functions that can fail. The Erlang standard library uses it for file I/O, socket operations, gen_server calls, timer management, and most other domains. It is the dynamic-typing equivalent of a Result<T, E> algebraic data type, established by convention across 30+ years of OTP code. Refusing to map it to Mochi's result<T, string> would make the bridge practically unusable: a user importing hackney:get/4 (which returns {ok, StatusCode, Headers, ClientRef} | {error, Reason}) would get a SkipReport for the function's return type and no usable binding.

The structural recognition is conservative: the bridge applies the mapping only when it sees a union of exactly two branches in which one branch is {ok, T} (or bare ok) and the other is {error, R}, where R is atom(), binary(), or a union of the two. More complex error shapes (e.g., {error, {http, StatusCode}}) produce a SkipReport rather than a lossy translation.
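The recognition rule can be sketched as a small structural check. The Type model below is a hypothetical stand-in for whatever representation the bridge derives from the abstract code, and the sketch reads {ok, T} strictly as a 2-tuple; how wider success tuples such as hackney's are handled is not specified here, so they are not covered.

```go
package main

import "fmt"

// Kind is a deliberately tiny, hypothetical model of Erlang typespecs.
type Kind int

const (
	Atom    Kind = iota // atom()
	Binary              // binary()
	AtomLit             // a literal atom such as ok or error
	Tuple
	Union
	Other
)

type Type struct {
	Kind  Kind
	Name  string // set for AtomLit
	Elems []Type // set for Tuple and Union
}

func isLit(t Type, name string) bool { return t.Kind == AtomLit && t.Name == name }

// okBranch matches bare ok or {ok, T}; payload is nil for bare ok.
func okBranch(t Type) (payload *Type, matched bool) {
	if isLit(t, "ok") {
		return nil, true
	}
	if t.Kind == Tuple && len(t.Elems) == 2 && isLit(t.Elems[0], "ok") {
		return &t.Elems[1], true
	}
	return nil, false
}

// simpleReason accepts atom(), binary(), or a non-empty union of those.
func simpleReason(t Type) bool {
	switch t.Kind {
	case Atom, Binary:
		return true
	case Union:
		if len(t.Elems) == 0 {
			return false
		}
		for _, e := range t.Elems {
			if !simpleReason(e) {
				return false
			}
		}
		return true
	}
	return false
}

func errorBranch(t Type) bool {
	return t.Kind == Tuple && len(t.Elems) == 2 &&
		isLit(t.Elems[0], "error") && simpleReason(t.Elems[1])
}

// asResult reports whether t fits the conservative rule and, if so, returns
// the success payload type (nil when the success branch is bare ok).
func asResult(t Type) (payload *Type, matched bool) {
	if t.Kind != Union || len(t.Elems) != 2 {
		return nil, false
	}
	for i := 0; i < 2; i++ {
		if p, ok := okBranch(t.Elems[i]); ok && errorBranch(t.Elems[1-i]) {
			return p, true
		}
	}
	return nil, false
}

func main() {
	lit := func(n string) Type { return Type{Kind: AtomLit, Name: n} }
	tup := func(e ...Type) Type { return Type{Kind: Tuple, Elems: e} }
	simple := Type{Kind: Union, Elems: []Type{
		tup(lit("ok"), Type{Kind: Binary}),
		tup(lit("error"), Type{Kind: Atom}),
	}}
	nested := Type{Kind: Union, Elems: []Type{
		tup(lit("ok"), Type{Kind: Binary}),
		tup(lit("error"), tup(lit("http"), Type{Kind: Other})),
	}}
	_, m1 := asResult(simple)
	_, m2 := asResult(nested)
	fmt.Println(m1, m2)
}
```

A type that fails the check would be routed to the SkipReport path rather than translated.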

Why Hex.pm trusted publishing only

Long-lived API tokens (HEX_API_KEY) are a leading cause of registry account compromise. Several high-profile supply-chain attacks in the npm and PyPI ecosystems involved long-lived tokens leaked from CI configuration files, accidentally committed dotfiles, or compromised CI service accounts. Hex.pm launched trusted publishing in 2024 specifically to eliminate this attack surface. The bridge was designed after those lessons and does not provide a HEX_API_KEY path. Users who need to publish from non-OIDC environments (a local dev machine, a non-GitHub CI system) must either configure their CI provider's OIDC endpoint (GitLab CI, Buildkite, and CircleCI all offer an equivalent of the GitHub Actions id-token: write permission) or use mochi pkg publish --dry-run to produce the package artifact and upload it manually via rebar3 hex publish with a scoped, short-lived Hex.pm user token.

Why rebar3

rebar3 has been the canonical Erlang build tool since 2015. It is recommended by the Erlang/OTP team and is the tool that Hex.pm's own rebar3 hex plugin integrates with. The alternative, writing a custom resolver in Go, would require re-implementing rebar3's dependency resolver (which handles complex OTP application dependency graphs including override directives, git sources, umbrella apps, and hex registry lookups), understanding every edge case of the .app.src to .app expansion, and tracking the Hex.pm HTTP API. The bridge takes the same approach as MEP-57 for Go and MEP-76 for Bundler: own the manifest layer, generate the build-tool config, and delegate resolution and compilation to the ecosystem's canonical tool.
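To make "generate the build-tool config" concrete, here is a minimal sketch that renders a {deps, [...]} entry for rebar.config from a list of pinned Hex dependencies. The Dep struct and rebarConfig function are hypothetical, the versions in the demo are placeholders, and the sketch assumes package names are plain lowercase atoms that need no quoting. It is not the bridge's actual generator.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Dep is a hypothetical manifest entry: a Hex package name and a pinned version.
type Dep struct {
	Name    string
	Version string
}

// rebarConfig renders a deps-only rebar.config, sorted by name so the output
// is deterministic. Names are assumed to be valid unquoted Erlang atoms.
func rebarConfig(deps []Dep) string {
	sorted := append([]Dep(nil), deps...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	b.WriteString("{deps, [\n")
	for i, d := range sorted {
		sep := ","
		if i == len(sorted)-1 {
			sep = ""
		}
		fmt.Fprintf(&b, "    {%s, %q}%s\n", d.Name, d.Version, sep)
	}
	b.WriteString("]}.\n")
	return b.String()
}

func main() {
	fmt.Print(rebarConfig([]Dep{
		{Name: "ranch", Version: "1.8.0"},
		{Name: "cowboy", Version: "2.10.0"},
	}))
}
```

Resolution of transitive dependencies and compilation would still be left to rebar3 itself, in line with the delegation described above.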

Cross-references