MEP 51. Mochi-to-Python transpiler: typed CPython 3.12 source, asyncio.Queue + TaskGroup agents, uv + hatchling wheels, ipykernel for Jupyter, PyPI Trusted Publishing
| Field | Value |
|---|---|
| MEP | 51 |
| Title | Mochi-to-Python transpiler |
| Author | Mochi core |
| Status | Draft |
| Type | Standards Track |
| Created | 2026-05-23 11:51 (GMT+7) |
| Depends | MEP-4 (Type System), MEP-5 (Type Inference), MEP-13 (ADTs and Match), MEP-45 (C transpiler, IR reuse), MEP-46 (BEAM transpiler, IR reuse), MEP-47 (JVM transpiler, IR reuse), MEP-48 (.NET transpiler, IR reuse), MEP-49 (Swift transpiler, IR reuse), MEP-50 (Kotlin transpiler, IR reuse) |
| Research | ~/notes/Spec/0051/01..12 |
| Tracking | /docs/implementation/0051/ |
Abstract
Mochi today ships vm3 (mochi run), an ahead-of-time C transpiler producing native single-file binaries (MEP-45), an Erlang/BEAM transpiler producing supervised concurrent runtimes (MEP-46), a JVM transpiler that emits Java bytecode directly via ASM and reaches Maven Central and Project Loom (MEP-47), a .NET transpiler that emits C# source and reaches NuGet and NativeAOT (MEP-48), a Swift transpiler that emits Swift 6.0 source and reaches the App Store and Static Linux SDK binaries (MEP-49), and a Kotlin transpiler that emits Kotlin 2.1 source and reaches Google Play, Kotlin Multiplatform, and Kotlin/Native (MEP-50). None of these paths reaches the Python ecosystem: PyPI (over 600,000 packages as of 2026 Q1), the NumPy / pandas / scikit-learn / PyTorch / JAX numerical-and-ML stack, the FastAPI / Flask / Django web tier, the Jupyter notebook ecosystem (millions of notebooks indexed on GitHub), and the dominant scripting position in scientific computing, bioinformatics, data engineering, and ML research. Python is the lingua franca of data. MEP-51 specifies a seventh transpiler pipeline that emits typed CPython 3.12+ source, drives pyproject.toml-shaped projects through uv 0.4+ and the hatchling PEP 517 backend, and publishes wheels and source distributions to PyPI with Trusted Publishing (OIDC + sigstore + PEP 740 attestations).
The pipeline reuses MEP-45's typed-AST and aotir IR, plus the monomorphisation, match-to-decision-tree, closure-conversion, exhaustiveness-check, and sendability-inference passes shared with MEP-46, MEP-47, MEP-48, MEP-49, and MEP-50. It forks at the emit stage: instead of emitting ISO C23 (MEP-45), Core Erlang via cerl (MEP-46), JVM bytecode via ASM (MEP-47), C# source via Roslyn SyntaxFactory (MEP-48), Swift 6.0 source via a Mochi-side syntax tree (MEP-49), or Kotlin 2.1 source via KotlinPoet (MEP-50), it emits Python 3.12+ source text by lowering aotir to a Python AST node tree and serialising via the stdlib ast.unparse (PEP 657-friendly), then running ruff format for layout normalisation and ruff check --fix for import sorting and unused-import removal. Five packaging targets ship together: --target=python-wheel produces a pyproject.toml-driven wheel (.whl) under dist/, runnable on any CPython 3.12+ interpreter; --target=python-sdist produces the corresponding source distribution (.tar.gz); --target=python-app produces an app runnable via python -m <package>, with a console-scripts entry point; --target=python-ipykernel produces a Jupyter kernelspec under ~/.local/share/jupyter/kernels/mochi-<pkg>/ that transpiles cells on the fly and executes them in an in-process IPython shell; --target=python-source emits Python source plus pyproject.toml without invoking the build (for downstream uv build and IDE integration). One codegen pass feeds all five targets.
The master correctness gate is byte-equal stdout from the produced wheel (installed into a fresh venv via uv pip install and run with python -m <package>), the produced ipykernel (driven by jupyter nbconvert --to notebook --execute against a generated .ipynb per fixture), and the produced source-only target (run via python -m), versus vm3 on the entire fixture corpus, across CPython 3.12.0 and CPython 3.13.0, on x86_64-linux-gnu, aarch64-linux-gnu, aarch64-darwin, and x86_64-windows. vm3 is the recording oracle for expect.txt; the transpiler does not link against or depend on vm3. Two static-analysis gates run alongside: mypy --strict --python-version=3.12 and pyright --strict must emit zero diagnostics on every emitted source file. Lowering choices are constrained to the intersection of "what both checkers accept as fully-typed". This narrows the emitter (for example: never re-use a type variable across scopes, always declare PEP 695 type parameters explicitly, never rely on implicit Optional from =None defaults), but the result is Python that any typed-Python codebase reviewer would accept as production-strength.
Five load-bearing decisions:
- CPython 3.12 floor. CPython 3.12 (released 2023-10-02) is the first release with PEP 695 type-parameter syntax (`type Vec[T] = list[T]`, `def f[T](x: T) -> T`), PEP 698 `@override`, PEP 669 `sys.monitoring`, and the per-interpreter GIL (PEP 684, a separate effort from the 3.13 free-threaded build); it also carries `asyncio.TaskGroup` and `tomllib` from 3.11. 3.12 is the floor everywhere typed-Mochi needs to work. 3.11 lacks PEP 695; 3.10 lacks PEP 654 ExceptionGroup; 3.9 reaches end-of-life in 2025-10 and is below the floor by the time MEP-51 v1 ships. CPython 3.13 (2024-10) is supported but not required; its free-threaded (`--disable-gil`) build is a forward note (12-risks §F1), not a gate. PyPy, Cython, mypyc, Nuitka, and Pyodide are all rejected for v1 (see 12-risks-and-alternatives §A1-§A5). See 02-design-philosophy §3, 06-type-lowering §1.
- mypy --strict AND pyright --strict as compile-gates. Both type checkers run on Mochi-emitted code with no `Any` leakage. Two checkers because their inference differs on `Protocol`, `TypedDict` totality, generic variance, PEP 695 narrowing, and the `Self` type. Passing both narrows the emitter to the intersection of correct typed Python. mypy 1.13+ is the reference; pyright 1.1.380+ catches narrowing bugs mypy misses (and vice versa). Both run in strict mode: mypy via its `--strict` flag, pyright via its strict type-checking mode (`typeCheckingMode = "strict"` in configuration, for which `pyright --strict` is shorthand in this MEP). Lowering choices follow from this: always emit `from __future__ import annotations` (lazy evaluation, no runtime cost), never re-use type vars across scopes, never rely on implicit `Optional` defaults, always annotate every public function and every dataclass field. See 02-design-philosophy §4, 06-type-lowering §2, 11-testing-gates §2-§3.
- asyncio.Queue + TaskGroup, not Trio, not AnyIO. asyncio is stdlib; `asyncio.TaskGroup` (3.11+) gives structured concurrency identical in shape to Kotlin's `coroutineScope`, Swift's `withTaskGroup`, and C#'s `Task.WhenAll` with cancellation. PEP 654 `ExceptionGroup` gives multi-failure aggregation when sibling tasks fail concurrently. Mochi agents lower one-to-one: an agent declaration becomes a Python class with a `self._mailbox: asyncio.Queue[Message]` field and a receive loop launched in the constructor via the enclosing `TaskGroup.create_task(self._loop())`; `cast(msg)` is `self._mailbox.put_nowait(msg)` from a sync function or `await self._mailbox.put(msg)` from an `async def`; `call(req)` constructs an `asyncio.Future[Reply]` carried as a field on the message dataclass and awaits it. Trio gives stronger structured-concurrency guarantees but is a hard dep and splits the ecosystem (FastAPI, httpx, aiohttp, SQLAlchemy async all assume asyncio). AnyIO is a runtime-agnostic adapter that adds a layer for no v1 benefit; if v2 wants Trio compatibility, the runtime adapter ships then. See 02-design-philosophy §5, 09-agent-streams §1-§12.
- Reuse MEP-45's `aotir` IR plus all shared passes. The IR is target-agnostic; monomorphisation, match-to-decision-tree, closure-conversion, exhaustiveness checking, and sendability inference all run once and feed seven backends. The fork is at the emit pass: `transpiler3/python/lower/` lowers `aotir` to a Python AST tree (`ast.Module` containing `ast.ClassDef`, `ast.FunctionDef`, `ast.AsyncFunctionDef`, `ast.Match`, etc.); `transpiler3/python/emit/` runs `ast.unparse` plus `ruff format` plus `ruff check --fix --select=I,F401` (import sorting and unused-import removal). Sharing the IR keeps the seven targets semantically aligned and amortises pass-implementation work. See 05-codegen-design §5.
- uv as canonical build driver, hatchling as PEP 517 backend. `uv` 0.4 (released 2024-09) replaces `pip`, `pip-tools` (including `pip-compile`), `virtualenv`, and `pyenv` with a single Rust binary that is 10x-100x faster on cold and warm cache. `pyproject.toml` declares `[build-system] requires = ["hatchling>=1.25"]` and `[project]` metadata per PEP 621; `uv build` produces wheel plus sdist; `uv publish` uploads to PyPI via Trusted Publishing (OIDC token, no API token stored). No `setup.py`, no `setup.cfg`, no Poetry, no PDM, no Pipfile. hatchling is preferred over `setuptools` because it is the PyPA-recommended backend for new projects (smaller surface, faster builds, no `setup.py` shim) and over Flit because hatchling supports binary wheels and plugins such as `hatch-vcs` (used for version stamping) and `hatch-fancy-pypi-readme`. See 10-build-system §1-§7, 02-design-philosophy §8.
The gate for each delivery phase is empirical: every Mochi source file in tests/transpiler3/python/fixtures/ must compile via the Python pipeline and produce stdout that diffs clean against the expect.txt recorded by vm3. mypy --strict on generated code is the secondary gate. pyright --strict is the tertiary gate. ruff format and ruff check fixed-point (running each twice produces no diff) are the quaternary and quinary gates. Wheel install plus python -m execution is the senary gate. Reproducibility (bit-identical wheel SHA256 across two CI hosts with SOURCE_DATE_EPOCH pinned) is the septenary gate. Jupyter kernel execution (driven by nbconvert) is the octonary gate. PyPI Trusted Publishing dry-run (uv publish --dry-run against a test PyPI mirror) is the nonary gate.
Motivation
Mochi today targets vm3 (for mochi run), MEP-45 (C, single-binary AOT), MEP-46 (BEAM, supervision and hot reload), MEP-47 (JVM, Maven Central via direct bytecode and Loom), MEP-48 (.NET, NuGet and NativeAOT), MEP-49 (Swift, App Store and Static Linux SDK), and MEP-50 (Kotlin, Google Play and Kotlin Multiplatform). None deliver Python:
- Python is the dominant data-science and ML language. PyPI hosts over 600,000 packages as of 2026 Q1, growing roughly 90,000 per year. The NumPy / pandas / scikit-learn / PyTorch / JAX stack is the canonical numerical and ML surface; approximately half of all new ML research papers (by 2024 arXiv-cs.LG count) publish reference implementations in Python. There is no equivalent on JVM (DJL is small; ONNX runtime is provider-side), on .NET (ML.NET is functional but a fraction of the ecosystem), on Kotlin (KMP-ML is nascent), or on Swift (CoreML is platform-only). A Mochi user with a data-science workflow, an ML training pipeline, or a numerical research project cannot reach the canonical libraries without a Python interop layer. MEP-51 produces idiomatic typed Python indistinguishable from hand-written code after `ruff format`, ready to import NumPy, pandas, PyTorch, JAX, scikit-learn, or any other PyPI package directly.
- FastAPI, Flask, Django: the Python web tier. FastAPI (released 2018, with widespread adoption by 2020) is the canonical typed-async Python web framework; its stable shape (Pydantic + Starlette + uvicorn) is the modern Python REST and OpenAPI story. Flask 3.x (2024) and Django 5.x (2024) remain dominant for traditional and full-stack web apps. Server-side Python at PyPI-scale plus FastAPI's typed-coroutine model is a first-class compile target for Mochi: a Mochi agent lowers to an `async def` handler, a Mochi record to a Pydantic `BaseModel` or a frozen dataclass, a Mochi sum type to a tagged-union response.
- Jupyter notebooks. Project Jupyter (originally IPython, since 2014) is the dominant interactive data-science interface; JupyterLab 4.x (2024) is the canonical IDE for notebook-based work. GitHub indexes over 10 million `.ipynb` files as of 2025. Mochi `--target=python-ipykernel` produces a kernelspec under `~/.local/share/jupyter/kernels/mochi-<pkg>/` that lets a JupyterLab user pick "Mochi" as the kernel and write Mochi cells; under the hood, each cell is transpiled to Python and executed in an in-process `IPython.core.interactiveshell.InteractiveShell`. This is the only path to Jupyter for Mochi (no other MEP-45 through MEP-50 target reaches the notebook ecosystem).
- Typed Python is now production-strength. PEP 484 (2014) introduced type hints; PEP 526 (2016) added variable annotations; PEP 544 (2017) added Protocols; PEP 561 (2017) standardised type-stub distribution; PEP 585 (2020) made built-in collections subscriptable; PEP 593 (2020) added `Annotated`; PEP 604 (2020) added `X | Y` union syntax; PEP 612 (2020) added `ParamSpec`; PEP 621 (2020) standardised `pyproject.toml`; PEP 654 (2021) added ExceptionGroup; PEP 657 (2021) added fine-grained traceback offsets; PEP 660 (2021) standardised editable installs; PEP 669 (2023) added `sys.monitoring`; PEP 692 (2023) added `TypedDict` for `**kwargs`; PEP 695 (2023) added the `type` statement and bracket type-parameter syntax; PEP 698 (2023) added `@override`; PEP 703 (2023) accepted free-threaded Python; PEP 740 (2024) added wheel attestations. By 2026, typed Python is the default for new projects at Google, Meta, Stripe, Dropbox, JetBrains (PyCharm), and Microsoft (pyright); mypy and pyright are mature; FastAPI / SQLAlchemy 2.x / Pydantic 2.x all require type hints. Mochi-emitted Python lands as fully-typed production code.
- pip supply chain is now hardened. PyPI Trusted Publishing (2023-04) accepts OIDC tokens from GitHub Actions, GitLab CI, ActiveState, and Google Cloud, eliminating long-lived API tokens. sigstore signing and PEP 740 attestations (2024-10) give cryptographic provenance per wheel. The 2022-2024 wave of pip supply-chain incidents (typosquats, malicious dependencies) motivated this hardening; MEP-51 publishes through it by default.
- uv as the canonical build and dependency tool. `uv` 0.4 (Astral, 2024-09) is the first Rust-native Python toolchain: it replaces `pip`, `pip-tools` (including `pip-compile`), `virtualenv`, and `pyenv` with a single binary. Resolver speed is 10-100x pip's; deterministic lockfiles (`uv.lock`) are first-class; the workflow is `uv build && uv publish`. Poetry, PDM, and Hatch are alternatives but each has friction: Poetry's lockfile format is non-standard, PDM's adoption is smaller, Hatch's matrix scripting is overkill for the single-target Mochi case. uv is the choice.
- The Python platform reaches places no other Mochi target does. Embedded scientific computing (Raspberry Pi, NVIDIA Jetson with PyTorch), bioinformatics pipelines (Biopython, scikit-bio), quantum computing (Qiskit, Cirq), academic research code, ETL pipelines (Airflow, dbt, Prefect), and the entire MLOps tier (MLflow, Weights and Biases, ClearML) are all Python-native. MEP-51 lets Mochi participate.
- Async Python has matured. `asyncio.TaskGroup` (3.11+) gives structured concurrency; `asyncio.timeout` (3.11+) gives time-bounded scopes; `ExceptionGroup` and `BaseExceptionGroup` handling (3.11+) gives multi-failure aggregation. The 2018 era of "callback hell or `asyncio.gather` plus manual cancellation" is over. Mochi's agents lower cleanly onto modern asyncio.
The C target (MEP-45) remains the right choice for embedded targets, single-file distribution, and minimal runtime footprint. The BEAM target (MEP-46) remains the right choice for hot-reload services, distributed pubsub, and OTP supervision. The JVM target (MEP-47) remains the right choice for Maven Central, Loom concurrency, and direct-bytecode performance. The .NET target (MEP-48) remains the right choice for NuGet and NativeAOT. The Swift target (MEP-49) remains the right choice for Apple platforms. The Kotlin target (MEP-50) remains the right choice for Android and KMP. The Python target is the right choice for data science, ML, Jupyter, FastAPI, and PyPI. All seven ship; the user picks per workload.
Specification
This section is normative. Sub-notes under ~/notes/Spec/0051/01..12 are informative.
1. Pipeline and IR reuse
MEP-51 shares the front-end and aotir passes with MEP-45 through MEP-50 and forks at the emit stage:
Mochi source
| parser (MEP-1/2/3, reused)
v
typed AST
| type checker (MEP-4/5, reused)
v
aotir IR
| monomorphisation pass (shared)
| closure conversion pass (shared)
| match decision tree (shared)
| exhaustiveness check (shared)
| sendability inference (shared)
v
Python AST codegen pass (transpiler3/python/lower/)
| build ast.Module via stdlib `ast` API
| emit `from __future__ import annotations` header
| emit PEP 695 type aliases, frozen-slots dataclasses, asyncio classes
v
ast.unparse (stdlib) -> Python 3.12 source text
| ruff format (deterministic layout)
| ruff check --fix --select=I,F401 (import sort, unused removal)
v
.py files under src/<pkg>/generated/
| emit pyproject.toml at project root
| emit src/<pkg>/__init__.py
| emit src/<pkg>/__main__.py (entry point)
v
uv build -> dist/<pkg>-<ver>-py3-none-any.whl + dist/<pkg>-<ver>.tar.gz
uv pip install dist/*.whl -> fresh venv
python -m <pkg> -> stdout captured for vm3 differential
| uv publish --trusted-publishing always (OIDC + sigstore + PEP 740)
v
PyPI release
Both mochi run (vm3) and mochi build (transpiler3) share the same parser, type system, and IR.
2. Build driver UX
mochi build --target=<TGT> <input.mochi> [-o <output>] [--python=3.12|3.13]
Targets:
- `python-wheel`: produces a wheel (`<pkg>-<ver>-py3-none-any.whl`) under `dist/` via `uv build --wheel`. Pure-Python wheel (no compiled extension by default; `mochi_runtime`'s optional C extension is a separate cp312-abi3 wheel published by the runtime project).
- `python-sdist`: produces a source distribution (`<pkg>-<ver>.tar.gz`) under `dist/` via `uv build --sdist`.
- `python-app`: produces a runnable app with a console-scripts entry; `pip install` provides `<pkg>` on the user's `PATH` and `python -m <pkg>` works inside the venv.
- `python-ipykernel`: produces a Jupyter kernelspec under `~/.local/share/jupyter/kernels/mochi-<pkg>/kernel.json` plus a small `mochi_kernel.py` driver that transpiles cells and executes them in an in-process IPython shell. Driven via `mochi build --target=python-ipykernel <input> --install-kernel`.
- `python-source`: emits Python source plus `pyproject.toml` plus `src/<pkg>/` layout without invoking `uv build` (for downstream IDE integration).
The driver invokes uv directly: uv build for wheel and sdist; uv pip install dist/*.whl for the install gate; uv publish for upload; jupyter kernelspec install for the ipykernel target.
3. Name mangling
Mochi names to Python names:
- Module path `mochilang/aiops/Pipeline` to Python package `mochi_user.aiops.pipeline` (lowercase per PEP 8; the prefix is configurable via `--python-package-prefix`, default `mochi_user`).
- Mochi function `process_batch` to Python `process_batch` (snake_case, preserved; Mochi names are already snake_case by convention).
- Mochi function `processBatch` to Python `process_batch` (camelCase converted to snake_case).
- Mochi type `User_Record` to Python `UserRecord` (PascalCase).
- Mochi sum variant `OK` to Python `Ok` (PascalCase dataclass).
- Collisions with Python reserved words (`class`, `def`, `lambda`, `import`, `from`, `as`, `if`, `else`, `elif`, `for`, `while`, `try`, `except`, `finally`, `raise`, `with`, `yield`, `return`, `pass`, `break`, `continue`, `None`, `True`, `False`, `and`, `or`, `not`, `in`, `is`, `global`, `nonlocal`, `async`, `await`, `match`, `case`) are escaped with a trailing underscore (`class_`, `lambda_`).
- Mochi module-level constants (UPPER_SNAKE_CASE) are preserved verbatim.
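The mangling rules above can be sketched in a few lines of Python. This is a minimal illustration, not the emitter's actual code: the helper names (`escape_reserved`, `to_snake_case`, `to_pascal_case`) are hypothetical, and the reserved set is assumed to be the stdlib `keyword.kwlist` plus the soft keywords `match` and `case`. Note that `keyword.kwlist` also contains `del` and `assert`, which the list above does not name; treating them as escaped is an assumption of this sketch.

```python
import keyword
import re

# Assumption: reserved set = hard keywords plus the two soft keywords named in the spec.
RESERVED = frozenset(keyword.kwlist) | {"match", "case"}

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def escape_reserved(name: str) -> str:
    """Append a trailing underscore when the name collides with a reserved word."""
    return name + "_" if name in RESERVED else name


def to_snake_case(name: str) -> str:
    """processBatch -> process_batch; snake_case input passes through unchanged."""
    return escape_reserved(_CAMEL_BOUNDARY.sub("_", name).lower())


def to_pascal_case(name: str) -> str:
    """User_Record -> UserRecord; OK -> Ok."""
    parts = [p for p in _CAMEL_BOUNDARY.sub("_", name).split("_") if p]
    return escape_reserved("".join(p[:1].upper() + p[1:].lower() for p in parts))
```

For example, `to_snake_case("processBatch")` gives `process_batch`, `to_pascal_case("OK")` gives `Ok`, and `to_snake_case("class")` gives `class_`.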
4. Type lowering
Per 06-type-lowering:
- `int` to `int` (Python `int` is arbitrary-precision, matches Mochi).
- `float` to `float` (IEEE 754 double).
- `bool` to `bool`.
- `string` to `str` (PEP 393 flexible internal representation, not UTF-8; length is the code-point count, matching Mochi).
- `bytes` to `bytes`.
- `list<T>` to `list[T]` (mutable; element order is part of Python list semantics).
- `map<K, V>` to `dict[K, V]` (insertion order guaranteed since Python 3.7).
- `set<T>` to a runtime `OrderedSet[T]` wrapper around `dict.fromkeys` (Python `set` does not guarantee insertion order; Mochi sets do). `mochi_runtime.collections.OrderedSet` is the canonical wrapper.
- `record { ... }` to a `@dataclass(frozen=True, slots=True)` class. `frozen=True` gives hashability and immutability; `slots=True` (Python 3.10+) gives memory locality matching Mochi's record layout.
- `sum type T = A | B` to `type T = A | B` (PEP 695 type alias) plus per-variant frozen-slots dataclasses. Exhaustiveness is checked through `match` (PEP 634) over the sealed alias (a catch-all `case _:` final arm is omitted so that mypy and pyright flag missing variants).
- `option<T>` to `T | None` (PEP 604 union syntax, since 3.10).
- `result<T, E>` to a custom `MochiResult[T, E]` type alias over `Ok[T] | Err[E]` (CPython has no stdlib Result; we ship one in `mochi_runtime.result`).
- `fun(T) -> R` to `Callable[[T], R]` from `collections.abc` (since PEP 585, not `typing.Callable`).
- `async fun(T) -> R` to `Callable[[T], Awaitable[R]]` or `Callable[[T], Coroutine[Any, Any, R]]` depending on consumer site (the codegen picks the narrower of the two that satisfies both mypy and pyright).
- `agent` to a custom class wrapping `asyncio.Queue[Message]` plus a `TaskGroup`-launched receive loop.
- `stream<T>` to `AsyncIterator[T]` from `collections.abc`.
- `time` to `datetime.datetime` with `tzinfo` from `zoneinfo`.
- `duration` to `datetime.timedelta`.
- Generics: PEP 695 syntax (`def f[T](x: T) -> T`, `class Box[T]: ...`, `type Vec[T] = list[T]`). No `TypeVar` from `typing` in emitted code; the codegen always uses PEP 695.
5. Module and import layout
- Mochi top-level package `mypkg.foo.bar.baz` lowers to Python module `mypkg/foo/bar/baz.py`.
- Each Python module starts with `from __future__ import annotations` (PEP 563-style lazy annotation evaluation; no runtime cost for type hints).
- Imports are sorted and deduplicated via `ruff check --fix --select=I`.
- `__init__.py` files re-export the public surface: `from .baz import Quux`.
- The generated package lives under `src/<pkg>/generated/`; `src/<pkg>/__init__.py` re-exports from `generated`.
- `src/<pkg>/__main__.py` is the entry point: `from .generated.main import main; main()`.
6. Records as frozen-slots dataclasses
Per 06-type-lowering §5:

    from __future__ import annotations

    from dataclasses import dataclass


    @dataclass(frozen=True, slots=True)
    class User:
        id: int
        name: str
        email: str | None

frozen=True gives immutability and, together with the generated __eq__, a __hash__. slots=True (Python 3.10+) drops the per-instance __dict__ and gives memory locality. kw_only=True is added when the Mochi record has more than three fields, to avoid positional-argument confusion in calls. __match_args__ is auto-generated by @dataclass, so PEP 634 positional match works; keyword-only fields are excluded from __match_args__, so records emitted with kw_only=True are matched by keyword.
Field defaults: Mochi defaults lower to dataclasses.field(default=...) for scalars, field(default_factory=...) for mutable defaults.
7. Sum types as PEP 695 type aliases plus dataclass variants
Per 06-type-lowering §6:

    from __future__ import annotations

    from dataclasses import dataclass


    @dataclass(frozen=True, slots=True)
    class Some[T]:
        value: T


    @dataclass(frozen=True, slots=True)
    class None_:
        pass


    type Option[T] = Some[T] | None_

Pattern match:

    def unwrap_or[T](opt: Option[T], default: T) -> T:
        match opt:
            case Some(value=v):
                return v
            case None_():
                return default

Exhaustiveness: mypy and pyright both flag missing arms when the match is over a sealed type alias. No explicit case _: final arm is emitted; the codegen relies on the type checkers to enforce.
8. Closures and higher-order
Per 05-codegen-design §12:
- Mochi closures lower to Python `lambda` (for single-expression bodies) or nested `def` (for multi-statement bodies).
- Captures are by reference (Python closure semantics). Mochi closures that mutate captures lower to `nonlocal` declarations.
- Higher-order: function-typed parameters use `Callable[..., R]` from `collections.abc`.
- Async closures (closures that call async functions) lower to nested `async def` and the consumer site uses `Awaitable[R]` or `Coroutine[Any, Any, R]`.
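The three closure shapes above might look as follows in emitted Python. The function names are illustrative only, and the sketch is hand-written rather than actual transpiler output.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable


def make_counter(step: int) -> Callable[[], int]:
    total = 0

    def bump() -> int:        # multi-statement body, so a nested def
        nonlocal total        # the closure mutates its capture
        total += step
        return total

    return bump


def make_scaler(factor: int) -> Callable[[int], int]:
    return lambda x: x * factor   # single-expression body, so a lambda


def make_delayed_doubler() -> Callable[[int], Awaitable[int]]:
    async def double(x: int) -> int:   # async closure lowers to a nested async def
        await asyncio.sleep(0)
        return x * 2

    return double
```

Calling `make_counter(2)` twice in a row on the returned closure yields 2 and then 4, which shows the by-reference capture: both calls update the same `total` cell.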
9. Generics via PEP 695
Per 06-type-lowering §9:

    def first[T](xs: list[T]) -> T | None:
        if not xs:
            return None
        return xs[0]


    class Stack[T]:
        def __init__(self) -> None:
            self._data: list[T] = []

        def push(self, x: T) -> None:
            self._data.append(x)

        def pop(self) -> T | None:
            return self._data.pop() if self._data else None


    type Pair[A, B] = tuple[A, B]

No TypeVar from typing is emitted (PEP 695 fully replaces it as of 3.12).
10. Async coloring
Per 05-codegen-design §17, 09-agent-streams §3:
- Mochi async functions (functions that transitively reach an `await` or an agent `call`) lower to `async def`. Sync functions stay as `def`.
- `await` lowers to Python `await`.
- `async for` and `async with` lower from Mochi `for-await` and `with-await`.
- Sendability of captured state is checked at the IR level by Mochi's sendability inference; Python's `asyncio` does not enforce this at runtime, so the Mochi-level check is the gate.
11. Error model
Per 06-type-lowering §7, 02-design-philosophy §11:
- Mochi `Result<T, E>` to `MochiResult[T, E]` (alias) over `Ok[T] | Err[E]`. Both are frozen-slots dataclasses in `mochi_runtime.result`.
- Mochi throwing function `fun parse() -> AST throws ParseError` to Python `def parse() -> MochiResult[AST, ParseError]` (no exceptions; Python exceptions are reserved for `panic` and for boundary FFI).
- Mochi `panic` lowers to `raise RuntimeError(msg)`.
- Multi-failure aggregation from `TaskGroup` failures: `ExceptionGroup` (PEP 654) is caught at the parent and unwrapped to a Mochi-level `MochiResult.Err` carrying a list of inner errors.
12. Agents
Per 09-agent-streams §1-§12:

    from __future__ import annotations

    import asyncio
    from asyncio import Future, Queue, TaskGroup
    from dataclasses import dataclass


    @dataclass(frozen=True, slots=True)
    class IncMsg:
        by: int


    @dataclass(frozen=True, slots=True)
    class GetMsg:
        reply: Future[int]  # reply slot for call-style requests


    type Message = IncMsg | GetMsg


    class CounterAgent:
        def __init__(self, scope: TaskGroup) -> None:
            self._mailbox: Queue[Message] = Queue()
            self._state: int = 0
            # The receive loop is owned by the enclosing TaskGroup.
            scope.create_task(self._loop())

        async def _loop(self) -> None:
            while True:
                msg = await self._mailbox.get()
                match msg:
                    case IncMsg(by=n):
                        self._state += n
                    case GetMsg(reply=fut):
                        fut.set_result(self._state)

        def cast(self, msg: Message) -> None:
            # Fire-and-forget send, usable from synchronous code.
            self._mailbox.put_nowait(msg)

        async def call_get(self) -> int:
            fut: Future[int] = asyncio.get_running_loop().create_future()
            await self._mailbox.put(GetMsg(reply=fut))
            return await fut

Supervision is expressed as nested TaskGroups. On any child failure, all siblings are cancelled (asyncio TaskGroup semantics) and an ExceptionGroup is raised at the parent. This matches OTP's one_for_all. one_for_one (per-child restart without sibling cancellation) is custom: wrap the child task in a try / except loop that restarts on Exception (but not BaseException).
13. Streams
Per 09-agent-streams §13:
- Mochi `stream<T>` lowers to `AsyncIterator[T]` from `collections.abc`.
- A stream-producing function is an `async def` returning an `AsyncIterator[T]`; the body uses `yield`. (CPython `async def` with `yield` makes an async generator.)
- Consume via `async for item in stream:`.
- Backpressure: bounded `asyncio.Queue(maxsize=N)` when Mochi specifies bounded mailboxes; streams default to unbounded.
14. Query DSL
Per 08-dataset-pipeline:
- Mochi `from` / `where` / `select` lowers to Python generator expressions or list comprehensions where the query is finite.
- `group_by` and `order_by` lower to `itertools.groupby` (after `sorted`) and `sorted(key=..., reverse=...)`.
- Joins lower to nested comprehensions or `itertools.product` plus filter.
- Async queries (over streams) lower to `async for` with `aiter` and `anext` helpers from `mochi_runtime.stream`.
15. Datalog
Per 08-dataset-pipeline §5:
- The Datalog engine lives in `mochi_runtime.datalog`. ~800 LOC of pure Python.
- Facts are tuples; rules are recursive functions invoking the engine's fixpoint loop.
- Semi-naive evaluation. Cycle detection via tabling.
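To show what semi-naive evaluation means in practice, here is a minimal sketch for a single hard-coded rule pair (transitive closure over `edge`). It is not the `mochi_runtime.datalog` engine: the function name is hypothetical, there is no general rule representation, and termination on cycles comes from a set difference rather than from tabling.

```python
from __future__ import annotations

Fact = tuple[str, str]


def reachable(edges: set[Fact]) -> set[Fact]:
    """path(x, y) :- edge(x, y).   path(x, z) :- path(x, y), edge(y, z)."""
    # Index edge facts by source node so each join step is a dict lookup.
    successors: dict[str, set[str]] = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    known: set[Fact] = set(edges)   # every path fact derived so far
    delta: set[Fact] = set(edges)   # facts that were new in the previous round
    while delta:
        # Semi-naive step: join only the delta against edge, not all of known.
        derived = {(x, z) for x, y in delta for z in successors.get(y, ())}
        delta = derived - known     # drop already-known facts, so cycles terminate
        known |= delta
    return known
```

A naive evaluator would re-join all of `known` in every round; joining only `delta` avoids re-deriving facts that earlier rounds already produced.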
16. FFI
Per 01-language-surface §10, 06-type-lowering §16:
- Native FFI: `ctypes` for simple C ABI (function pointers, structs, enums) and `cffi` (1.17+) for richer cases (callbacks, opaque pointer types, headerless declarations). Mochi `extern fun foo(...) -> ...` from a C module lowers to a `ctypes.CDLL("libfoo.so").foo` declaration plus an `argtypes` / `restype` annotation. For complex APIs, CFFI's ABI mode is used.
- Pure-Python FFI: Mochi `extern fun foo(...) -> ...` from a Python module lowers to a direct `from <module> import foo` plus a typed wrapper.
- Type stubs (PEP 561): `mochi_runtime` ships a `py.typed` marker and stubs for every public API. Mochi-emitted packages ship a `py.typed` marker.
- Mochi-side exports: Python functions decorated with `@mochi_export` are placed in `__all__` and surface via the package `__init__.py`.
17. LLM and fetch
Per 01-language-surface §11-§12:
- Mochi `llm.generate(...)` lowers to a call into `mochi_runtime.llm.dispatch` which routes to a provider-pluggable backend (OpenAI, Anthropic, local llama.cpp, etc.) via the provider-dispatch table in 04-runtime §14.
- Mochi `fetch(...)` lowers to `httpx.AsyncClient().request(...)`. `httpx` 0.27+ is the canonical async HTTP client (FastAPI default; supports HTTP/2; trio-and-asyncio compatible). `requests` is rejected (sync only, no HTTP/2). `aiohttp` is rejected (heavier, slower release cadence, fewer features).
- Provider API keys: read from environment (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.), never logged, never persisted.
- TLS verification: on by default; `verify=False` is rejected at codegen (Mochi has no opt-out).
18. Build system, testing gates, and reproducibility
Per 10-build-system and 11-testing-gates:
- `pyproject.toml` (PEP 621) with `[build-system] requires = ["hatchling>=1.25"]` and `[project]` metadata.
- `uv build` produces wheel and sdist. `uv publish` uploads via Trusted Publishing.
- `mypy --strict --python-version=3.12` and `pyright --strict` run on every emitted source file.
- `ruff format` and `ruff check --fix --select=I,F401` run for layout and import hygiene.
- `SOURCE_DATE_EPOCH` env var pinned for reproducible wheel SHA. Wheel `RECORD` file entries sorted lexicographically.
- Test gate stack per phase: vm3 byte-equal; mypy strict; pyright strict; ruff format fixed-point; ruff check fixed-point; wheel install plus `python -m <pkg>`; ipykernel `nbconvert` execution; reproducibility byte-identical wheel SHA across two CI hosts.
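The reproducibility gate rests on two ideas: a pinned timestamp and a fixed entry order. The sketch below demonstrates both on a plain zip archive (a wheel is a zip file). It is only an illustration of the principle: the real wheel is produced by hatchling through `uv build`, and `build_archive` and `sha256_hex` are hypothetical helpers. Byte-identical output across hosts additionally assumes the same zlib version on both.

```python
from __future__ import annotations

import hashlib
import io
import os
import time
import zipfile

ZIP_EPOCH_FLOOR = 315532800  # 1980-01-01T00:00:00Z, the earliest date a zip entry can carry


def build_archive(files: dict[str, bytes]) -> bytes:
    """Zip files with sorted names and a timestamp pinned by SOURCE_DATE_EPOCH."""
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", ZIP_EPOCH_FLOOR))
    stamp = time.gmtime(max(epoch, ZIP_EPOCH_FLOOR))[:6]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):                 # lexicographic entry order
            info = zipfile.ZipInfo(name, date_time=stamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3                 # pin the host-OS byte (3 = Unix)
            info.external_attr = 0o644 << 16       # pin file permissions
            zf.writestr(info, files[name])
    return buf.getvalue()


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()
```

Building the same file set twice, even with the input dict in a different order, gives the same digest, which is the property the two-host gate checks at wheel level.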
Phase plan
Eighteen phases mirroring MEP-45 / MEP-46 / MEP-47 / MEP-48 / MEP-49 / MEP-50:
| Phase | Name | Surface | Target fixtures |
|---|---|---|---|
| 1 | Hello world | print, let, int | 5 |
| 2 | Scalars | int/float/bool/str/bytes ops, comparisons | 30 |
| 3.1 | Lists | list literal, index, len, for-each, comprehensions | 25 |
| 3.2 | Maps | dict literal, index, len, keys, values, has, for-each | 25 |
| 3.3 | Sets | OrderedSet wrapper, add, has, len | 15 |
| 3.4 | Lists of records | list[record], comprehensions over records | 20 |
| 4 | Records | frozen-slots dataclass, equality, copy via dataclasses.replace | 35 |
| 5 | Sum types | PEP 695 type alias plus dataclass variants, exhaustive match | 40 |
| 6 | Closures and higher-order | lambda, nested def, Callable, async coloring | 30 |
| 7 | Query DSL | from/where/select, group_by, order_by, joins | 40 |
| 8 | Datalog | facts, rules, recursion, semi-naive eval | 20 |
| 9 | Agents | asyncio.Queue + TaskGroup, cast, call, supervision | 35 |
| 10 | Streams | AsyncIterator, async generators, async for | 25 |
| 11 | async coloring + MochiResult | async def, await, ExceptionGroup, MochiResult Ok/Err | 30 |
| 12 | FFI | ctypes + CFFI for native; type stubs for pure Python | 20 |
| 13 | LLM | llm.generate (provider-pluggable) | 10 |
| 14 | fetch | httpx.AsyncClient against local test server | 15 |
| 15 | Wheel + sdist | uv build, wheel install, python -m execution | covered by all prior |
| 16 | Reproducible build | SOURCE_DATE_EPOCH + sorted RECORD; byte-identical wheel | covered |
| 17 | Jupyter ipykernel | kernelspec install, JupyterLab cell execution, nbconvert | 25 |
| 18 | PyPI Trusted Publishing | OIDC + sigstore + PEP 740 attestations dry-run | gate-only |
Per 11-testing-gates §18, a phase lands only when all gates are green: per-phase fixture corpus, mypy strict, pyright strict, ruff format fixed-point, ruff check fixed-point, wheel install plus python -m, ipykernel nbconvert for Phase 17, reproducibility for Phase 16, PyPI dry-run for Phase 18.
Rationale
The five load-bearing decisions (§Abstract) flow from a single observation: Python's ecosystem position in data science, ML, and scripting is unique, and the typed-Python tooling (mypy, pyright, ruff, uv) has matured to the point where Mochi can emit fully-typed Python that any production reviewer accepts.
The choice of CPython-only (no PyPy, no Cython, no mypyc, no Nuitka, no Pyodide) follows from the goal of a single semantically-stable target. PyPy's JIT is faster on hot loops but lags CPython by approximately one minor version and has subtle stdlib differences. Cython requires per-function annotations and a build step that defeats the wheel-as-pure-Python story. mypyc compiles typed Python to C extensions but the output is opaque, debug-unfriendly, and adds a per-function call cost. Nuitka and PyInstaller produce frozen-app bundles but are not idiomatic distribution paths for libraries and add megabytes of overhead. Pyodide compiles CPython itself to Wasm and is the right v2 candidate for browser delivery but adds a roughly 10 MB CPython-on-Wasm payload and is not a v1 priority.
The choice of both mypy and pyright is forced by their differing inference. mypy is the reference (PEP 484 lead implementation); pyright is faster, IDE-integrated (notably VS Code, where it powers Pylance), and stricter on Protocol and TypedDict. Their intersection is the subset of typed Python that both accept; emitting into that subset is the safest path. Single-checker reliance has historically caused Mochi-emitted code to pass one checker yet raise diagnostics in user IDEs (where pyright runs by default).
The choice of asyncio over Trio follows from ecosystem alignment: FastAPI, httpx, aiohttp, SQLAlchemy 2.x async, Starlette, uvicorn, hypercorn, and the broader async-Python stack all assume asyncio. Trio gives stronger structured-concurrency guarantees (no orphan tasks, no loop.call_later escapes) but is a hard dep that splits ecosystems. AnyIO is a runtime-agnostic adapter, useful for libraries that want to run on both, but Mochi-emitted code is application-shaped, not library-shaped, and the adapter layer is overhead.
The choice of uv over Poetry / PDM / Hatch is forced by speed and standards alignment: uv is 10-100x faster on resolution, has a deterministic uv.lock, uses PEP 621 pyproject.toml directly (no Poetry-specific [tool.poetry]), and supports PEP 517 backends transparently. Poetry's adoption is wide but its lockfile is non-standard. PDM is similar to uv but smaller. Hatch is matrix-heavy and overkill for the Mochi single-target case.
The choice of frozen-slots dataclasses (over Pydantic, attrs, msgspec, NamedTuple, or TypedDict) follows from the goal of zero runtime deps for the record story: dataclasses are stdlib, frozen=True plus slots=True is the canonical immutable record shape, equality and __hash__ are auto-generated, and dataclasses.replace gives Mochi's with expression. Pydantic adds validation overhead and a runtime dep; attrs predates dataclasses and has an overlapping API; msgspec is a fast serialiser, and serialisation is not the v1 priority; NamedTuple is a tuple subclass, so a record would compare equal to any plain tuple with the same positional values; TypedDict is a dict shape, not a class, and lacks __hash__.
The choice of ipykernel (over a custom JupyterLab extension or a Mochi-as-IPython-magic) follows from the user expectation: Jupyter users select kernels via the kernelspec menu; an ipykernel-shaped Mochi kernel is the lowest-friction path. The kernel transpiles each cell on receipt and executes via an in-process InteractiveShell. State persists across cells via the shell's namespace. Custom JupyterLab extensions add UI surface but require Lab-specific plumbing and re-installation on Lab updates.
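The transpile-then-execute loop with a persistent namespace can be sketched without Jupyter at all. Everything here is a stand-in: transpile_cell is a hypothetical identity function, CellRunner is not the real kernel class, and the built-in exec takes the place of the in-process InteractiveShell. This is kernel-side machinery, not emitter output.

```python
from typing import Any


def transpile_cell(mochi_source: str) -> str:
    """Hypothetical stand-in: treats the cell as if it were already Python."""
    return mochi_source


class CellRunner:
    """Hypothetical kernel core: one namespace shared by every cell."""

    def __init__(self) -> None:
        self.namespace: dict[str, Any] = {}

    def run_cell(self, mochi_source: str) -> None:
        python_source = transpile_cell(mochi_source)
        # Stand-in for InteractiveShell.run_cell: each cell executes against
        # the same dict, so earlier bindings stay visible to later cells.
        exec(compile(python_source, "<cell>", "exec"), self.namespace)


runner = CellRunner()
runner.run_cell("x = 20")
runner.run_cell("y = x + 22")
```

The second cell reads x from the first, which is the reuse behaviour the text describes.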
Nuitka and PyInstaller are not in v1 because they produce frozen-app bundles that are not idiomatic Python distribution. v2 may add --target=python-pyinstaller for desktop-app shipping.
Backwards Compatibility
MEP-51 is purely additive. mochi run keeps the vm3 path; existing transpiler3 targets (C, BEAM, JVM, .NET, Swift, Kotlin) are untouched. No Mochi language surface changes. The Python target lands as a new transpiler3/python/ subtree and new tests/transpiler3/python/ fixture corpus. Phase gates ensure no cross-target regression. Existing fixtures must produce byte-equal stdout via the Python path. New CLI flags: --target=python-wheel, --target=python-sdist, --target=python-app, --target=python-ipykernel, --target=python-source.
Reference Implementation
Implementation lives under:
- transpiler3/python/aotir/: Mochi aotir IR consumer (target-agnostic).
- transpiler3/python/lower/: aotir to Python AST codegen pass.
- transpiler3/python/emit/: file writer, pyproject.toml emitter, ast.unparse invocation, ruff invocation.
- transpiler3/python/build/: driver for uv build, uv pip install, uv publish, jupyter kernelspec install.
- runtime/python/mochi_runtime/: PyPI package source.
- tests/transpiler3/python/: fixture corpus and gate tests.
The transpiler3/python codegen pass is approximately 5000 LOC of Go (a Python-AST-shaped model plus ast.unparse driver and ruff invocation). The runtime library mochi_runtime is approximately 7000 LOC of Python across ~40 files plus an optional ~500 LOC C extension for hot-loop primitives (compiled per ABI3 wheel for cp312-abi3). The fixture corpus targets ~430 fixtures by Phase 18 completion. CI runs on x86_64-linux-gnu, aarch64-linux-gnu, aarch64-darwin, and x86_64-windows across CPython 3.12.0 and CPython 3.13.0. Initial implementation total: approximately 12,000 LOC.
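The Python end of that pipeline, an AST handed to ast.unparse, can be sketched with the stdlib ast module alone. The add function and the build_module helper are hypothetical, and how the Go-side model is handed to Python is not shown here.

```python
import ast


def int_name() -> ast.Name:
    return ast.Name(id="int", ctx=ast.Load())


def build_module() -> ast.Module:
    """Hypothetical example: the AST for `def add(a: int, b: int) -> int`."""
    fn = ast.FunctionDef(
        name="add",
        args=ast.arguments(
            posonlyargs=[],
            args=[
                ast.arg(arg="a", annotation=int_name()),
                ast.arg(arg="b", annotation=int_name()),
            ],
            vararg=None,
            kwonlyargs=[],
            kw_defaults=[],
            kwarg=None,
            defaults=[],
        ),
        body=[
            ast.Return(
                value=ast.BinOp(
                    left=ast.Name(id="a", ctx=ast.Load()),
                    op=ast.Add(),
                    right=ast.Name(id="b", ctx=ast.Load()),
                )
            )
        ],
        decorator_list=[],
        returns=int_name(),
        type_params=[],  # field added in Python 3.12
    )
    module = ast.Module(body=[fn], type_ignores=[])
    # Fill in line and column numbers so the tree can also be compiled.
    return ast.fix_missing_locations(module)


source = ast.unparse(build_module())
```

The resulting source string is what a formatter such as ruff would then normalise.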
Dependencies
- MEP-4 (Type System), MEP-5 (Type Inference), MEP-13 (ADTs and Match): Mochi front-end.
- MEP-45 (C transpiler): aotir IR plus shared passes.
- MEP-46 (BEAM transpiler): shared IR confirmation, cross-target gates.
- MEP-47 (JVM transpiler): shared IR confirmation, prior art for direct-bytecode (not used here, Python emits source).
- MEP-48 (.NET transpiler): shared IR confirmation, prior art for typed-source emission.
- MEP-49 (Swift transpiler): shared IR confirmation, prior art for typed-source emission with strict checkers.
- MEP-50 (Kotlin transpiler): shared IR confirmation, prior art for KotlinPoet-shaped emission (Python uses stdlib ast instead).
External:
- CPython 3.12+ (3.12.0 minimum, 3.13.0 supported, 3.14 forward-tracked).
- mypy 1.13+ (the reference type checker).
- pyright 1.1.380+ (the IDE-integrated checker).
- uv 0.4+ (Astral, Rust-native Python toolchain).
- hatchling 1.25+ (PyPA-recommended PEP 517 backend).
- ruff 0.7+ (Astral, Rust-native linter and formatter; replaces black, isort, autoflake, pyflakes, pycodestyle).
- httpx 0.27+ (async HTTP client; runtime dep of mochi_runtime).
- ipykernel 6.29+ (Jupyter kernel infrastructure; only for python-ipykernel target).
- jupyter_client 8.6+ (kernel protocol; transitive via ipykernel).
- bundletool not applicable (Android-only, MEP-50 territory).
- sigstore 3.0+ (for PEP 740 attestations during uv publish).
Runtime deps (in mochi_runtime):
- httpx (for fetch).
- anyio (optional, only via asyncio adapter; no Trio runtime path).
- zoneinfo (stdlib, no extra dep).
- Optional native: cffi 1.17+ for CFFI-mode FFI; ctypes is stdlib.
Open questions
Per 12-risks-and-alternatives §3:
- Q1: Pydantic adapter for Mochi records (zero-cost conversion from frozen-slots dataclass to BaseModel). v1.5 candidate; depends on adoption signal.
- Q2: __match_args__ for positional match over records (the dataclass auto-generates this; question is whether Mochi should also emit named-only matching, more conservative). v1.
- Q3: Free-threaded 3.13 gate (CPython 3.13 --disable-gil build). The asyncio code path is GIL-bound today; free-threaded changes that. v1.5 to v2 candidate.
- Q4: Pyodide bundle for browser delivery. v2 (10MB CPython-on-Wasm cost; tooling separate).
- Q5: mypyc compile pass for hot loops (typed Python to C extension). v2; trades opacity for speed.
- Q6: ipykernel cell-state persistence semantics (reuse module namespace across cells vs reset per cell). v1 defaults to reuse.
- Q7: PyPy compatibility gate as a forward note. v2; resolver lag, stdlib subtleties.
- Q8: Conda / mamba packaging alongside PyPI. v2; conda-forge feedstock submission.
Security considerations
- PyPI supply chain. Trusted Publishing (OIDC) is the only publish path; no long-lived API tokens are stored in CI. PEP 740 attestations (sigstore-signed) are emitted with every wheel. The uv publish --trusted-publishing always flow runs in GitHub Actions with the id-token: write permission.
- pickle is never emitted. Deserialising with the pickle module can execute arbitrary code. Mochi-emitted code uses json (stdlib) or msgspec for serialisation; pickle is forbidden in codegen.
- eval and exec are never emitted. Mochi-emitted Python is statically analysable by mypy and pyright. eval, exec, compile, __import__ (dynamic), and globals() mutation are all absent from the emitter output.
- subprocess use is opt-in. Mochi has no language surface for arbitrary subprocess execution by default; FFI declarations must explicitly request it. mochi_runtime.subprocess is a wrapper with sanitised argv handling and no shell-string interpolation.
- ctypes and CFFI use is opt-in. Mochi FFI declarations are explicit; no auto-loading of arbitrary shared libraries.
- LLM provider API keys. Read from environment (OPENAI_API_KEY, ANTHROPIC_API_KEY, MISTRAL_API_KEY), never logged, never persisted, never serialised into wheel or sdist.
- httpx TLS verification. On by default; verify=False is rejected at codegen.
- Jupyter ipykernel sandboxing. The ipykernel executes user-provided Mochi code in an in-process IPython shell; this is the same trust model as any Jupyter kernel. Untrusted notebook execution requires the standard JupyterLab security policy (token-protected server, no auto-execute on open).
- Reproducibility. SOURCE_DATE_EPOCH pinned; wheel RECORD entries sorted; no __file__ paths leak the build host's filesystem layout into emitted code.
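The "sanitised argv handling and no shell-string interpolation" rule for subprocess use might look like the sketch below. The function run_argv is a hypothetical illustration and not the mochi_runtime.subprocess API, whose actual surface is not given in this document.

```python
import subprocess
import sys
from collections.abc import Sequence


def run_argv(argv: Sequence[str]) -> subprocess.CompletedProcess[str]:
    """Hypothetical wrapper: accepts an argv list only, never a shell string."""
    if isinstance(argv, str):
        # A bare string would invite shell-style interpolation; refuse it.
        raise TypeError("pass an argv list, not a shell string")
    if not argv or not all(isinstance(part, str) for part in argv):
        raise ValueError("argv must be a non-empty sequence of strings")
    # shell=False: arguments reach the child verbatim, with no shell parsing.
    return subprocess.run(
        list(argv), shell=False, capture_output=True, text=True, check=False
    )


result = run_argv([sys.executable, "-c", "print('ok')"])

try:
    run_argv("echo hi; rm -rf /tmp/example")
    rejected = False
except TypeError:
    rejected = True
```

Because no shell is involved, metacharacters such as ; or $ in an argument are passed through as literal text rather than interpreted.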
References
PEPs cited:
- PEP 484. Type Hints. https://peps.python.org/pep-0484/
- PEP 526. Syntax for Variable Annotations. https://peps.python.org/pep-0526/
- PEP 544. Protocols: Structural subtyping. https://peps.python.org/pep-0544/
- PEP 561. Distributing and Packaging Type Information. https://peps.python.org/pep-0561/
- PEP 585. Type Hinting Generics In Standard Collections. https://peps.python.org/pep-0585/
- PEP 593. Flexible function and variable annotations. https://peps.python.org/pep-0593/
- PEP 604. Allow writing union types as X | Y. https://peps.python.org/pep-0604/
- PEP 612. Parameter Specification Variables. https://peps.python.org/pep-0612/
- PEP 621. Storing project metadata in pyproject.toml. https://peps.python.org/pep-0621/
- PEP 634. Structural Pattern Matching: Specification. https://peps.python.org/pep-0634/
- PEP 654. Exception Groups and except*. https://peps.python.org/pep-0654/
- PEP 657. Include Fine Grained Error Locations in Tracebacks. https://peps.python.org/pep-0657/
- PEP 660. Editable installs for pyproject.toml based builds. https://peps.python.org/pep-0660/
- PEP 669. Low Impact Monitoring for CPython. https://peps.python.org/pep-0669/
- PEP 692. Using TypedDict for more precise **kwargs typing. https://peps.python.org/pep-0692/
- PEP 695. Type Parameter Syntax. https://peps.python.org/pep-0695/
- PEP 698. Override Decorator for Static Typing. https://peps.python.org/pep-0698/
- PEP 703. Making the Global Interpreter Lock Optional in CPython. https://peps.python.org/pep-0703/
- PEP 740. Index support for digital attestations. https://peps.python.org/pep-0740/
CPython release notes:
- What's New In Python 3.12. https://docs.python.org/3.12/whatsnew/3.12.html
- What's New In Python 3.13. https://docs.python.org/3.13/whatsnew/3.13.html
Tool documentation:
- mypy. https://mypy.readthedocs.io/
- pyright. https://github.com/microsoft/pyright
- uv. https://docs.astral.sh/uv/
- hatchling. https://hatch.pypa.io/latest/
- ruff. https://docs.astral.sh/ruff/
- httpx. https://www.python-httpx.org/
- ipykernel. https://ipykernel.readthedocs.io/
- sigstore-python. https://github.com/sigstore/sigstore-python
- PyPI Trusted Publishing. https://docs.pypi.org/trusted-publishers/
Sibling MEPs:
- MEP-45 (C transpiler).
- MEP-46 (BEAM transpiler).
- MEP-47 (JVM transpiler).
- MEP-48 (.NET transpiler).
- MEP-49 (Swift transpiler).
- MEP-50 (Kotlin transpiler).
Research notes
Twelve research notes elaborate the design:
- 01-language-surface: Mochi sub-languages mapped onto Python 3.12 lowering obligations.
- 02-design-philosophy: Why Python, why 3.12 floor, why mypy plus pyright, why asyncio over Trio, why uv over Poetry, why hatchling over setuptools.
- 03-prior-art-transpilers: Coconut (functional dialect to Python), Hy (Lisp to Python), mypyc (typed Python to C), Cython, Nuitka, RPython, Numba, py2nim and rust-to-python efforts.
- 04-runtime: stdlib usage, asyncio, httpx, zoneinfo, dataclasses, mochi_runtime layout, optional C extension.
- 05-codegen-design: Python AST emission via stdlib ast, ast.unparse, ruff format and check, aotir IR reuse.
- 06-type-lowering: Type-by-type mapping to Python 3.12 (frozen-slots dataclass, PEP 695 type alias, Callable, AsyncIterator, MochiResult).
- 07-python-target-portability: Platform matrix (CPython 3.12 and 3.13, Linux x86_64 and arm64, macOS arm64, Windows x86_64), free-threaded 3.13 forward note, PyPy and Pyodide rejected.
- 08-dataset-pipeline: Query DSL lowering via generator expressions plus itertools plus AsyncIterator, Datalog engine in pure Python.
- 09-agent-streams: Mochi agents as a custom class wrapping asyncio.Queue plus a TaskGroup-launched receive loop, streams as AsyncIterator, structured concurrency via TaskGroup and ExceptionGroup.
- 10-build-system: pyproject.toml plus hatchling, uv build, uv publish, PyPI Trusted Publishing, sigstore, PEP 740 attestations, ipykernel install.
- 11-testing-gates: Per-phase Go test gates, CPython version matrix, mypy and pyright strict, ruff fixed-point, wheel install, ipykernel nbconvert, reproducibility.
- 12-risks-and-alternatives: Risk register; PyPy, Cython, mypyc, Nuitka, Pyodide, Poetry, PDM rejected for v1 and why.
The 18-phase delivery plan walks from hello-world through scalars, collections, records, sums, closures, queries, datalog, agents, streams, async coloring, FFI, LLM, fetch, then wheel and sdist, reproducibility, Jupyter ipykernel, and PyPI Trusted Publishing. Each phase is gated against vm3, mypy, pyright, and ruff.