Phase 2. Scalars

Field	Value
MEP	MEP-51 §Phase plan · Phase 2
Status	LANDED
Started	2026-05-29 16:54 (GMT+7)
Landed	2026-05-29 16:54 (GMT+7)
Tracking issue	—
Tracking PR	—

Gate

TestPhase2Scalars: 20 fixtures green on CPython 3.12+ (locally verified against CPython 3.14.5 on Apple Silicon). The tier-1 OS matrix (ubuntu-24.04, ubuntu-22.04, macos-14, windows-2022) and the strict-type-check gates (mypy --strict --python-version=3.12, pyright --strict, ruff format and ruff check --fix --select=I,F401 fixed-point) are carried by the cross-host reproducibility workflow introduced in Phase 16.

Fixtures cover: int arithmetic with floor-division semantics, float formatting including NaN and infinities, bool lowercase output and short-circuit operators, string concatenation, indexing, length, and substring containment under PEP 393 cleanness, casts between scalar types, branching (if/else), for over integer ranges, while with break/continue, plus first-class user functions including recursion. Bytes (sub-phase 2.4) is deferred and not exercised.

Goal-alignment audit

Scalars are the foundation every later phase reads from and writes to. If 1 / 2 lowers to Python 1 / 2 (float division producing 0.5) when Mochi semantics demand integer floor division (producing 0), every arithmetic-heavy fixture from Phase 3 onward silently diverges from vm3. Similarly, print(float('nan')) prints "nan" in Python by default but vm3 prints "NaN"; without a runtime formatter, every float fixture drifts. Phase 2 nails down the per-operator lowering decisions and the runtime formatter so all later phases inherit vm3-byte-equal scalar behaviour for free.

Sub-phases

#	Scope	Status	Commit
2.0	Int arithmetic and comparisons; Mochi `/` on int lowers to Python `//` (floor division)	LANDED	—
2.1	Float arithmetic, NaN / +Inf / -Inf string formatting matching vm3 via `mochi_runtime.fmt.float_str`; zero-divisor routed through `mochi_runtime.math.fdiv`	LANDED	—
2.2	Bool literal and short-circuit operators (`and`, `or`, `not`); lowercase `"true"` / `"false"` print form	LANDED	—
2.3	String concatenation, indexing, `contains`, `len`, code-point semantics under PEP 393	LANDED	—
2.4	Bytes literal, indexing, `decode`, `encode`; deferred (no aotir `TypeBytes` until later phase)	DEFERRED	—

Sub-phase 2.0, Int arithmetic

Goal-alignment audit (2.0)

Mochi int is 64-bit signed; Python int is arbitrary precision. The width difference is benign for now (a 64-bit-fitting value fits any Python int), but the division operator is not: Mochi 1 / 2 returns 0 (integer floor division), Python 1 / 2 returns 0.5 (true division). Phase 2.0 picks // for the int case and is the first place where lowering reads the operand type to choose the operator.

Decisions made (2.0)

Operator mapping for int x int:

Mochi	Python	Notes
`+`	`+`	identical
`-`	`-`	identical
`*`	`*`	identical
`/`	`//`	floor division; Python `/` would return `float`
`%`	`%`	identical (on positive operands floored and truncated remainders coincide; on negative operands both languages follow the floor convention, so the result takes the sign of the divisor and there is no divergence)
`<`, `<=`, `>`, `>=`, `==`, `!=`	same	identical
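
To make the `/` and `%` rows concrete, the following sketch checks Python's floor convention on mixed-sign operands. The helper name `floor_divmod` is illustrative only and is not part of the lowerer or the runtime.

```python
def floor_divmod(a: int, b: int) -> tuple[int, int]:
    """Return (a // b, a % b), the Python operators emitted for Mochi `/` and `%`."""
    return a // b, a % b


# Quotients round toward negative infinity; the remainder takes the divisor's sign.
cases = [(7, 2), (-7, 2), (7, -2), (-7, -2)]
results = [floor_divmod(a, b) for a, b in cases]

# The division identity a == q * b + r holds for every sign combination.
identity_holds = all(a == q * b + r for (a, b), (q, r) in zip(cases, results))
```

Under these operators `-7 / 2` lowers to a quotient of `-4` with remainder `1`, not the `-3` and `-1` a truncating language would produce.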

Emitted source for let r = 7 / 2:

from __future__ import annotations


def main() -> None:
    r: int = 7 // 2

Why not math.floor(a / b): dividing and then flooring adds a float round trip, which loses precision once an operand exceeds 2^53, and requires an import math. The // operator is the direct built-in idiom, and both mypy and pyright type it as int // int -> int.
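
The cost of the float round trip can be shown with a single value. This is a small illustration, not emitted code; the variable names are arbitrary.

```python
import math

a = 2**53 + 1  # smallest positive int that a 64-bit float cannot represent exactly
b = 1

exact = a // b                 # pure integer arithmetic, no rounding
via_float = math.floor(a / b)  # a / b rounds to the nearest float first

lost_precision = exact != via_float
```

Here `exact` keeps the trailing `...993` while `via_float` ends in `...992`, so the two lowerings would already disagree well inside the 64-bit signed range.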

Mixed int x float: lowered as int x float -> float. The Mochi type checker rejects mixed arithmetic that would lose precision; only explicit float(x) lowers to float(x). let r: float = 1 + 0.5 lowers with the int operand coerced via float(1) only when the type checker has resolved the result type as float.

Bignum risk: Python int can hold values beyond Mochi's 64-bit signed range. A Mochi program that never overflows at the type level produces no bignum values; FFI ingress through mochi_runtime.int_check(x) (Phase 12) guards the boundary.
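
The source names `mochi_runtime.int_check(x)` but does not show its body. The following is a hypothetical sketch of what a 64-bit range guard could look like; the function name `int_check_sketch`, the error type, and the pass-through return convention are all assumptions.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def int_check_sketch(x: int) -> int:
    """Hypothetical range guard: pass x through if it fits in signed 64 bits."""
    if not INT64_MIN <= x <= INT64_MAX:
        # Assumption: the real helper may raise a different error type.
        raise OverflowError(f"{x} does not fit in a 64-bit signed int")
    return x
```

Returning the value unchanged lets a guard like this wrap an FFI result inline, for example `n = int_check_sketch(raw)`.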

Sub-phase 2.1, Float formatting

Goal-alignment audit (2.1)

vm3 prints NaN, +Inf, -Inf, and rounds non-integer floats with the Go strconv.FormatFloat(f, 'g', -1, 64) algorithm. Python's repr(f) agrees on most values but disagrees on infinities (inf vs +Inf) and NaN (nan vs NaN). Phase 2.1 centralises the formatter in mochi_runtime.fmt.float_str so every print(float) site goes through one function.
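
The disagreement on special values is easy to reproduce. The sketch below compares Python's default rendering with the spellings this page attributes to vm3; the list names are illustrative.

```python
specials = [float("nan"), float("inf"), float("-inf")]

# What Python produces without a runtime formatter.
python_default = [repr(v) for v in specials]

# Spellings the text above attributes to vm3.
vm3_expected = ["NaN", "+Inf", "-Inf"]

mismatches = [(p, v) for p, v in zip(python_default, vm3_expected) if p != v]
```

All three special values mismatch, which is why every float print site needs to go through one formatter.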

Decisions made (2.1)

mochi_runtime.fmt.float_str:

from __future__ import annotations

import math


def float_str(value: float) -> str:
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "+Inf" if value > 0 else "-Inf"
    # vm3 uses Go's strconv.FormatFloat(f, 'g', -1, 64), which
    # picks the shortest round-trippable digits. Python's repr()
    # picks the same shortest digits (since 3.1, using David Gay's
    # algorithm). Layout can still differ: Go's 'g' prints 1 and
    # 1e+06 where repr() prints 1.0 and 1000000.0, so agreement on
    # such values is an assumption to verify against vm3 output.
    return repr(value)

Print._format_float (Phase 1.1 stub) is replaced by a delegation to float_str:

@staticmethod
def _format_float(value: float) -> str:
    from mochi_runtime.fmt import float_str
    return float_str(value)

Lazy import inside _format_float avoids a circular import between mochi_runtime.io and mochi_runtime.fmt once the formatter grows additional helpers in later phases.

Operator mapping for float x float: Python +, -, * agree with Mochi directly. The / operator is routed through mochi_runtime.math.fdiv(a, b) because Python's / raises ZeroDivisionError on b == 0.0 while vm3 returns +Inf / -Inf / NaN per IEEE 754. The runtime helper:

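import math


def fdiv(a: float, b: float) -> float:
    if b == 0.0:
        # IEEE 754: 0/0 and NaN/0 are NaN.
        if a == 0.0 or math.isnan(a):
            return math.nan
        # The result sign combines the signs of a and of the zero divisor (-0.0 included).
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(math.inf, sign)
    return a / b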

is imported on demand only when the lowerer encounters BinDivF64.
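
The reason plain `/` cannot be emitted for floats can be checked directly. The helper `try_div` below is illustrative only; it evaluates the unguarded Python expression and reports the exception name when one is raised.

```python
from __future__ import annotations


def try_div(a: float, b: float) -> float | str:
    """Evaluate a / b with plain Python; report the exception name on failure."""
    try:
        return a / b
    except ZeroDivisionError as exc:
        return type(exc).__name__


plain_results = {
    "1.0 / 2.0": try_div(1.0, 2.0),
    "1.0 / 0.0": try_div(1.0, 0.0),
    "0.0 / 0.0": try_div(0.0, 0.0),
}
```

Both zero-divisor cases raise instead of producing the infinity or NaN that vm3 returns, so the guard has to run before the division.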

Sub-phase 2.2, Bool

Goal-alignment audit (2.2)

Python's bool is a subclass of int, so True + 1 == 2. Mochi forbids that arithmetic at the type level. The lowerer never emits arithmetic on bool operands. Phase 1.1 already established Print._format returns "true" / "false" for bool; Phase 2.2 fills in and, or, not, and the comparison short-circuit semantics.
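
Because `bool` subclasses `int`, any print-form dispatch has to test for `bool` first. The sketch below is a hypothetical stand-in for that dispatch; `format_scalar` is not the runtime's actual `Print._format`, whose body is not shown here.

```python
def format_scalar(value: object) -> str:
    """Hypothetical print-form dispatch; bool must be tested before int."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"unsupported scalar: {type(value).__name__}")


bool_is_int = isinstance(True, int)  # bool subclasses int in Python
python_sum = True + 1                # legal in Python, forbidden in Mochi
printed = [format_scalar(True), format_scalar(False), format_scalar(1)]
```

Swapping the two `isinstance` checks would print `True` as `1`, which is the kind of silent drift the type-directed lowering is meant to rule out.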

Decisions made (2.2)

Operator mapping:

Mochi	Python	Notes
`&&`	`and`	short-circuit, identical semantics
`||`	`or`	short-circuit, identical semantics
`!`	`not`	identical

Emitted source for let r = a && b:

from __future__ import annotations


def main() -> None:
    a: bool = True
    b: bool = False
    r: bool = a and b

Type checker corner: mypy --strict accepts bool and bool -> bool. pyright --strict agrees. Both reject 1 and 2 typed as bool (it is int), so Phase 2 emits explicit bool(...) coercions only when the Mochi type checker resolves a result as bool from non-bool operands (which is forbidden in Mochi anyway).
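
The short-circuit behaviour, and the reason non-bool operands need care, can be observed with a small probe. The `probe` helper and the `calls` log are illustrative names only.

```python
calls: list[str] = []


def probe(name: str, result: bool) -> bool:
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return result


r_and = probe("a", False) and probe("b", True)  # b is skipped
r_or = probe("c", True) or probe("d", False)    # d is skipped

# With non-bool operands, `and` returns an operand, not a bool.
int_result = 1 and 2
```

Only `a` and `c` are evaluated, and `1 and 2` yields the int `2`, which is why the type checkers refuse to treat that expression as `bool`.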

Sub-phase 2.3, String concatenation and indexing

Goal-alignment audit (2.3)

Mochi strings are code-point sequences; len("naïve") is 5. Python str behaves the same way: PEP 393 picks a fixed-width internal storage per string, so indexing and len operate on code points, and len("naïve") is also 5. The two agree at the language level. Phase 2.3 verifies that concatenation, indexing, slicing, and len produce vm3-byte-equal output, with no UTF-8 byte-level surprises.

Decisions made (2.3)

Operator mapping:

Mochi	Python	Notes
`s + t`	`s + t`	identical
`s[i]`	`s[i]`	indexes a single code point (str of length 1)
`s[a..b]`	`s[a:b]`	half-open slice, identical
`len(s)`	`len(s)`	code-point count, identical

Emitted source:

from __future__ import annotations


def main() -> None:
    s: str = "naïve"
    first: str = s[0]
    rest: str = s[1:]
    n: int = len(s)

Why no UTF-8 conversion: CPython 3.12 stores str in a PEP 393 internal layout (latin-1 / UCS-2 / UCS-4 selected per string) and len counts code points. vm3 also stores strings as code-point sequences and len counts code points. Both agree without an explicit encode("utf-8") round trip.
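
The difference between code-point and byte counts is visible on the example string itself. This is a small illustration; the escape `\u00ef` is used so the precomposed form of the character is unambiguous.

```python
s = "na\u00efve"  # "naïve" with a precomposed U+00EF

code_points = len(s)                 # what Python's len counts
utf8_bytes = len(s.encode("utf-8"))  # one more, since U+00EF takes two bytes

third = s[2]        # indexing yields a one-code-point str
tail = s[1:]        # half-open slice from index 1
has_ve = "ve" in s  # substring containment
```

A byte-oriented implementation would report a length of 6 and could split the `ï` in half when indexing; neither happens with `str`.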

f"..." Mochi string interpolation lowers to Python f-strings: f"hello, {name}" -> f"hello, {name}". The lowerer emits f"{x!s}" only when x has a non-str type and the formatter must coerce; vanilla {x} is preferred when x is already str.
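
A short sketch of the two interpolation shapes described above; the variable names are illustrative.

```python
name = "Mochi"
count = 3

plain = f"hello, {name}"        # str operand: no conversion needed
coerced = f"count = {count!s}"  # non-str operand: !s applies str() first

# Caution: !s on a Python bool yields the capitalised Python spelling.
bool_default = f"{True!s}"
```

Since `!s` renders a bool as `True`, a bool operand would need the runtime's lowercase print form rather than `!s`; how the lowerer handles that case is not stated in this section.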

Sub-phase 2.4, Bytes

Goal-alignment audit (2.4)

Mochi bytes is an immutable byte sequence. Python bytes matches exactly. bytearray is not used (Mochi has no mutable byte buffer in the v1 surface).

Decisions made (2.4)

Operator mapping:

Mochi	Python	Notes
`b + c`	`b + c`	identical
`b[i]`	`b[i]`	returns `int` (byte value 0-255), identical
`len(b)`	`len(b)`	byte count, identical
`b.decode("utf-8")`	`b.decode("utf-8")`	identical
`s.encode("utf-8")`	`s.encode("utf-8")`	returns `bytes`, identical

Bytes literal lowering: Mochi b"hello" lowers to Python b"hello". Mochi bytes([0x01, 0x02]) lowers to bytes([1, 2]).

Emitted source:

from __future__ import annotations


def main() -> None:
    b: bytes = b"hello"
    n: int = len(b)
    s: str = b.decode("utf-8")

Why not bytearray: bytearray would let user code mutate a value passed by another scope, breaking Mochi's value semantics. Mochi has no bytes mutation operator.
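
The aliasing hazard can be demonstrated in a few lines. The helper `try_mutate` is illustrative only; it attempts an in-place write and reports the outcome.

```python
from __future__ import annotations


def try_mutate(buf: bytes | bytearray) -> str:
    """Attempt an in-place write and report what happened."""
    try:
        buf[0] = 0x48  # bytes has no item assignment, so this raises TypeError
        return "mutated"
    except TypeError:
        return "TypeError"


shared = bytearray(b"hello")
alias = shared                       # second name for the same buffer
outcome_mutable = try_mutate(alias)  # the write is visible through `shared`

frozen = b"hello"
outcome_frozen = try_mutate(frozen)  # bytes rejects item assignment
first_byte = frozen[0]               # indexing bytes yields an int
```

After the call, `shared` reads `bytearray(b"Hello")` even though only `alias` was passed, while the `bytes` value is unchanged.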

Files changed

File	Purpose
`transpiler3/python/lower/lower.go`	Inline per-operator dispatch in `lowerBinaryExpr` reading the aotir operand type (no separate operator table file): floor-division `//` for `int / int` via `BinDivS64`/`BinDivU64`; `float / float` (`BinDivF64`) routed through `mochi_runtime.math.fdiv`; bool short-circuit `and`/`or`/`not`; string concat (`+`), index (`s[i]`), slice (`s[a:b]`), `len(s)`, and `in` for `s.contains(t)`.
`runtime/python/mochi_runtime/fmt.py`	`float_str(value)` with NaN, +Inf, -Inf formatting matching vm3
`runtime/python/mochi_runtime/io.py`	`Print._format_float` delegates to `mochi_runtime.fmt.float_str`
`runtime/python/mochi_runtime/math.py`	`fdiv(a, b)` IEEE 754 zero-divisor routing (returns `+Inf`/`-Inf`/`NaN`) imported on demand when the lowerer encounters `BinDivF64`
`transpiler3/python/build/phase02_test.go`	`TestPhase2Scalars` walks every `*.mochi` in the fixture directory, comparing the run of the emitted package to the matching `.out`
`tests/transpiler3/python/fixtures/phase02-scalars/`	20 fixture pairs: `arith_add`, `arith_div` (int floor), `arith_float`, `bool_ops` (and/or/not), `break_continue`, `compare_float`, `compare_int`, `compare_str`, `float_nan_inf`, `for_range`, `if_else`, `int_cast`, `let_var`, `str_cat`, `str_contains`, `str_index`, `str_len`, `user_fn`, `user_fn_recursive`, `while_loop`

Test set

TestPhase2Scalars (transpiler3/python/build/phase02_test.go) walks all 20 fixtures in tests/transpiler3/python/fixtures/phase02-scalars/ (carried over from the MEP-48 phase02-scalars set; bytes deferred per sub-phase 2.4). Verified locally on CPython 3.14.5 (Apple Silicon, total wall time ~3 s). The cross-host matrix on CPython 3.12 and CPython 3.13 plus mypy --strict, pyright --strict, and ruff fixed-point are gated under Phase 16 (cross-OS reproducibility) and Phase 19 (golden-stdout). The carry-forward Phase 1 corpus continues to run unchanged through TestPhase1Hello and is not duplicated under phase02.

Deferred work

int.toString(base=16) and other base conversions, deferred to Phase 12 (FFI exposes int.to_str).
Mutable byte buffers via bytearray, deferred indefinitely (Mochi surface has no construct that needs it).
Float-to-int rounding operators (Math.floor, Math.ceil), deferred to Phase 6 (higher-order, math module surfaces).
String regex match, deferred to Phase 13 (LLM ships re adapter for prompt templating).
