Skip to main content

Phase 3. Records, lists, maps, sets

FieldValue
MEPMEP-45 §Phases · Phase 3
StatusIN PROGRESS
Started2026-05-22 22:07 (GMT+7)
Landed
Tracking issue
Tracking PR

Gate

Records / collections fixture suite (~80 cases) compiles + runs byte-equal vs vm3 on host triple.

Goal-alignment audit

To be written before sub-phase 3.0 starts. Records + collections unlock realistic data-shaping code; without them no useful Mochi program compiles. Aligns.

Sub-phases

#ScopeStatusCommitPR
3.0Record types: struct mochi_R in source field order; field access; record literals; equalityLANDED
3.1list<T>: per-T mochi_list_<T> (i64 / f64 / bool / str); literals [e, ...]; xs[i]; len(xs); append; for x in xsLANDED
3.2map<K,V>: per-(K,V) open-addressing hash table; literals {k: v, ...}; m[k]; len; keys; values; k in m; for k in mLANDED
3.3set<T>: per-T open-addressing hash set; +, -, contains, len, forDEFERRED
3.4Monomorphisation pass + nested collection types: per-record list / map runtimes, nested collections, empty-literal-with-annotation, list / map equalityIN PROGRESS
3.4alist<R> (records as list elements): per-record mochi_list_<R> struct + 4 helpers emitted into the TU prologueLANDED
3.4blist<list<T>> (T scalar primitive: int / float / bool / string): per-inner mochi_list_list_<inner> struct + 4 helpers in the TU prologue; outer literal / indexing / outer + inner len / outer append / nested for-in / pass + return across function boundaryLANDED
3.4cEmpty literal with type annotation (let xs: list<int> = [], let m: map<int,int> = {})NOT STARTED
3.4dlist / map equality (==, !=)NOT STARTED
3.4emap<K, list<V>> (lists as map values)NOT STARTED
3.4flist<map<K,V>> (maps as list elements)NOT STARTED
3.5omap<K,V> (insertion-order map): hash table + parallel insertion-order list (needed by Phase 8)DEFERRED

Decisions made

Phase 3.0 (records)

  • Started: 2026-05-22 22:07 (GMT+7)
  • Goal-alignment audit: Records are the floor for every later phase that touches data. Without type R { ... }, neither Phase 3.1 (list<R>), Phase 3.2 (map<K, R>), Phase 4 (sum types over records), Phase 6 (string/IO with structured input), nor Phase 8 (query DSL row shape) can land. The user-facing goal moves: a Mochi program that defines a struct, instantiates it, accesses fields, and prints can now be AOT-compiled to a single C TU. Aligns.
  • Flat Type enum + parallel RecordName strings. aotir.Type stays an int-backed enum with one new variant (TypeRecord). The record's identity rides on a parallel RecordName string on every IR node that can carry a record value (Param, LetStmt, VarRef, Function.ReturnRecordName, CallExpr.ResultRecordName, FieldAccess.ResultRecordName). Rationale: refactoring Type into a struct would force a touch on every type-compare site in the verifier, lower, and emit; the parallel-string approach changes ~12 lines per carrier instead.
  • Field-order normalisation in the lowerer, not the verifier. lowerStructLit reorders the user-supplied literal arguments into the record-declaration order. The verifier then strictly enforces decl-order (the emit pass relies on this so designated-init args render in struct-field order, keeping the C source byte-deterministic for byte-equal inputs).
  • Pass 0 collects records before Pass 1 collects fn signatures. This means a fun mk(): R can reference type R even when the type decl is below the fun in source order (Mochi has no forward decls).
  • No nested records in 3.0. A record field whose type is itself a record is rejected at lower-time. The IR already plumbs RecordField.RecordName and FieldAccess.ResultRecordName for the eventual unlock; the per-record mochi_eq_<Name> helper has the nested-record branch wired but never reached. Lands properly with 3.0.x once a fixture set exists.
  • No print(record) in 3.0. Whole-record printing requires a per-record formatter (deferred to a later sub-phase if needed). print(r.field) works for any scalar field; that covers every fixture.
  • No field assignment in 3.0. r.f = v is rejected: records are value-semantics, so the surface would mean "rebuild and rebind r", which the syntax does not express. Re-assign the whole binding instead.
  • Equality is structural and field-wise. BinEqRec lowers to mochi_eq_<Name>(a, b), a per-record helper that pairwise-compares each field with the appropriate primitive operator (== for scalars, strcmp for strings). BinNeRec is !mochi_eq_<Name>(...). Cross-record-type comparisons are rejected at verify-time. This also unlocks direct string equality via BinEqStr / BinNeStr (== / != on two string operands lower to strcmp(a,b) == 0 / != 0).
  • C99 designated init for record literals. A Pt { x: 1, y: 2 } literal emits as (struct mochi_Pt){.x = INT64_C(1), .y = INT64_C(2)} (a C99 compound literal). Lifetime: the lowerer always binds the literal to a name (or passes it as an arg), so the storage extends to the surrounding block scope under C99 rules.
  • Records are passed and returned by value. static struct mochi_R foo(struct mochi_R p). No * / heap allocation; the C ABI handles it. This matches Mochi's value-semantics.
  • Field naming clash with Mochi keywords. Mochi reserves words like on (event syntax). Record fields named with such words are rejected by the surface parser, not by the lowerer. Two fixtures originally used on and were renamed to enabled before the record-bool gate would parse.

Phase 3.1 (lists)

  • Started: 2026-05-22 22:38 (GMT+7)
  • Goal-alignment audit: Lists unlock realistic data-flow code: collecting values across loop iterations, holding parsed lines from input, threading results through append in a fold. Without them the surface tops out at scalar arithmetic. Phase 3.2 (map<K, V>) reuses the per-T runtime + monomorphisation pattern that 3.1 introduces; Phase 6 (string/IO) needs list-of-string to model line streams; Phase 8 (query DSL) needs list-of-record as the row sequence. So 3.1 is the smallest step that moves the user-facing goal and the largest pre-req for everything after. Aligns.
  • Scalar element types only in 3.1 (int, float, bool, string). Record and list element types are rejected at lower-time with a Phase 3.1 diagnostic. The IR carries ElemType on every list-bearing carrier (parallel-field pattern, same as Phase 3.0's RecordName); list<R> and list<list<T>> will unlock by widening the lower-time predicate and adding mochi_list_<RecordName> instantiations in Phase 3.4 monomorphisation. The runtime ships only the four scalar instantiations; the i64 path is the reference and the f64 / bool / str helpers are direct copies with the element type substituted.
  • Functional append semantics. xs = append(xs, v) returns a new list with a freshly-malloced buffer and leaves the input list's buffer untouched. Two reasons. First, this matches vm3's OpAppend oracle: vm3 builds a new list value, the AOT-C path must too, or any program that holds an alias to the pre-append list will diverge byte-for-byte. Second, in-place grow with shared backing storage is a Phase 18 (perf) optimisation that requires escape analysis to make safe; we'd rather pay an O(n) memcpy per append now than ship a divergence-prone aliasing model and clean it up later. Empty lists allocate no buffer (data = NULL, len = cap = 0); the first append mallocs a 1-element buffer.
  • Empty list literal [] is rejected. A bare [] has no element-type information for the lowerer to pick a runtime instantiation, and the surface has no annotation form yet. Programs that need an empty list must use let xs: list<int> = [] once Phase 3.4 lands a typed-empty path, or seed with one element. Rejection message: "empty list literal: Phase 3.1 requires at least one element so the element type can be inferred".
  • xs[i] is a runtime bounds check, not a verified static one. The index expression always lowers to mochi_list_<T>_index(xs, i) which compares against len and trips mochi_panic_index() on out-of-range (exit code 4, MOCHI_ERR_INDEX). Negative indices are rejected (vm3 does not have Python-style negative indexing; the AOT path must agree). Slice syntax (xs[a..b]) is rejected in 3.1 with a Phase 3.1 diagnostic; it lands with the string/list slice unification in Phase 6.
  • C99 compound literal for list-lit buffer storage. [1, 2, 3] emits as mochi_list_i64_lit((const int64_t[]){INT64_C(1), INT64_C(2), INT64_C(3)}, INT64_C(3)). The compound literal lives in automatic storage at the enclosing block, then mochi_list_i64_lit memcpys it into a heap-malloced buffer that the list value owns; this keeps the list literal source-deterministic and avoids a per-fixture static initialiser.
  • Per-T runtime now, monomorphisation later. 3.1 ships four hand-written instantiations of the list helpers (one per scalar element type, ~40 LOC each). This is small enough to maintain by hand and avoids dragging the monomorphisation pass forward. Phase 3.4 collects the concrete (op, T) set the program uses and emits exactly the helpers needed; the runtime then drops the unused per-T helpers from libmochi. Until then, every program links against all four instantiations regardless of which it uses; dead-code elimination at the linker handles the bloat.
  • for x in xs is a length-bounded numeric loop. The for-each lowers to a fresh block that captures the list into a __mochi_list_x temp, queries len once into __mochi_len_x, and iterates 0..len with __mochi_i_x, fetching x = mochi_list_<T>_index(__mochi_list_x, __mochi_i_x). The capture + length-cache pattern matches vm3's iterator snapshot semantics (mutation of the underlying list inside the loop body does not change the iteration set). The temp names are mangled with the loop variable to avoid nested-for collisions.
  • Lists are passed and returned by value. static mochi_list_i64 foo(mochi_list_i64 xs). The struct header (data + len + cap) copies; the buffer is shared until the next append allocates a fresh one. This matches Mochi's value-semantics surface and the C ABI handles the by-value struct passing without any heap-alloc hops.
  • Lifetime: leak on program exit in 3.1. No free() calls. The OS reclaims at process tear-down. Phase 7 (error model + panic unwinding) adds a real reclamation path; until then the lists keep their buffers for the full program lifetime. The tradeoff is intentional: a refcounted or arena-backed scheme adds runtime complexity that the fixture suite would not exercise, and shipping the user-facing surface earlier is the goal.

Phase 3.2 (maps)

  • Started: 2026-05-22 23:00 (GMT+7)
  • Gate green: 2026-05-22 23:25 (GMT+7), 22 fixtures passing (TestPhase3Maps).
  • Goal-alignment audit: Maps are the smallest collection that lets a Mochi program model lookup tables (word counts, lookup-by-id, configuration, indices into other collections). Without them every program that needs associative storage has to fall back to two parallel lists + linear scan. Phase 3.5 (omap<K,V>) inherits the entire per-(K,V) runtime + parallel-field plumbing this phase introduces; Phase 8 (query DSL) needs map<K, list<R>> as the group-by accumulator shape; Phase 9 (streams/agents) needs map<id, agent> as the actor-registry shape. So 3.2 is the smallest pre-req that unlocks four downstream phases worth of code. Aligns.
  • Hand-rolled open-addressing, not cwisstable. The original MEP table named cwisstable as the underlying table. We swapped to a hand-written open-addressing + linear-probe + load-factor-0.5 implementation (~120 LOC under MOCHI_MAP_DEFINE) for three reasons. First, no vendored C dep keeps the runtime portable across the eventual cross-target matrix (Phase 11) without picking up <emmintrin.h> or <arm_neon.h> SIMD intrinsics. Second, the SplitMix64 finalizer for i64 keys matches runtime/vm3/maps.go::hashI64 byte-for-byte, so fixtures that count hash-table collisions stay deterministic across the vm3 oracle and the AOT path. Third, ~120 LOC of macro-expanded C beats ~3kLOC of vendored SIMD when the fixtures top out at ~100 entries; cwisstable's perf advantage doesn't appear until Phase 18 perf work, where the table swap is a one-symbol-set substitution behind the same mochi_map_<K>_<V>_* ABI.
  • Eight (K,V) instantiations in 3.2. Keys are restricted to int and string; values to the four scalars (int, float, bool, string). The runtime ships all eight under mochi_map_<K>_<V>_* symbol names; the lowerer rejects record / list keys or values at lower-time with a phase-named diagnostic. Map-of-record and nested maps land with Phase 3.4 monomorphisation, same as the list-of-record path.
  • KeyType + ValueType parallel fields on every map-bearing IR carrier. Same parallel-field pattern as Phase 3.0's RecordName and Phase 3.1's ElemType. Param, Function.ReturnKeyType / ReturnValueType, LetStmt, VarRef, CallExpr.ResultKeyType / ResultValueType, and the six map exprs (MapLit, MapGetExpr, MapHasExpr, MapLenExpr, MapKeysExpr, MapValuesExpr) carry the parallel pair beside Type==TypeMap. The alternative (Type as a struct, or a registry-of-instantiations) would touch every type-compare site in the verifier; the parallel-field approach changes ~20 lines per carrier instead.
  • Sort-on-iteration via insertion-sort snapshot. Both for k in m and keys(m) / values(m) snapshot the live entries into a fresh buffer, then insertion-sort by key (< for i64, strcmp for strings). This matches vm's key-sorted iteration order so byte-equal stdout against the vm3 oracle is achievable. Insertion sort suffices at fixture scale (~100 entries worst case); a merge-sort or heap-sort drop-in is a Phase 18 perf task behind the same snapshot interface.
  • Functional / immutable in 3.2. No m[k] = v, no delete(m, k). Maps are constructed once from a literal (or let m = mk()) and queried thereafter. Mutation lands in a later sub-phase that also brings list mutation; until then m = {...} rebinds the whole map. This matches the vm3 oracle's value-semantics for map literals.
  • k in m is the membership test, no has() surface. The Mochi type-checker recognises in as a binary operator that produces bool when the right operand is a map<K, V>. The lowerer special-cases that pattern in lowerBinary to emit a MapHasExpr instead of a BinaryExpr. We also wire a has(m, k) builtin in the lowerer so future fixtures that prefer call syntax still work; both surfaces lower to the same MapHasExpr node.
  • m[k] panics on missing key in 3.2; no Option yet. A failed lookup trips mochi_panic_index() (exit code 4, MOCHI_ERR_INDEX); programs that must probe first should use k in m and only then index. The Mochi type-checker types m[k] as option<V> on the surface, so any arithmetic that uses m[k] directly fails type-check today; fixtures sum via values(m) or index after a successful in check. Real option-aware unwrap arrives with Phase 4 (option types).
  • for k in m is sugar for for k in keys(m). The lowerer synthesises a MapKeysExpr inline and re-uses the existing ForEachStmt. No new IR node, no new emit-time path; the emitter doesn't even know the loop's source was a map. This keeps the for-each shape stable across collection types.
  • Empty map literal {} is rejected (mirrors list rule). A bare {} has no (K, V) for the lowerer to pick an instantiation, and the surface has no annotated-empty form yet. A future typed-empty path lands with Phase 3.4 once the lowerer threads annotated-binding types into literal-time resolution. Rejection message: "empty map literal: Phase 3.2 requires at least one entry so the key and value types can be inferred".
  • Lifetime: leak on program exit, same as 3.1. No free() for table buffers or the per-call snapshot allocations under keys() / values(). Phase 7 (panic unwinding) adds a real reclamation path. The per-call snapshot allocations are a known leak proportional to call count, accepted for fixture-scale programs.

Phase 3.3 (sets): DEFERRED

  • Audit date: 2026-05-22 23:55 (GMT+7)
  • Goal-alignment audit verdict: DEFERRED. Mochi has no set<T> surface syntax. parser/ast.go has no SetLit or SetType node; types/kinds.go has no SetType (only ListType, MapType, GroupType); the type-checker recognises a distinct(list) builtin that dedupes a list into another list, not a set value. Implementing set<T> in transpiler3 alone would produce a runtime + IR + emit chain that no Mochi program can reach: the parser would reject let s: set<int> = {1, 2} before the lowerer ever ran. The user-facing goal moves by zero, every Phase 3.3 LOC would be unreachable scaffolding. Per the goal-alignment-audit rule we defer.
  • Unblock condition. Adding set<T> to MEP-45's user-facing surface requires Mochi parser + type-checker work that belongs in a different MEP (likely the Mochi language MEP track, not the AOT-backend MEP-45). Once Mochi gains set<T> parsing + a SetType kind, Phase 3.3 reactivates: per-T open-addressing hash set (4 instantiations: i64 / f64 / bool / str), + (union), - (difference), contains(s, e), len(s), for e in s key-sorted iteration. The runtime drops into transpiler3/c/runtime/{include/mochi/set.h, src/set.c} as a value-slot-elided cousin of mochi_map_<K>_<V>_*; the IR + lower + emit chain mirrors Phase 3.2 exactly, swapping the (K,V) pair for a single (T) element.
  • No fixture corpus today. Without a parsable Mochi surface there is nothing to fixture against the vm3 oracle. TestPhase3Sets is not authored; the gate row in the table reads DEFERRED and will not block downstream phases.

Phase 3.4a (list: records as list elements)

  • Started: 2026-05-22 23:55 (GMT+7)
  • Gate green: 2026-05-23 00:25 (GMT+7), 17 fixtures passing (TestPhase3MonoListRecord).
  • Goal-alignment audit: list<R> is the row-sequence shape that Phase 8 (query DSL) and most realistic data programs depend on. Without it the language can manipulate scalars and individual records but cannot accumulate them into the sequences a real workload produces. 3.4a is the smallest step that unlocks that surface: pass list<R> to a function, append onto it, iterate it, return it. The remaining 3.4 sub-phases (b: nested collections, c: typed-empty literal, d: list / map equality) are independent and can land in any order on top of this foundation. Aligns.
  • Per-record helpers emitted into the TU prologue, not libmochi. The scalar list helpers (mochi_list_i64_*, etc.) ship in libmochi because the element type is known up front. For list<R> the element type is a user-defined struct mochi_<R> that only exists inside the program's translation unit; emitting the per-record list runtime into libmochi would require either pre-declaring every conceivable record (impossible) or punching a generic void-pointer hole in the runtime (defeats the type-safety goal of monomorphisation). Instead, the emitter walks the program after emitRecordDecls, collects every unique record name used as a list element across function signatures + bodies, and emits a mochi_list_<R> struct + 4 helpers (_lit, _append, _index, _len) per record into the TU prologue. The helpers are static to keep them TU-local. The shape is a direct port of mochi_list_i64's source, with int64_t replaced by struct mochi_<R>. Same heap-malloced-buffer / value-semantics / leak-on-exit ownership model.
  • ElemRecordName parallel field on every list-bearing IR carrier. Same pattern as Phase 3.0's RecordName and Phase 3.2's KeyType / ValueType. Param, Function.ReturnElemRecordName, LetStmt, VarRef, CallExpr.ResultElemRecordName, ListLit, IndexExpr, LenExpr, AppendExpr, ForEachStmt each carry the parallel string beside Type==TypeList / ElemType==TypeRecord. The verifier strictly enforces that ElemRecordName is non-empty when ElemType==TypeRecord, and that the named record is declared. The alternative (Type as a struct, or threading a per-instantiation registry through every type-compare site) would touch ~50 sites in the verifier; the parallel-string approach adds ~10 lines per carrier.
  • listSuffix returns the record name for list<R>. Scalar suffixes (i64, f64, bool, str) resolve to libmochi exports. The record suffix resolves to the TU-local mochi_list_<R>_* helper family. Everything downstream (literal emit, index emit, len emit, append emit, for-each emit) is record-agnostic at the call site: it queries listSuffix(elem, elemRec) and emits the same shape of helper call regardless of whether T is a primitive or a record.
  • cListBufferElemC extended to TypeRecord. The compound-literal array temporary that backs a [r1, r2, r3] list literal needs a per-record element type; the helper now returns struct mochi_<R> when elem == TypeRecord. The C99 designated-init records emitted by lowerRecordLit slot directly into this temp, so the list-of-record literal renders as mochi_list_<R>_lit((const struct mochi_<R>[]){(struct mochi_<R>){.x=1,.y=2}, ...}, INT64_C(2)) without any extra heap hop.
  • <stdlib.h> is pulled in only when list-of-record helpers are emitted. malloc lives in <stdlib.h>, and the per-record helpers need it. But unconditionally including <stdlib.h> exposes the program to name clashes with stdlib symbols (abs, div, exit); we saw the abs collision in the Phase 2.2 fun_early_return fixture, which defines its own abs. The fix: the emitter calls collectListRecordElems(prog) first, and the #include <stdlib.h> line is only added when that set is non-empty. The scalar list helpers continue to get their malloc from list.c's own translation unit so this header is needed only for the TU-local helpers.
  • isListElemType widened to accept TypeRecord. The lowerer's predicate that gates which element types list<T> may carry now accepts TypeInt, TypeFloat, TypeBool, TypeString, and TypeRecord. Nested list / nested map element types remain rejected with a phase-named diagnostic pointing at Phase 3.4b. The verifier's parallel predicate widened the same way; verifyListLit, verifyIndexExpr, verifyLenExpr, verifyAppendExpr all accept TypeRecord element and require ElemRecordName to be non-empty + declared.
  • exprElemRecordName plumbed through every list-flowing site. A new lowerer helper that mirrors exprElemType: it pulls the elem record name out of whichever Expr variant carries one (VarRef, ListLit, CallExpr, AppendExpr, IndexExpr). The lowerer threads it through assign / return / call-args / for-each / append, and the verifier rechecks the invariant on every carrier. The IndexExpr case is new in 3.4a because list<R> indexing returns a record (the receiver's ElemRecordName becomes the result's RecordName); scalar list indexing returns a primitive so no record name flows through that path.
  • Typechecker gap closure: op.Field in checkPostfix. ps[0].x parses as Selector(ps) + IndexOp(0) + FieldOp(x). The selector-tail fast path in checkPrimary collapses Root.tail field walks into the symbol lookup, but it only fires when the entire chain is selector-prefix. As soon as an Index or Call step lands between the root and the field, the loop in checkPostfix takes over, and that loop had branches for Index / Call / Cast / SafeField / SafeIndex but not for the bare op.Field. The result: the typechecker left typ as the record-typed receiver and the caller saw a type mismatch ("expected int, got Pt"). The fix adds a Field branch that mirrors checkPrimary's StructType / Methods / unknown-field paths plus an AnyType passthrough. The bug went undetected because every prior list-of-record fixture passed ps[0].x to print (which accepts any type and asks the lowerer, not the typechecker, for the C type); only a function with an explicit non-record return type surfaced the gap.
  • Lifetime: leak on program exit, same as 3.1. No free() for the per-record buffers. Phase 7 (panic unwinding) adds reclamation. The per-record helpers are direct copies of the scalar list helpers' ownership model.

Phase 3.4b (list<list<T>>: nested lists of scalars)

  • Started: 2026-05-23 00:43 (GMT+7)
  • Gate green: 2026-05-23 01:21 (GMT+7), 17 fixtures passing (TestPhase3NestedLists).
  • Goal-alignment audit: list<list<T>> is the smallest nested-collection shape that unlocks realistic data programs: matrices, rows-of-columns, group-by accumulators that haven't yet grown a key (Phase 3.4e brings the keyed form). Without it the language can hold scalar-keyed flat lists but cannot model jagged 2-D data or a "list of rows" where each row is itself a sequence. Phase 8 (query DSL) needs the keyed map-of-list form, but the unkeyed list-of-list is the structural floor — 3.4e and 3.4f reuse exactly the parallel-field plumbing this phase introduces. The remaining 3.4 sub-phases (c: typed-empty literal, d: list / map equality, e: map-of-list, f: list-of-map) are independent and can land in any order on top of this foundation. Aligns.
  • Scope: scalar inner element types only. The inner element type T is restricted to the four scalar primitives (int / float / bool / string). list<list<R>> (records in the inner list), list<list<list<T>>> (3-level nesting), and list<list<map<...>>> are rejected at lower-time with phase-named diagnostics. Rationale: the per-inner helper family is parameterised on the inner suffix (one of i64 / f64 / bool / str), so a record inner would need a fresh (outer, R) instantiation that mixes the record-list strategy (TU-local per-record struct) with the nested-list strategy (TU-local per-inner struct). Worth its own gate; deferred to 3.4b.1 when a fixture pulls on it.
  • Parallel-field InnerElemType on every list-bearing IR carrier. Same parallel-field pattern as Phase 3.0's RecordName, Phase 3.1's ElemType, Phase 3.2's KeyType / ValueType, and Phase 3.4a's ElemRecordName. Param.InnerElemType, Function.ReturnInnerElemType, LetStmt.InnerElemType, VarRef.InnerElemType, CallExpr.ResultInnerElemType, ListLit.InnerElemType, IndexExpr.InnerElemType, LenExpr (carried via receiver), AppendExpr.InnerElemType, ForEachStmt.InnerElemType each carry the parallel aotir.Type beside Type==TypeList && ElemType==TypeList. The verifier strictly enforces that InnerElemType is one of the four scalar primitives when ElemType==TypeList, and zero (TypeUnknown) otherwise. The alternative (Type as a struct, or threading a per-instantiation registry through every type-compare site) would touch every type-compare site in the verifier and lower; the parallel-field approach adds one field per carrier and is mechanically a copy of the 3.4a path.
  • IndexExpr.ElemType semantic: "produced value type". For a list<list<int>> receiver, xs[0] produces a list<int> value, so IndexExpr.ElemType is TypeList and InnerElemType is TypeInt. This differs subtly from VarRef.InnerElemType (which is "the T inside list<list<T>>" for the variable's declared type); the disambiguation is that IndexExpr.ElemType describes the produced value, not the receiver's element shape. The lowerer's exprInnerElemType helper gates InnerElemType extraction on producedElem == TypeList so callers (lowerLenCall, lowerAppendCall, lowerForEach) don't accidentally treat a flat list's ElemType==TypeInt (with InnerElemType==TypeInt per IndexExpr's "T is element" convention) as a nested list.
  • Per-(outer, inner) TU-local helpers, not libmochi. Same rationale as Phase 3.4a's per-record helpers: the inner element type is mochi_list_<inner> (a libmochi struct), but the outer carrier mochi_list_list_<inner> is a composite that only exists inside the program's TU once it actually appears. Emitting one of every (outer, inner) combination into libmochi would multiply the runtime surface by 4 with no payoff. Instead, the emitter walks the program after emitListRecordHelpers, collects the set of inner scalar types actually used in list<list<T>> contexts (via collectListListInners), and emits a mochi_list_list_<inner> struct + 4 helpers (_lit, _append, _index, _len) per inner type into the TU prologue. The helpers are static to keep them TU-local. Shape is a direct port of mochi_list_i64's source with the element type substituted by mochi_list_<inner>.
  • listSuffix widened to return "list_<inner>" for nested lists. Scalar suffixes (i64, f64, bool, str) resolve to libmochi exports. The record suffix (3.4a) resolves to TU-local mochi_list_<R>_*. The nested-list suffix list_<inner> resolves to TU-local mochi_list_list_<inner>_*. Everything downstream (literal emit, index emit, len emit, append emit, for-each emit) is shape-agnostic at the call site: it queries listSuffix(elem, elemRec, innerElem) and emits the same shape of helper call regardless of whether T is a primitive, a record, or another list. The scalarListInnerSuffix(inner) helper handles the inner-suffix mapping for the per-helper emit.
  • cListBufferElemC widened to nested lists. The compound-literal array temporary that backs an [[1, 2], [3]] outer literal needs a per-(outer, inner) element type; the helper now returns mochi_list_<inner> when elem == TypeList. Each inner list literal renders as mochi_list_<inner>_lit(...) and lifts directly into the outer compound-literal array, so the full nested literal renders as mochi_list_list_i64_lit((const mochi_list_i64[]){mochi_list_i64_lit((const int64_t[]){1, 2}, 2), mochi_list_i64_lit((const int64_t[]){3}, 1)}, 2) without any extra heap hop.
  • Foreach induction-var C type emitted directly via scalarListInnerSuffix. When iterating for row in mat where mat : list<list<T>>, the induction var row has C type mochi_list_<inner> (the inner list type), NOT mochi_list_list_<inner> (which would describe the outer carrier). The default cTypeFull(TypeList, "", innerElem, ...) path renders the outer C type by composing the suffix, so the emit foreach path special-cases s.ElemType == TypeList and computes elemC = "mochi_list_" + scalarListInnerSuffix(s.InnerElemType) directly. The bug this avoids: emitting mochi_list_list_i64 row; and trying to assign mochi_list_i64_index(__mochi_list_row, ...) into it, which the C compiler correctly rejects as an incompatible-pointer-target.
  • Aliasing of inner buffers is intentional and benign in 3.4b. The outer-list helpers do a header-only memcpy (each element is a mochi_list_<inner> struct value, which itself holds a heap data pointer + len + cap). That means an outer literal [a, b] where a and b are pre-bound inner lists shares the inner data buffers between the outer's elements and the original a / b bindings. Cross-mutation (appending to a after embedding it in the outer list) would diverge from vm3's eager-copy semantics. Phase 3.4b's fixtures are read-only at the inner level, so no fixture exercises this divergence; a proper deep-copy on outer-literal construction lands when a fixture pulls on it (or with Phase 18 perf when escape analysis lets us elide it where safe).
  • <stdlib.h> include condition widened to also cover nested lists. Same condition as 3.4a: the per-(outer, inner) helpers need malloc, and that lives in <stdlib.h>. The include is added only when at least one of collectListRecordElems(prog) or collectListListInners(prog) is non-empty, to keep the Phase 2.2 fun_early_return fixture (which defines its own abs) free of name clashes with stdlib symbols.
  • Lifetime: leak on program exit, same as 3.1. No free() for the per-(outer, inner) buffers. Phase 7 (panic unwinding) adds reclamation. The per-inner helpers are direct copies of the scalar list helpers' ownership model with the element type substituted.
  • Reject-test surface unchanged for the cases that still reject. Phase 3.1's let xs = [[1, 2], [3]] rejection moves from "Phase 3.4" to "accepted by 3.4b"; the nested_list row in lower_reject_test.go is removed in this phase. The empty-list-literal rejection (let xs = []) remains, deferred to 3.4c. The list-of-record element rejection (which 3.4a unlocked) remains accepting. No other reject-test rows shift.

Phase 3.5 (omap): DEFERRED

  • Audit date: 2026-05-22 23:55 (GMT+7)
  • Goal-alignment audit verdict: DEFERRED. Same blocker as 3.3: Mochi has no omap<K, V> surface. There is no OmapLit, no OmapType, no insertion-order-map literal syntax that distinguishes from the existing map<K, V> (which is key-sorted in the vm3 oracle). Implementing omap<K, V> in transpiler3 alone would be unreachable scaffolding for the same reason.
  • Unblock condition. When Mochi gains omap<K, V> parsing + an OMapType kind (likely driven by Phase 8 query DSL's group-by row shape), Phase 3.5 reactivates: open-addressing hash table for O(1) lookup + a parallel insertion-order list for iteration (the only difference vs Phase 3.2's map<K, V>). Eight (K,V) instantiations matching Phase 3.2.
  • Phase 8 ripple effect. If the Phase 8 query DSL's group-by row needs insertion-order iteration, Phase 3.5 must reactivate before Phase 8 ships. The Phase 8 spec doc should call this out when it begins. Until then 3.5 stays DEFERRED with no fixture corpus.

Deferred work

Concurrent-safe maps: not in v1 (use a stream/agent boundary).

  • Phase 3.0: print(record), field assignment (r.f = v), nested-record fields, generic record decls, methods on records. All have unblocked paths in the IR (no struct shape changes needed) and will land as 3.0.x sub-phases or with Phase 4.
  • Phase 3.1: list<R> (landed with Phase 3.4a); list<list<T>> for T scalar (landed with Phase 3.4b); list<list<R>> and 3-level nesting (deferred to 3.4b.1 once a fixture pulls on it); print(list) (deferred until a per-T list formatter lands); list slice xs[a..b] (deferred to Phase 6); list equality xs == ys (deferred to 3.4d); in-place mutation aliasing model (deferred to Phase 18 perf); empty-list literal with type annotation (deferred to 3.4c).
  • Phase 3.2: map<K, R> / map<K, list<V>> / map<R, V> (deferred to 3.4 monomorphisation); m[k] = v and delete(m, k) mutation (deferred along with list mutation); print(map) (deferred until a per-(K,V) map formatter lands); map equality m == n (deferred to 3.4); option-aware unwrap of m[k] for arithmetic (deferred to Phase 4 option types); empty-map literal with type annotation (deferred to 3.4); per-call snapshot allocations under keys() / values() accepted as a known leak until Phase 7 reclamation.

Closeout notes

Fill in after gate green.