MEP 4. Type System
| Field | Value |
|---|---|
| MEP | 4 |
| Title | Type System |
| Author | Mochi core |
| Status | Informational |
| Type | Informational |
| Created | 2026-05-08 |
| Revised | 2026-05-11 |
| Requires | MEP 2, MEP 3 |
| Informs | MEP 5, MEP 10, MEP 11, MEP 12 |
Abstract
Mochi has a small type system: a handful of primitive kinds, lists and maps, nominal records and unions, a function arrow, and any as the escape hatch. There is no global solver. Equality, unification, and the numeric ladder are spelled out as code, not derived from a calculus. That makes the system easy to extend and easy to misread, and this document is the place to fix the misreading. The rules below are normative for consumers; the problem list near the end is the audit of what the system claims and what it actually delivers.
Motivation
Most of the surprises in a small type system are not in the kinds, they are in the relations between kinds. Two of Mochi's relations, unify and equalTypes, disagree about whether int is the same type as int64. The function-purity flag is computed but never enforced. Reading a missing map key returns the zero value of the element type, which is fine for int and broken for a struct without a sensible zero. These are not abstract concerns. They are the kind of thing a reader of types/check.go will discover by accident, and a reader of the previous version of this document would not have learned at all.
A type system is a contract between the parser and everything downstream. When the contract is documented vaguely, downstream consumers end up encoding their own assumptions and the assumptions diverge. The point of this MEP is to write the contract down once, in language that survives a refactor.
Terminology
A few words are used throughout with their standard programming-language-theory meaning. Kind means a category of types, in the sense of Pierce, TAPL chapter 29: IntType is a kind, ListType is a kind constructor, and the language has no kind-of-kinds. Nominal types are equal only when their names match; structural types are equal when their shapes match. Mochi's records are nominal, its function arrows are structural. Top is the type every value is convertible to; in Mochi this is any. The judgement Γ ⊢ e : τ reads "under environment Γ, expression e has type τ" and is used in the inference rules below.
One word is being retired. The kind that used to be called void, and is the result type of print and the default for a block-bodied function without a return annotation, is now unit. The rename landed in #21221 and is documented as MEP 3 invariant 15. The old name still appears in target-language transpilers where it is the target language's keyword, not Mochi's type.
The Type interface
Every kind satisfies one Go interface:
type Type interface {
String() string
}
Nothing else. Equality, unification, and subtyping are functions defined over the kind cases, not methods on the interface. This is a trade: adding a kind is a small change in types/check.go (define the struct, add a String method, add cases to unify and equalTypes) while adding a comparison rule touches every kind. The trade favours the change that happens more often, which in Mochi has been kind additions.
Kinds
The catalogue lives in types/check.go and runs from IntType through FuncType.String(). The kinds split into six groups.
Primitive kinds carry no parameters. IntType is the default integer width and prints as int; the runtime stores it in a Go int64 but the surface name stays int. Int64Type exists to give now() a return type that is honestly 64-bit, and to leave the door open if a future port wants to make int cheaper than int64. FloatType is IEEE 754 binary64. BigIntType and BigRatType are arbitrary precision integer and rational respectively. StringType is a UTF-8 byte sequence; the runtime adds rune-aware operations on top. BoolType is two-valued. UnitType, printed as unit, is the result of statement-only operations and the canonical default return for a block-bodied function with no annotation.
Structural kinds carry parameters and are equal by shape. ListType{Elem} is a homogeneous ordered sequence indexable from zero. MapType{Key, Value} is an unordered hash map. OptionType{Elem} is declared and accepted by the type resolver, but is not produced by inference today except for the read path of m[k]. GroupType{Key, Elem} is internal, carried by the into Name clause of a group by query; it shows up in MEP 5 and rarely outside the inference layer.
Nominal kinds are equal by name. StructType carries Name, a Fields []StructField ordered slice that doubles as the iteration sequence and the lookup table (via FieldType/FieldMap/FieldNames helpers), and a Methods map[string]Method table. The slice replaced the previous parallel Fields map[string]Type plus Order []string representation in #21256. UnionType carries Name and Variants map[string]StructType; the variant order from the declaration is captured at type-declaration time but the map itself is unordered (see the problem list).
The function kind is FuncType{Params, Return, Pure, Variadic}. Pure is propagated but not enforced, tracked in #21188. Variadic is the variadic element type and is nil when the function is not variadic; Params is the fixed prefix. The split (rather than the older "last entry of Params is repeated" convention) was the MEP-4 P13 fix, landed in #21253.
The polymorphism kind is TypeVar{Name}, the only kind whose String method has a pointer receiver. It carries a name and exists for the surface syntax of generics; inference does not yet propagate it through call sites. The plan to wire it up is MEP 12.
The top kind is AnyType{}. It prints as any and behaves as a wildcard in unify (in both directions) and as equal-only-to-itself in equalTypes. The split is the largest single source of imprecision in the system and is documented in detail in MEP 11.
A few absences are worth naming explicitly. There is no nominal alias kind; type Id = int collapses to int at resolve time, which means a checker error message cannot say "expected Id, got int". There is no tuple kind; heterogeneous fixed-length sequences either get a generated struct or fall back to [any]. There is no row variable; width subtyping is decided by struct name plus field set. There is no effect kind; purity is a flag, not a row. There is no reference kind; mutability is a property of the binding (var versus let), not of the type. There is no first-class type; types are not values.
The typing context
The environment Γ is types.Env in types/env.go. It is a linked stack of frames. Each frame holds maps for variable types, mutability flags, function declarations, struct types, union types, stream row types, agent declarations, and model aliases, plus a map of runtime values that the static checker should ignore. Lookups walk parents on miss. Scopes close by replacing env with env.parent. The runtime-values map is on the same struct as the static-type map, which is convenient for the interpreter and a layering wart for everything else; the problem list returns to this.
Equality
equalTypes(a, b) in types/infer.go (around line 828) decides whether two types are the same. The rules:
any is equal only to any. If either side is AnyType, the other side must also be. This is the strict reading of any; it is not the reading unify uses.
Composite kinds recurse on their parameters. [T] equals [U] when T equals U; map and option follow the same pattern.
A UnionType is equal to a StructType whose name appears in the union's variants. This is what lets a match arm typed at a variant satisfy a union-typed binding.
The int/int64 carve-out treats the two as equal in both directions, and either of them as equal to itself reflexively. This is the only place in the system where two distinct numeric kinds are conflated at the equality layer rather than only at the operator layer. The carve-out exists because now() returns int64 and code that writes let t = now() would otherwise see a type-mismatch when t is later used in an int context. The cost is that equalTypes and unify disagree on this case; the problem list comes back to it.
Everything else falls through to reflect.DeepEqual, which works because StructType names are unique and FuncType field sets are simple. The fallback is fragile around Methods because Method.Decl is a *parser.FunStmt and DeepEqual will walk the whole AST tree. In practice this has not bitten yet, but it is the kind of bug that surfaces months after it lands.
Unification
unify(a, b, subst) in types/check.go (around lines 144 to 381) is the relation the checker uses to decide whether two types can be reconciled at an argument site or a branch boundary. It is not the same relation as equalTypes; the two were originally written for different jobs and have drifted.
The rules.
any is a wildcard in both directions. unify(any, T) == unify(T, any) == true. This is the escape hatch. A function typed (any) -> any accepts any argument and returns a value the caller has to take on faith. Tightening this without breaking too much existing code is the main job of MEP 11.
*TypeVar either substitutes through subst or, when subst is nil, succeeds only against another *TypeVar with the same name. The substitution carries instantiations across a single unify call; nested calls share the same map so a variable bound at one position is observed at the next.
Composite kinds recurse on their components. [T] unifies with [U] when T unifies with U. Map, option, group, and function arrows follow the same shape.
A StructType unifies with a StructType when their names agree, or either is anonymous, and the field maps match component-wise. A StructType unifies with a UnionType when the struct's name appears in the variant set, in which case it unifies with that variant. Two UnionType values unify only when their names agree.
The numeric kinds form a small subtyping ladder inside unify. IntType unifies with IntType or BigIntType. Int64Type unifies with Int64Type or IntType. BigIntType unifies with the integer family. BigRatType unifies with the entire integer-and-float chain. FloatType unifies only with itself. This is the unification view of the tower; the operator view below is different again. UnitType, StringType, and BoolType are reflexive only.
The shape of unify is "structural equality with any and TypeVar as wildcards", except for the numeric ladder, which is one-directional subtyping written in the same machine. The asymmetry is the second problem on the list and the proximate cause of the disagreement with equalTypes.
The numeric tower
Three relations on numeric kinds coexist and they are not the same:
Equality: int == int int == int64 (the only equal pairs)
Unification: IntType <-> {IntType, BigIntType}
Int64Type <-> {Int64Type, IntType}
BigIntType <-> {BigIntType, IntType, Int64Type}
BigRatType <-> {BigRatType, FloatType, IntType, Int64Type, BigIntType}
FloatType <-> {FloatType}
Operator widening: see the operator table below
The unification view and the operator view diverge on purpose. An int + float widens to float at the operator layer because that is the result of the addition; nothing has been assigned anywhere, so widening is harmless. At an argument site, where the type of the receiving variable is fixed, the same widening would silently drop precision, so unify(IntType, FloatType) is false. This split is principled and worth keeping. The split between equalTypes and unify on int/int64 is not principled and is on the problem list.
Operator typing
inferBinaryType in types/infer.go (around line 72) decides the result type of a binary expression once both operands have been inferred. Precedence is decided by the parser ladder (MEP 2); the table below is per-operator.
Arithmetic operators (+, -, *, /, %):
| Operand shape | Result |
|---|---|
/ with both sides int | int |
One side int64, the other int or int64 | int64 |
Either side bigrat | bigrat |
Either side float, both sides numeric | float |
Either side bigint | bigint |
Both sides int | int |
+ on two lists with equal element type | [T] |
+ on two lists with differing element type | [any] |
+ on two strings | string |
| Otherwise | any |
The list + row that widens to [any] is a soundness gap, #21187. The general "otherwise → any" row is broader than is sound; tightening it is gated on MEP 11.
Comparison (==, !=, <, <=, >, >=) always returns bool. The checker does not currently require the operands to be comparable, so 1 == "x" types as bool rather than producing a diagnostic.
Logical (&&, ||) returns bool when both operands are bool and any otherwise. Membership (in) returns bool when the right side is a list, map, or string and any otherwise. The query set operators (union, union_all, except, intersect) take two lists and produce the list of equal element type, or [any] when the element types disagree.
Type formation
resolveTypeRef(t *parser.TypeRef, env *Env) in types/check.go (around line 1346) maps surface type references into the kind catalogue. The four cases are straightforward.
A Fun reference produces FuncType. Missing return annotation resolves to UnitType per MEP 3 invariant 15.
A Generic reference resolves list<T> and map<K, V> to ListType and MapType. Any other generic name resolves to AnyType. The fallthrough is broader than it should be; an unknown generic name is almost always a typo.
A Struct reference (an inline structural record) produces an unnamed StructType with Name: "". Width-subtyping rules still apply through the unification path.
A Simple reference resolves names against the primitive set first, then walks the environment in this order: unions, structs, type aliases, variant-to-union. A name that finds nothing falls through to AnyType. This is the second silent-typo path and the same fix applies.
The typing judgement
Per-construct rules are in MEP 5. This MEP pins the notation. Γ is the environment of the previous section; e is a parser expression; τ, σ range over types. A handful of representative rules:
Γ(x) = τ
──────────────────── (var)
Γ ⊢ x : τ
Γ ⊢ e₁ : τ₁ Γ ⊢ e₂ : τ₂ op(τ₁, τ₂) = τ
──────────────────────────────────────────── (binop)
Γ ⊢ e₁ op e₂ : τ
Γ ⊢ e : FuncType{Params: [σ₁, ..., σₙ], Return: τ}
Γ ⊢ aᵢ : σᵢ for each i
──────────────────────────────────────────────── (app)
Γ ⊢ e(a₁, ..., aₙ) : τ
Γ, x_i : σᵢ ⊢ body : τ
──────────────────────────────────────────────── (lam)
Γ ⊢ fun(x₁: σ₁, ..., xₙ: σₙ) → body :
FuncType{Params: [σ₁, ..., σₙ], Return: τ}
op in (binop) is the table from the operator-typing section. Γ, x : σ extends Γ with a fresh binding in a new scope. The rules are deterministic in the sense that exactly one rule fires per syntactic form, so a given Γ ⊢ e has at most one type.
Soundness, qualified
The two classical soundness properties hold for the subset of programs that never touch any.
Progress. A closed, well-typed term that is not a value can take a reduction step. Under any, this fails: an any-typed value passed to a position expecting a struct field crashes with a tag mismatch.
Preservation. Reduction preserves the static type. Under any, this is vacuous because the static type is any and the dynamic type can be anything.
Two static checks that the system does not perform: pattern exhaustiveness in match is not enforced (a missing arm fails at runtime, see MEP 13) and the Pure flag is computed but not asserted (a function flagged pure can call an impure one without complaint). Both are tracked.
Problems
The current design has several known problems. They are listed below without rigid categories because the categories overlap (a coherence problem is usually also a soundness problem) and the reader is better served by a flat list with the fix attached.
1. unify and equalTypes disagree on int/int64. equalTypes(int, int64) is true, by a carve-out in types/infer.go around line 866. unify(int, int64) is false; the IntType case in unify accepts only IntType and BigIntType. A consumer that switches between the two relations gets different answers for the same pair. The fix is to remove the carve-out from equalTypes and require an explicit cast where today's code relies on the conflation. The break is small but real and would require sweeping the fixtures in tests/types/valid. A follow-up commit on this PR after MEP 5 is reviewed.
2. unify mixes substitution and subtyping. Most cases of unify are structural equality with any and *TypeVar as wildcards, but the numeric arms are one-directional subtyping written in the same machine. A caller asking "are these the same?" gets "yes" for int and bigint. The fix is the predicate split in MEP 11: unify carries substitutions only and subtype(s, t) carries the numeric ladder.
3. any is a wildcard in unify. This is the same problem from a different angle. Once a value is typed any, every subsequent unification with that value succeeds and propagates any further. MEP 11 replaces the wildcard with a deliberate cast at the boundary.
4. null literal types as any. inferPrimaryType returns AnyType for p.Lit.Null in types/infer.go around line 312, so let x: User = null succeeds. Tying null to OptionType is the MEP 10 A2 fix; the inference should return OptionType{Elem: fresh TypeVar} and the hint should resolve the element type.
5. Missing map key returns the zero of V. m[k] for an absent k returns V's zero rather than option[V]. For types without a sensible zero (a struct variant of a union, for instance) this is a progress violation. The current behaviour is pinned by tests/types/valid/missing_map_key.mochi so a fix is a deliberate break, #21186.
6. List + silently widens to [any]. [1] + ["x"] types as [any] without a diagnostic. The fix is to replace the widening row in the operator table with a type-mismatch error, #21187.
7. Pure is advisory. A FuncType with Pure: true may call impure functions. The flag is propagated but never checked. #21188.
8. Unknown simple type names resolve to any. resolveTypeRef falls through to AnyType when a simple name finds neither a primitive, a struct, a union, an alias, nor a variant. A typo (Usre for User) downgrades the binding to any silently. The fix is a new T0xx "unknown type name" diagnostic; the change is local to resolveTypeRef.
9. Comparisons across kinds. 1 == "x" types as bool. The fix is to require equalTypes(L, R) or both sides numeric in the comparison row of inferBinaryType.
10. StructType.Order and StructType.Fields are redundant. The same fields are stored as a map (for lookup) and a slice (for iteration). They can drift. Fixed in #21256: Fields is now a single []StructField ordered slice and lookups go through FieldType/FieldMap/FieldNames helpers on StructType.
11. UnionType.Variants has no stable iteration order. The text claims declaration order is preserved, but the field is a Go map. Downstream consumers that need stable order (the formatter, the C# emitter's switch table) reconstruct it from the parser tree. The fix is to add an Order []string parallel to StructType.Order.
12. StructType.Methods ties types to parser. Method.Decl is a *parser.FunStmt, so the type system module depends on the parser AST. The fix is to keep only the method signature and an opaque body handle, not the AST node. #21249.
13. Variadic was a bool, not a varargs element type. Fixed. The original FuncType.Variadic was a bool, and the convention "the last parameter is the variadic element" was enforced by the surface grammar rather than by the type. Variadic is now a Type field that carries the variadic element type directly (nil when the function is not variadic) and Params is the strict fixed prefix. #21253.
14. *TypeVar is the only pointer kind. All other kinds are value types with no methods on a pointer receiver. The pointer is there because identity matters: two *TypeVar values with the same name are different variables. Every switch x.(type) has to remember the case is *TypeVar, not TypeVar. The pragmatic fix is to document the rule (polymorphism kinds use pointer receivers; if more arrive, they follow the same shape) once MEP 12 lands and the question becomes relevant.
15. reflect.DeepEqual is the equality fallback. The fallback is used for kinds that have no special case, mostly FuncType and StructType. It walks pointer trees indiscriminately, which makes it sensitive to nested *parser.FunStmt references inside Method.Decl. The fix is an exhaustive switch over the kinds.
16. Env mixes static types with runtime values. The values map on types.Env is a runtime concern that has no business being on a static-analysis struct. The fix is to split types.Env (static) and runtime.Env (values), with the latter embedding the former. The sweep touches the interpreter and the LLM provider boundaries. #21250.
17. Env.Copy flattens the parent chain. Closures capture by copy, but the flattening loses scope-shadowing information that downstream tools need. Worth replacing with a snapshot that keeps the chain.
18. SetFunc lowercases names. types/env.go:262-269 writes both name and strings.ToLower(name) to the function map. The lowercased entry is for case-insensitive FFI lookups; the cost is that SetFunc("Print", f) makes both Print and print resolvable. The lowercase should live at the FFI boundary, not in the env.
19. unit is not a surface type. The kind exists internally and the inference uses it. But the lexer does not accept unit as a type-name keyword, so a programmer cannot write let f: fun() -> unit = .... The fix is one line in the lexer keyword list and one case in resolveTypeRef.
20. No nominal alias kind. type Id = int collapses immediately. Error messages cannot say "expected Id, got int"; recursive aliases cannot be expressed. The fix is AliasType{Name, Target} that prints as Name and unifies as Target. Scope is larger than the other items here and may grow into its own MEP. #21251.
21. No tuple kind. Heterogeneous fixed-length sequences either promote to a generated struct or fall back to [any]. TupleType{Elems []Type} is the obvious addition; it adds unify and equality cases for one more kind. Status: opt-in, deferred. Anonymous struct literals plus parametric polymorphism (MEP-12) cover the observed use cases, and no fixture or user request currently demands tuple syntax. Will be revisited if a concrete need surfaces.
22. No row variable, no first-class type. Both are intentional absences today. Row polymorphism is tracked as a future enhancement to MEP 11; first-class types are not currently planned.
The list is long, but most entries are small. The five worth doing in the near term, in priority order, are 8 (unknown type names), 9 (comparison tightening), 11 (UnionType order), 19 (surface unit), and 10 or 13 (the StructType / Variadic representation fixes; whichever finds a volunteer first). The rest are gated on the type-system roadmap.
Pinned decisions
Two decisions are pinned by code already in the tree. They are recorded here so a future revision does not undo them by accident.
Missing map key. The runtime returns the zero of V today. The fix (option[V]) is a deliberate break tracked in #21186; the current behaviour is fixtured in tests/types/valid/missing_map_key.mochi. Anyone changing the inference rule must update the fixture and announce the break in the corresponding MEP.
unit, not void. The kind was renamed in #21221 to remove the "nil means infer, not void" ambiguity in MEP 3 invariant 15. Transpilers that emit void as a target-language keyword (Go, Java, C, C++, C#, Dart) are unaffected; only Mochi's internal kind name moved.
Reference implementation
| Concept | Location |
|---|---|
Type interface | types/check.go, around line 12 |
| Kind definitions | types/check.go, IntType through FuncType |
unify | types/check.go, around lines 144 to 381 |
equalTypes | types/infer.go, around lines 828 to 876 |
inferBinaryType | types/infer.go, around line 72 |
resolveTypeRef | types/check.go, around line 1346 |
| Environment | types/env.go |
| Purity analysis | types/pure.go |
| Tests | types/check_test.go, types/mep_obligations_test.go, the corpora in tests/types/{valid,errors} |
Line numbers are written as "around N" because they will drift; the function names are stable.
Open questions
A few questions do not yet have answers worth pinning.
The right disposition for float mixed with bigrat is unclear. Today the result is bigrat because precision matters more than performance. An argument can be made the other way: most arithmetic is float and the implicit widening to bigrat is a surprise.
The right surface for generics on builtins is open. len, first, and append are declared with any parameters because MEP 12 is not yet wired through. When it does land, the question is whether the surface syntax requires the type parameters (len[int](xs)) or relies on inference. The standard answer is inference-only with explicit parameters as a fallback, but the parser cost is not zero.
The right answer for resolveTypeRef's Generic fallthrough is probably "error" rather than "any". The same applies to the Simple fallthrough. Both fixes are local; what is not local is the set of test fixtures that today rely on the silent downgrade.
References
Programming language theory
- Pierce, Benjamin C. Types and Programming Languages. MIT Press, 2002. Chapters 8 and 11 for the simply typed core; chapter 15 for the subtyping relation
unifypartially encodes. - Harper, Robert. Practical Foundations for Programming Languages. CUP, 2016. Chapter 4 for the judgement notation used above.
- Cardelli, Luca, and Wegner, Peter. "On Understanding Types, Data Abstraction, and Polymorphism." Computing Surveys 17(4), 1985. The taxonomy behind the
AnyType/TypeVardistinction. - Damas, Luis, and Milner, Robin. "Principal type-schemes for functional programs." POPL, 1982. The Hindley-Milner algorithm
unifyis a small subset of. - Wadler, Philip. "The Expression Problem." 1998. The trade behind the tagged-union kind catalogue.
- Reynolds, John C. "Towards a Theory of Type Structure." Programming Symposium, 1974. For the underlying notion of "kind".
Cross-MEP
- MEP 2 (Grammar): the surface syntax that resolves into the kind catalogue.
- MEP 3 (AST): the tree this MEP types. Invariant 15 (FunExpr default return) is load-bearing here.
- MEP 5 (Inference): the per-construct rules.
- MEP 10 (Known gaps): the wider gap catalogue.
- MEP 11 (Subtyping): the
subtypepredicate that replacesany's wildcard role. - MEP 12 (Polymorphism): the
TypeVarpropagation plan. - MEP 13 (Pattern matching): the exhaustiveness work.
Backwards compatibility
Informational. No behaviour change in this revision. Any future change to the rules above must be announced in its own MEP and cross-linked here.
Copyright
This document is placed in the public domain.