MEP 8. Test Strategy
| Field | Value |
|---|---|
| MEP | 8 |
| Title | Test Strategy |
| Author | Mochi core |
| Status | Active |
| Type | Process |
| Created | 2026-05-08 |
Abstract
Mochi tests the language in five layers, from cheap parser snapshots up through whole-program execution and property tests. This MEP documents each layer, the golden harness shared across them, and the mapping from soundness properties to TAPL chapters that drives fixture authoring.
Motivation
Without a shared layering, contributors write tests at the wrong granularity. A grammar bug should fail at layer 1, not at layer 4. A new type rule should ship with both a valid and an error fixture, not with a single end-to-end check. Naming the layers and what they cover removes a class of "where does this test go" questions.
Specification
Layers
Five layers of testing apply to soundness work, ordered from cheap to expensive.
1. Parser golden tests. tests/parser/valid/ and tests/parser/errors/. Snapshot the parsed AST as JSON. Runner: parser/parser_test.go.
2. Type checker golden tests. tests/types/valid/ and tests/types/errors/. Snapshot accepted vs rejected programs and their error output. Runner: types/check_test.go.
3. Interpreter execution tests. tests/interpreter/valid/ and tests/interpreter/errors/. Snapshot stdout and runtime errors for end-to-end programs. Runner: interpreter/interpreter_test.go.
4. Bytecode VM execution tests. tests/vm/valid/. Snapshot stdout for programs compiled and run through the VM. Runner: runtime/vm/valid_golden_test.go (tagged //go:build slow). Also produces .ir.out IR dump goldens.
5. Property tests. tests/types/property/ (planned). Generate programs and assert invariants automatically.
By default, CI runs layers 1–3 with a plain go test ./... invocation. Layer 4 is tagged //go:build slow and is excluded from that run; to run it locally, use go test -tags slow ./runtime/vm/... instead. Layer 5 is not yet implemented.
Golden harness
The shared harness lives in golden/golden.go. Each test pair is name.mochi plus a golden file. The harness reads the source, runs the function under test, normalises paths and timing strings, then compares the result to the golden.
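The normalise-then-compare step can be sketched as follows. This is a minimal illustration, not the code in golden/golden.go: the names normalise, matchesGolden and durationRE, the `<root>` and `<duration>` placeholders, and the set of timing formats being masked are all assumptions made for the example.

```go
package main

import (
	"regexp"
	"strings"
)

// durationRE matches Go-style duration strings such as "12ms" or "1.5s".
// Which timing formats the real harness rewrites is an assumption.
var durationRE = regexp.MustCompile(`\b\d+(\.\d+)?(ns|µs|ms|s)\b`)

// normalise replaces the repository root with a placeholder and masks
// timing strings, so goldens stay stable across machines and runs.
func normalise(out, root string) string {
	if root != "" {
		out = strings.ReplaceAll(out, root, "<root>")
	}
	out = durationRE.ReplaceAllString(out, "<duration>")
	return strings.TrimSpace(out)
}

// matchesGolden reports whether the normalised output equals the golden
// text, ignoring leading and trailing whitespace in the golden.
func matchesGolden(got, want, root string) bool {
	return normalise(got, root) == strings.TrimSpace(want)
}
```

Normalising before comparing keeps machine-specific details such as checkout paths and elapsed times out of the goldens, so a mismatch points at a real change in output.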
Three entry points:
- Run: runs all fixtures, fails immediately on each mismatch, sorts alphabetically. Used by parser and type checker tests.
- RunWithSummary: runs all fixtures, collects pass/fail counts, logs a summary at the end. Stops the suite on the first process error but continues past golden mismatches. Used by transpiler golden suites.
- RunFirstFailure: runs fixtures in order, stops and reports the first failing program by name. Used when a short first-failure signal is more useful than a full list.
To refresh goldens after a deliberate change, run make update-golden STAGE=types (or the equivalent stage name). This invokes go test ./types -update --vet=off.
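How the harness consumes the -update flag is not shown in this MEP. A common Go pattern, sketched here under that caveat, is a package-level boolean flag that switches the comparison into a rewrite. The helper name checkGolden, the flag description, and the whitespace-trimming comparison are assumptions for illustration only.

```go
package main

import (
	"bytes"
	"flag"
	"fmt"
	"os"
)

// update mirrors the -update flag passed by make update-golden.
// A caller would pass *update as the last argument of checkGolden.
var update = flag.Bool("update", false, "rewrite golden files instead of comparing")

// checkGolden compares got with the golden file at path. When doUpdate is
// true it overwrites the golden with got instead of comparing.
func checkGolden(path string, got []byte, doUpdate bool) error {
	if doUpdate {
		return os.WriteFile(path, got, 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read golden: %w", err)
	}
	if !bytes.Equal(bytes.TrimSpace(want), bytes.TrimSpace(got)) {
		return fmt.Errorf("golden mismatch for %s\n--- want\n%s\n--- got\n%s", path, want, got)
	}
	return nil
}
```

Because an update run rewrites the golden without comparing anything, the resulting diff should always be reviewed before it is committed.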
Golden file extensions
| Extension | Meaning |
|---|---|
| .golden | Parser or type checker valid test - snapshots the success output or parsed AST |
| .err | Parser, type checker, or interpreter error test - snapshots the diagnostic |
| .out | Interpreter or VM execution test - snapshots stdout |
| .ir.out | VM IR dump test - snapshots the compiled bytecode representation |
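Tooling that classifies golden files by name needs to check .ir.out before .out, because Go's filepath.Ext returns only the final .out for a name such as loop.ir.out. A minimal sketch of a suffix lookup, with the hypothetical names goldenKinds and goldenKind and shortened meanings taken from the table above:

```go
package main

import "strings"

// goldenKinds lists suffixes with the longest first, so that ".ir.out"
// is matched before the shorter ".out".
var goldenKinds = []struct{ suffix, meaning string }{
	{".ir.out", "VM IR dump"},
	{".golden", "parser or type checker success output"},
	{".err", "diagnostic"},
	{".out", "stdout"},
}

// goldenKind returns the meaning of a golden file name, or "" when the
// name carries none of the known suffixes.
func goldenKind(name string) string {
	for _, k := range goldenKinds {
		if strings.HasSuffix(name, k.suffix) {
			return k.meaning
		}
	}
	return ""
}
```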
TAPL chapter mapping
Every property in MEP 7 corresponds to a chapter or chapter group in Pierce's Types and Programming Languages. Fixture authors should consult the relevant chapter for the canonical test cases and edge cases.
| Property | TAPL ch. | MEP | Fixture group |
|---|---|---|---|
| Typed arithmetic | 8 | 5 | numeric |
| Simply typed lambda | 9 | 5 | function |
| Pairs, tuples, sums, lists | 11 | 5, 13 | adt, list, map |
| References (mutability) | 13 | 15 | mutability |
| Subtyping | 15 | 11 | subtyping |
| Imperative objects | 18 | not used | |
| Recursive types | 20 | 13 | recursive |
| Universal types | 23 | 12 | generic |
| Existential types | 24 | future | |
| Type reconstruction | 22 | 12 | inference |
| Bounded quantification | 26 | future | |
| Higher kinds | 30 | not used | |
Fixture taxonomy
Fixtures live under tests/types/ and tests/parser/ with one of three suffixes:
- .mochi plus .golden. A program that should type check or parse successfully. The golden snapshots the success message or the parsed AST.
- .mochi plus .err. A program that should be rejected at parse or check time. The .err snapshots the diagnostic.
- .mochi plus .out (interpreter or VM tests). The .out snapshots stdout.
Naming convention. Files are named feature_aspect.mochi. The feature is the language construct. The aspect is the property. Examples:
- let_basic.mochi for the simple let form.
- let_immutable_assign.mochi for an error fixture rejecting mutation of a let binding.
- match_exhaustive_union.mochi (when implemented).
- subtype_record_width.mochi (when implemented).
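The naming convention could be checked mechanically. The sketch below assumes, going slightly beyond what the convention states, that names are lowercase snake_case, that the feature is the first underscore-separated token, and that everything after it is the aspect. The names fixtureNameRE and splitFixtureName are hypothetical.

```go
package main

import (
	"regexp"
	"strings"
)

// fixtureNameRE accepts lowercase snake_case names with at least one
// underscore, ending in ".mochi". The exact character set is an assumption.
var fixtureNameRE = regexp.MustCompile(`^[a-z][a-z0-9]*(_[a-z0-9]+)+\.mochi$`)

// splitFixtureName splits "feature_aspect.mochi" into its two parts.
// ok is false when the name does not follow the convention.
func splitFixtureName(file string) (feature, aspect string, ok bool) {
	if !fixtureNameRE.MatchString(file) {
		return "", "", false
	}
	base := strings.TrimSuffix(file, ".mochi")
	feature, aspect, _ = strings.Cut(base, "_")
	return feature, aspect, true
}
```

Under these assumptions, let_immutable_assign.mochi splits into the feature let and the aspect immutable_assign.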
Property tests
A property test generates programs from a small grammar and asserts that a property holds for each of them. The seed grammar will live at tests/types/property/gen.go and will produce well-typed expressions over a fixed type set. Properties:
- Inferred type stability under alpha rename.
- Check is order-independent on top-level statements that do not reference each other.
- Adding redundant parens never changes the inferred type.
- Replacing a sub-expression with its computed value never changes the outer program's type (preservation approximation).
Property tests run with a fixed seed in CI for reproducibility, plus a randomised nightly run.
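As an illustration of the redundant-parens property with a fixed seed, here is a self-contained sketch over a toy expression language. Nothing in it is Mochi code: the AST types, infer, gen and checkParenProperty are stand-ins invented for the example, and the planned gen.go may look quite different.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Expr is a toy expression AST standing in for Mochi's real AST.
type Expr interface{}

type (
	IntLit  struct{ V int }
	BoolLit struct{ V bool }
	Add     struct{ L, R Expr } // int + int -> int
	Less    struct{ L, R Expr } // int < int -> bool
	Paren   struct{ E Expr }    // redundant parentheses
)

// bothInt checks that both operands infer to int.
func bothInt(l, r Expr) error {
	for _, e := range []Expr{l, r} {
		typ, err := infer(e)
		if err != nil {
			return err
		}
		if typ != "int" {
			return fmt.Errorf("expected int operand, got %s", typ)
		}
	}
	return nil
}

// infer returns "int" or "bool" for well-typed expressions.
func infer(e Expr) (string, error) {
	switch e := e.(type) {
	case IntLit:
		return "int", nil
	case BoolLit:
		return "bool", nil
	case Paren:
		return infer(e.E)
	case Add:
		if err := bothInt(e.L, e.R); err != nil {
			return "", err
		}
		return "int", nil
	case Less:
		if err := bothInt(e.L, e.R); err != nil {
			return "", err
		}
		return "bool", nil
	}
	return "", fmt.Errorf("unknown expression %T", e)
}

// genInt builds a well-typed int expression of bounded depth.
func genInt(r *rand.Rand, depth int) Expr {
	if depth == 0 || r.Intn(2) == 0 {
		return IntLit{r.Intn(100)}
	}
	return Add{genInt(r, depth-1), genInt(r, depth-1)}
}

// gen builds a well-typed expression of type int or bool.
func gen(r *rand.Rand, depth int) Expr {
	switch r.Intn(3) {
	case 0:
		return BoolLit{r.Intn(2) == 0}
	case 1:
		return Less{genInt(r, depth), genInt(r, depth)}
	default:
		return genInt(r, depth)
	}
}

// checkParenProperty generates n expressions from seed and returns an error
// for the first one whose type changes when wrapped in parentheses.
func checkParenProperty(seed int64, n int) error {
	r := rand.New(rand.NewSource(seed))
	for i := 0; i < n; i++ {
		e := gen(r, 4)
		before, err := infer(e)
		if err != nil {
			return fmt.Errorf("generator produced ill-typed %#v: %w", e, err)
		}
		after, err := infer(Paren{e})
		if err != nil || after != before {
			return fmt.Errorf("parens changed type of %#v: %s -> %s (%v)", e, before, after, err)
		}
	}
	return nil
}
```

A CI run would call checkParenProperty with a constant seed, while a nightly run could pick a fresh seed and log it so that any failure can be replayed.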
Required CI
The PR for the v0.11.0 soundness initiative ratchets CI from "ok if tests pass" to:
- All of layers 1, 2, 3 must pass.
- Layer 4 (VM golden tests) must pass when run with -tags slow.
- Layer 5 must run with the deterministic seed and pass once it lands.
How to run locally
    # All tests, default seed.
    make test

    # Just the type checker.
    make test STAGE=types

    # Refresh goldens after a deliberate change.
    make update-golden STAGE=types

    # Run the VM golden suite (slow-tagged).
    go test -tags slow ./runtime/vm/...

    # Run only the soundness property tests (once layer 5 lands).
    go test ./tests/types/property/...
Fixture contribution flow
When adding a new fixture pair:
- Decide the property it witnesses (MEP 7 list).
- Pick a name following the convention.
- Write the .mochi source.
- For valid fixtures, run make update-golden STAGE=types and inspect the golden.
- For error fixtures, run the test and inspect the .err against the diagnostic catalogue in MEP 6.
- Commit both files in a single change with a one-line note in the PR body listing the property the fixture covers.
Rationale
A layered approach lets a fast smoke test at layer 1 catch a parser regression before a slow conformance run at layer 4 wastes minutes. Goldens are cheap to read in code review and force authors to look at the output of their change. Property tests are the long tail: they catch the cases we did not think to enumerate.
The TAPL mapping is not academic dressing. It is a practical index into a known canon of test cases. When in doubt about whether a fixture is worth writing, look up the TAPL chapter and check whether the canonical cases are covered.
Backwards Compatibility
Process MEP. No language compatibility implications.
Reference Implementation
- golden/golden.go, golden harness (Run, RunWithSummary, RunFirstFailure).
- tests/parser/, tests/types/, tests/interpreter/, layers 1–3.
- tests/vm/valid/, runtime/vm/valid_golden_test.go, layer 4 (slow-tagged).
- tests/types/property/, planned property test layer.
Open Questions
- Property test framework. Whether to write our own generator or adopt Go's testing/quick. The latter is simpler but does not shrink failing inputs; the former lets us build in shrinking.
- Nightly random run. Where to publish results. We do not have a stable channel yet.
- Layer 4 in CI. The VM golden suite is slow-tagged and excluded from the default CI run. Either add a separate CI job with -tags slow or gate it on a nightly schedule.
References
- Benjamin C. Pierce, Types and Programming Languages, MIT Press, 2002.
Copyright
This document is placed in the public domain.