MEP 8. Test Strategy

Field    Value
MEP      8
Title    Test Strategy
Author   Mochi core
Status   Active
Type     Process
Created  2026-05-08

Abstract

Mochi tests the language in five layers, from cheap parser snapshots up through whole-program execution and property tests. This MEP documents each layer, the golden harness shared across them, and the mapping from soundness properties to TAPL chapters that drives fixture authoring.

Motivation

Without a shared layering, contributors write tests at the wrong granularity. A grammar bug should fail at layer 1, not at layer 4. A new type rule should ship with both a valid and an error fixture, not with a single end-to-end check. Naming the layers and what they cover removes a class of "where does this test go" questions.

Specification

Layers

Five layers of testing apply to soundness work, ordered from cheap to expensive.

  1. Parser golden tests. tests/parser/valid/ and tests/parser/errors/. Snapshot the parsed AST as JSON. Runner: parser/parser_test.go.
  2. Type checker golden tests. tests/types/valid/ and tests/types/errors/. Snapshot accepted vs rejected programs and their error output. Runner: types/check_test.go.
  3. Interpreter execution tests. tests/interpreter/valid/ and tests/interpreter/errors/. Snapshot stdout and runtime errors for end-to-end programs. Runner: interpreter/interpreter_test.go.
  4. Bytecode VM execution tests. tests/vm/valid/. Snapshot stdout for programs compiled and run through the VM. Runner: runtime/vm/valid_golden_test.go (tagged //go:build slow). Also produces .ir.out IR dump goldens.
  5. Property tests. tests/types/property/ (planned). Generate programs and assert invariants automatically.

CI runs layers 1–3 by default via go test ./.... Layer 4 is tagged //go:build slow and is not included in the default CI run; use go test -tags slow ./runtime/vm/... to run it locally. Layer 5 is not yet implemented.

Golden harness

The harness lives in golden/golden.go. Each fixture is a pair: name.mochi plus a golden file. The harness reads the source, runs the function under test, normalises paths and timing strings in the output, then compares the result to the golden.
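The run, normalise, compare flow can be sketched as follows. This is a minimal illustration, not the code in golden/golden.go: the names normalise, compareGolden, and timingRe, the <root> and <duration> placeholders, and the timing pattern are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// timingRe matches durations such as "12ms" or "3.5s".
// Hypothetical pattern; the real harness may normalise differently.
var timingRe = regexp.MustCompile(`\b\d+(\.\d+)?(ns|µs|ms|s)\b`)

// normalise replaces the machine-specific root directory and any timing
// strings with stable placeholders so goldens compare equal across machines.
func normalise(out, root string) string {
	if root != "" {
		out = strings.ReplaceAll(out, root, "<root>")
	}
	out = timingRe.ReplaceAllString(out, "<duration>")
	return strings.TrimSpace(out)
}

// compareGolden runs fn on the source, normalises its output, and returns
// an error if fn failed or the output differs from the golden text.
func compareGolden(src, golden, root string, fn func(string) (string, error)) error {
	got, err := fn(src)
	if err != nil {
		return fmt.Errorf("process error: %w", err)
	}
	got = normalise(got, root)
	want := strings.TrimSpace(golden)
	if got != want {
		return fmt.Errorf("golden mismatch:\n--- got ---\n%s\n--- want ---\n%s", got, want)
	}
	return nil
}

func main() {
	// fake stands in for the function under test (parser, checker, ...).
	fake := func(src string) (string, error) {
		return "checked /home/me/mochi/tests/types/valid/let_basic.mochi in 12ms", nil
	}
	golden := "checked <root>/tests/types/valid/let_basic.mochi in <duration>"
	err := compareGolden("let x = 1", golden, "/home/me/mochi", fake)
	fmt.Println(err) // prints <nil>: the normalised output matches the golden
}
```

Keeping the function under test as a parameter is what would let one harness serve the parser, type checker, and interpreter layers alike.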

Three entry points:

  • Run: runs all fixtures in alphabetical order and fails each one immediately on mismatch. Used by parser and type checker tests.
  • RunWithSummary: runs all fixtures, collects pass/fail counts, logs a summary at the end. Stops the suite on the first process error but continues past golden mismatches. Used by transpiler golden suites.
  • RunFirstFailure: runs fixtures in order, stops and reports the first failing program by name. Used when a short first-failure signal is more useful than a full list.
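The difference between the summary and first-failure policies can be illustrated with a small sketch. It does not reproduce the real signatures in golden/golden.go: runWithSummary, firstFailure, errMismatch, errCrash, the summary type, and the fixture name match_basic are hypothetical, and sorting the fixture names is an assumption made to keep the order deterministic.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errMismatch marks a golden mismatch; any other error counts as a
// process error. Both sentinel errors are hypothetical.
var (
	errMismatch = errors.New("golden mismatch")
	errCrash    = errors.New("process crashed")
)

// summary holds the counts a summary-style runner would log at the end.
type summary struct {
	Passed, Failed int
}

// sortedNames returns the fixture names in alphabetical order.
func sortedNames(fixtures map[string]func() error) []string {
	names := make([]string, 0, len(fixtures))
	for n := range fixtures {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// runWithSummary continues past golden mismatches, counting them, but
// stops the suite on the first process error.
func runWithSummary(fixtures map[string]func() error) (summary, error) {
	var s summary
	for _, n := range sortedNames(fixtures) {
		err := fixtures[n]()
		switch {
		case err == nil:
			s.Passed++
		case errors.Is(err, errMismatch):
			s.Failed++
		default:
			return s, fmt.Errorf("%s: %w", n, err)
		}
	}
	return s, nil
}

// firstFailure runs fixtures in order and returns the name of the first
// one that fails, or "" if all pass.
func firstFailure(fixtures map[string]func() error) string {
	for _, n := range sortedNames(fixtures) {
		if fixtures[n]() != nil {
			return n
		}
	}
	return ""
}

func main() {
	fixtures := map[string]func() error{
		"let_basic":            func() error { return nil },
		"let_immutable_assign": func() error { return errMismatch },
		"match_basic":          func() error { return nil },
	}
	s, err := runWithSummary(fixtures)
	fmt.Printf("passed=%d failed=%d err=%v\n", s.Passed, s.Failed, err)
	fmt.Println("first failure:", firstFailure(fixtures))
}
```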

To refresh goldens after a deliberate change, run make update-golden STAGE=types (or the equivalent stage name). For the types stage, this invokes go test ./types -update --vet=off.

Golden file extensions

Extension  Meaning
.golden    Parser or type checker valid test - snapshots the success output or parsed AST
.err       Parser, type checker, or interpreter error test - snapshots the diagnostic
.out       Interpreter or VM execution test - snapshots stdout
.ir.out    VM IR dump test - snapshots the compiled bytecode representation
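When tooling needs to classify a golden file by extension, the one subtlety is that .ir.out also ends in .out, so the longer suffix has to be checked first. A minimal sketch, where goldenKind and its return strings are hypothetical names chosen for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// goldenKind classifies a golden file by its extension, following the
// table above. It returns "" for files that are not goldens.
func goldenKind(name string) string {
	switch {
	case strings.HasSuffix(name, ".ir.out"): // must be tested before ".out"
		return "VM IR dump"
	case strings.HasSuffix(name, ".out"):
		return "stdout"
	case strings.HasSuffix(name, ".err"):
		return "diagnostic"
	case strings.HasSuffix(name, ".golden"):
		return "success output or AST"
	default:
		return ""
	}
}

func main() {
	for _, f := range []string{"fib.ir.out", "fib.out", "let_basic.golden", "let_immutable_assign.err"} {
		fmt.Printf("%s: %s\n", f, goldenKind(f))
	}
}
```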

TAPL chapter mapping

Every property in MEP 7 corresponds to a chapter or chapter group in Pierce's Types and Programming Languages. Fixture authors should consult the relevant chapter for the canonical test cases and edge cases.

Property                    TAPL ch.  MEP    Fixture group
Typed arithmetic            8         5      numeric
Simply typed lambda         9         5      function
Pairs, tuples, sums, lists  11        5, 13  adt, list, map
References (mutability)     13        15     mutability
Subtyping                   15        11     subtyping
Imperative objects          18        -      not used
Recursive types             20        13     recursive
Universal types             23        12     generic
Existential types           24        -      future
Type reconstruction         22        12     inference
Bounded quantification      26        -      future
Higher kinds                30        -      not used

Fixture taxonomy

Fixtures live under tests/parser/, tests/types/, tests/interpreter/, and tests/vm/. Each fixture is a .mochi source paired with one of three kinds of golden file:

  • .mochi plus .golden. A program that should type check or parse successfully. The golden snapshots the success message or the parsed AST.
  • .mochi plus .err. A program that should be rejected at parse or check time. The .err snapshots the diagnostic.
  • .mochi plus .out (interpreter or VM tests). The .out snapshots stdout.

Naming convention. Files are named feature_aspect.mochi. The feature is the language construct; the aspect is the property being tested. Examples:

  • let_basic.mochi for the simple let form.
  • let_immutable_assign.mochi for an error fixture rejecting mutation of a let binding.
  • match_exhaustive_union.mochi (when implemented).
  • subtype_record_width.mochi (when implemented).
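A naming check along these lines could be automated. The sketch below assumes the feature is everything before the first underscore and the aspect is the remainder, which fits the examples above but is not stated as a rule in this MEP; splitFixtureName is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFixtureName splits "feature_aspect.mochi" into its two parts.
// Assumption: the feature is the text before the first underscore.
// ok is false when the name does not follow the convention.
func splitFixtureName(file string) (feature, aspect string, ok bool) {
	if !strings.HasSuffix(file, ".mochi") {
		return "", "", false
	}
	base := strings.TrimSuffix(file, ".mochi")
	feature, aspect, found := strings.Cut(base, "_")
	if !found || feature == "" || aspect == "" {
		return "", "", false
	}
	return feature, aspect, true
}

func main() {
	for _, f := range []string{"let_basic.mochi", "let_immutable_assign.mochi", "scratch.mochi"} {
		feature, aspect, ok := splitFixtureName(f)
		fmt.Printf("%s: feature=%q aspect=%q ok=%v\n", f, feature, aspect, ok)
	}
}
```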

Property tests

A property test generates programs from a small grammar and asserts that a property holds for every generated program. The seed grammar will live at tests/types/property/gen.go and will produce well-typed expressions over a fixed type set. Properties:

  • The inferred type is stable under alpha renaming.
  • Checking is order-independent for top-level statements that do not reference each other.
  • Adding redundant parens never changes the inferred type.
  • Replacing a sub-expression with its computed value never changes the outer program's type (preservation approximation).

Property tests run with a fixed seed in CI for reproducibility, plus a randomised nightly run.
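To show the shape of such a test, here is a sketch of the redundant-parens property over a toy expression language. Nothing here is Mochi's real AST or checker: Expr, infer, gen, parenthesise, and countViolations are hypothetical, and the hand-written generator uses math/rand with a fixed seed rather than any particular framework. In this toy the property holds by construction; against a real checker the same loop would exercise the actual inference code.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Expr is a toy expression; the node types below are illustrative only.
type Expr interface{}

type (
	IntLit  struct{ V int }
	BoolLit struct{ V bool }
	Add     struct{ L, R Expr }
	Less    struct{ L, R Expr }
	Paren   struct{ E Expr }
)

// infer returns "int", "bool", or "error" for ill-typed expressions.
func infer(e Expr) string {
	switch e := e.(type) {
	case IntLit:
		return "int"
	case BoolLit:
		return "bool"
	case Add:
		if infer(e.L) == "int" && infer(e.R) == "int" {
			return "int"
		}
	case Less:
		if infer(e.L) == "int" && infer(e.R) == "int" {
			return "bool"
		}
	case Paren:
		return infer(e.E)
	}
	return "error"
}

// gen builds a well-typed expression of the requested type ("int" or "bool").
func gen(r *rand.Rand, typ string, depth int) Expr {
	leaf := depth == 0 || r.Intn(2) == 0
	if typ == "int" {
		if leaf {
			return IntLit{r.Intn(100)}
		}
		return Add{gen(r, "int", depth-1), gen(r, "int", depth-1)}
	}
	if leaf {
		return BoolLit{r.Intn(2) == 0}
	}
	return Less{gen(r, "int", depth-1), gen(r, "int", depth-1)}
}

// parenthesise wraps every sub-expression in a redundant Paren node.
func parenthesise(e Expr) Expr {
	switch e := e.(type) {
	case Add:
		return Paren{Add{parenthesise(e.L), parenthesise(e.R)}}
	case Less:
		return Paren{Less{parenthesise(e.L), parenthesise(e.R)}}
	default:
		return Paren{e}
	}
}

// countViolations generates n programs from a fixed seed and counts those
// that are ill-typed or whose type changes when parens are added.
func countViolations(seed int64, n int) int {
	r := rand.New(rand.NewSource(seed))
	bad := 0
	for i := 0; i < n; i++ {
		typ := "int"
		if i%2 == 1 {
			typ = "bool"
		}
		e := gen(r, typ, 4)
		if infer(e) != typ || infer(parenthesise(e)) != infer(e) {
			bad++
		}
	}
	return bad
}

func main() {
	fmt.Println("violations:", countViolations(1, 1000)) // prints "violations: 0"
}
```

Passing the seed as a parameter is what would allow a fixed value in CI and a random one in the nightly run.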

Required CI

The PR for the v0.11.0 soundness initiative ratchets CI from "ok if tests pass" to:

  • All of layers 1, 2, 3 must pass.
  • Layer 4 (VM golden tests) must pass when run with -tags slow.
  • Layer 5 must run with the deterministic seed and pass once it lands.

How to run locally

# All tests, default seed.
make test

# Just the type checker.
make test STAGE=types

# Refresh goldens after a deliberate change.
make update-golden STAGE=types

# Run the VM golden suite (slow-tagged).
go test -tags slow ./runtime/vm/...

# Run only the soundness property tests (planned; layer 5 is not yet implemented).
go test ./tests/types/property/...

Fixture contribution flow

When adding a new fixture pair:

  1. Decide the property it witnesses (MEP 7 list).
  2. Pick a name following the convention.
  3. Write the .mochi source.
  4. For valid fixtures, run make update-golden STAGE=types and inspect the golden.
  5. For error fixtures, run the test and inspect the .err against the diagnostic catalogue in MEP 6.
  6. Commit both files in a single change with a one-line note in the PR body listing the property the fixture covers.
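Step 6 can be guarded by a small pre-commit check that flags sources without a companion golden. The sketch below assumes the golden shares the source's base name with .mochi replaced by .golden, .err, or .out; this MEP does not spell out the exact file naming, and unpaired is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// unpaired returns the .mochi files in the list that have no companion
// golden file. Assumption: the golden is base+".golden", base+".err",
// or base+".out", where base is the source name without ".mochi".
func unpaired(files []string) []string {
	have := make(map[string]bool, len(files))
	for _, f := range files {
		have[f] = true
	}
	var missing []string
	for _, f := range files {
		if !strings.HasSuffix(f, ".mochi") {
			continue
		}
		base := strings.TrimSuffix(f, ".mochi")
		if !have[base+".golden"] && !have[base+".err"] && !have[base+".out"] {
			missing = append(missing, f)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	files := []string{
		"let_basic.mochi", "let_basic.golden",
		"let_immutable_assign.mochi", "let_immutable_assign.err",
		"match_exhaustive_union.mochi", // golden not yet written
	}
	fmt.Println(unpaired(files)) // prints [match_exhaustive_union.mochi]
}
```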

Rationale

A layered approach lets a fast smoke test at layer 1 catch a parser regression before a slow conformance run at layer 4 wastes minutes. Goldens are cheap to read in code review and force authors to look at the output of their change. Property tests are the long tail: they catch the cases we did not think to enumerate.

The TAPL mapping is not academic dressing. It is a practical index into a known canon of test cases. When in doubt about whether a fixture is worth writing, look up the TAPL chapter and check whether the canonical cases are covered.

Backwards Compatibility

Process MEP. No language compatibility implications.

Reference Implementation

  • golden/golden.go, golden harness (Run, RunWithSummary, RunFirstFailure).
  • tests/parser/, tests/types/, tests/interpreter/, layers 1–3.
  • tests/vm/valid/, runtime/vm/valid_golden_test.go, layer 4 (slow-tagged).
  • tests/types/property/, planned property test layer.

Open Questions

  • Property test framework. Whether to write our own generator or adopt Go's testing/quick. The latter is simpler but does not shrink failing inputs; the former lets us implement shrinking ourselves.
  • Nightly random run. Where to publish results. We do not have a stable channel yet.
  • Layer 4 in CI. The VM golden suite is slow-tagged and excluded from the default CI run. Either add a separate CI job with -tags slow or gate it on a nightly schedule.

References

  • Benjamin C. Pierce, Types and Programming Languages, MIT Press, 2002.

This document is placed in the public domain.