Changelog¶
All notable changes to musil are documented here. Versioning is SemVer.
[Unreleased]¶
Changed — TCAS II example revised to protocol-accurate coordination model¶
-
examples/tcas.py— four scenarios (was three). VRC constraint broadcast now modeled accurately:Aircraftgains avrc_receivedfield carrying the Mode S coordination constraint ("no-climb"/"no-descend"). The intruder's RA is forced by the constraint it received from own, not computed symmetrically. This matches the actual UF=16/DF=16 Mode S coordination protocol (BDS register 0x30, bits MB:23–26; RTCA DO-185B / ICAO Annex 10 Vol. IV). -
New scenario 4 — v7.1 reversal + Tenerife.
reversal_modeladds theown:ra=reversedsystem action (DO-185B v7.1 mechanism: own reverses when intruder has not maneuvered within the timing window; one reversal per encounter).tenerife_envinjectsintruder:maneuvers-with-old-ra(intruder executes its original RA while re-coordination is incomplete).check_openfinds the counterexample in 7 steps and surfacesINTRUDER_FOLLOWS_REVERSALas a residual obligation. Models the 2011 Tenerife near-collision (CIAIAC A-032/2011; Thomas Cook / Finnair). -
docs/paper.md— § 5.4 added: TCAS II as a fourth evaluation case study, including connection to Lynch's Hybrid I/O Automata model (Livadas, Lygeros & Lynch 2000) and the ACAS X formal verification line (Jeannin, Platzer et al. 2015/2017). References section updated with seven TCAS/ACAS formal verification papers.
[0.5.0] - 2026-06-24¶
Added — academic paper draft¶
-
docs/paper.md— paper draft targeting OOPSLA tool track / ISSTA / ECOOP tool track. Stakes the claim that musil is the first Python library to unify explicit-state model checking, deterministic simulation testing, and open-system contract verification. Formal grounding: Interface Automata (de Alfaro & Henzinger, ESEC/FSE 2001), contract-based design (Benveniste et al., INRIA RR-8147, 2012), I/O Automata (Lynch & Tuttle, 1987), seL4 assumption layering (Klein et al., SOSP 2009). Sections: Abstract, Introduction, Background and Related Work, Design, Implementation, Evaluation (three case studies), Limitations, Conclusion, References. -
mkdocs.yml— added paper to site nav as "Paper draft".
[0.4.0] - 2026-06-24¶
Added — pre-built adversarial environment library¶
-
musil.environments— new sub-package with five production-grade adversarial environments, each shipping namedAssumptionconstants (with real source citations), anEnvironmentSpecfactory forcheck_open, and anAdversarialNodefactory for simulation: -
k8s— pod eviction, OOM kill. Assumptions:k8s:node-capacity,k8s:scheduler-liveness,k8s:eviction-rate-bounded(all unverified; cites Kubernetes docs). Factories:pod_eviction,eviction_notifier,oom_notifier. -
aws— API throttling, request timeouts. Assumptions:aws:s3-eventual-delivery,aws:region-availability,aws:throttle-transient(unverified);aws:iam-correctness(verified; cites AWS IAM docs). Factories:aws_api_faults,throttling_service,timeout_service. -
linux— OOM kill, SIGTERM, ENOSPC filesystem errors. Assumptions:linux:oom-killer-bounded,linux:signal-delivery,linux:enospc-recoverable(unverified);linux:fs-rename-atomic(verified; cites rename(2) POSIX guarantee). Factories:process_faults,fs_faults,oom_notifier,signal_sender. -
sel4— capability revocation, IPC timeout. Assumptions:sel4:hardware-memory-safety(axiom; Klein et al., SOSP 2009 TCB),sel4:capability-integrity(verified; Klein et al., SOSP 2009 § 4 formal proof),sel4:ipc-liveness(unverified; not covered by the functional correctness proof). Factories:capability_revocation,cap_revocation_notifier. -
etcd— Raft leader change, write conflict, compaction. Assumptions:etcd:quorum-maintained,etcd:bounded-leader-election,etcd:compaction-bounded(unverified);etcd:unique-leader(verified; Ongaro & Ousterhout, USENIX ATC 2014 § 5 Raft safety). Factories:cluster_faults,etcd_adversary. -
examples/open_system_composed.py— end-to-end multi-environment example: a service composed against K8s + AWS + etcd simultaneously, showing a BFS-discovered counter-example (unbounded evictions exhaust retry budget) and the full residual proof obligation list. -
docs/environments.md— reference guide for the environments library: all assumptions, factory signatures, composition patterns, simulation usage, and custom assumption injection.
[0.3.0] - 2026-06-24¶
Added — open-system verification¶
-
EnvironmentSpec[S]— a named contract for an adversarial external component (K8s, AWS, seL4, Linux). Each spec carries:behaviors(the non-deterministicAction[S]objects the environment can take),guarantees(invariants the environment commits to), andassumptions(named proof obligations, following seL4's two-layer assurance model: Klein et al., SOSP 2009). -
Assumption— a named, machine-readable proof obligation withstatus ∈ {"axiom", "verified", "unverified"}, an optionalsourcecitation, and an optionalpredicatefor machine-checking. Axioms and unverified assumptions surface as residual obligations inOpenResult, making hidden hypotheses first-class. -
check_open(system, *envs)— verifies a system against one or more adversarial environments. Composes by merging environment behaviors into the system's action set (the BFS explores every adversarial move) and environment guarantees into the invariants (qualified as"<env.name>:<k>"). Returns anOpenResultwrapping the underlyingResultplus all unverified assumptions. With no envs,check_open(m) == check(m)exactly. Theoretical grounding: Interface Automata (de Alfaro & Henzinger, ESEC/FSE 2001) and contract-based design (Benveniste et al., INRIA RR-8147, 2012). -
OpenResult[S]— the result ofcheck_open:.ok/bool()delegates to the underlyingResult, and.unverified_assumptionssurfaces residual proof obligations. -
AdversarialNode— a simulationBaseNodethat injects Byzantine behavior via a list of(guard, response)pairs. The first matching guard fires; no match silently drops the message. Models external services that return wrong answers, violate protocols, or refuse to respond. Grounded in Byzantine fault injection (Castro & Liskov, OSDI 1999). -
NetworkModel.mutate— aCallable[[src, dst, payload], object | None]hook onNetworkModel. Applied after loss/duplicate decisions, before delivery.Nonereturn drops the message. Models wire-level Byzantine corruption: a component that sends wrong data, not just absent data. -
Worked examples —
examples/k8s_scheduler.py: a distributed service verified against K8s pod eviction (fragile service fails, resilient service with restart logic passes, residual proof obligations printed).examples/byzantine_service.py: anAdversarialNodethat returns wrong answers (client without validation fails, with validation passes) andNetworkModel(mutate=...)for wire-level payload corruption. -
docs/open-systems.md— a conceptual guide: closed vs. open systems, when to usecheck_openvs.check, and how to write anEnvironmentSpec. -
docs/api.md— new "Open systems" section documentingcheck_open,EnvironmentSpec, andAssumption.
Added — verify the implementation, not just the design¶
- Refinement (
check_refinement,RefinementMonitor) — check that an observed run of the real system refines a model: every transition is a model edge or a stutter (the runtime analog of seL4/TLA+ refinement). The oracle that holds real code to a model-checked spec. - Deterministic simulation (
simulate,Simulator,NetworkModel,BaseNode,Context) — run real, event-driven node code under a virtual clock and a fault-injecting network (loss / duplication / latency-driven reordering), reproducible from a seed.simulateruns many seeds with invariants + refinement + a convergence goal as the oracle and returns the first failing seed as a reproducibleSimFailure. This is the FoundationDB/TigerBeetle "deterministic simulation testing" technique as a pure-Python library -- bug finding, not proof. - Worked example (
examples/route_delivery.py) and a guide (Verifying a distributed system): model-check a protocol's design, then put the real code through simulation; the fire-and-forget bug is caught with a seed, the per-tick re-push converges and refines the model.
Added — the textbook (a canon of worked examples)¶
docs/textbook.md— a roadmap mapping the classic concurrency / distributed-systems problems (each bounded, ordering- or failure-driven — musil's diet) to the bug each teaches and the feature it exercises, so the example suite doubles as documentation and as musil's own coverage matrix.- Producer–consumer / bounded buffer (
examples/producer_consumer.py) — the check-then-act race: two non-atomic producers both pass the "there is room" check and overflow a one-slot buffer; the atomic test-and-insert stays within capacity, never deadlocks, and converges. - Shared-memory concurrency examples completing that textbook section: dining philosophers
(
examples/dining_philosophers.py— the circular-wait deadlock and the resource-ordering fix), readers–writers (examples/readers_writers.py— mutual exclusion holds, yet reader-preference starves the writer: a liveness/fairness lesson), and bank transfer (examples/bank_transfer.py— the lock-ordering deadlock plus a conservation invariant). examples/is now linted and type-checked (ruff + pyright strict) in CI andmake check, so the worked examples can't rot.
[0.1.0] - 2026-06-21¶
Initial release.
Added¶
- Core checker (
check) — breadth-first sweep of every reachable state for the first invariant violation or deadlock, with the shortest counterexample trace. States are immutable, hashable values (frozen dataclasses); actions are pure guarded transitions. - Concurrency by interleaving — modelling several actors is just handing
checkall their actions; every interleaving is explored. - Liveness (
check_liveness) — proves<>P("eventually P") and, witheverywhere=True,[]<>P("always eventually P" / convergence). Tarjan SCC + lasso counterexamples. Weak fairness viafair=[...]and strong fairness viafair_strong=[...]. - Composition (
compose) — the interleaved product of independent component models, lifting each component's invariants (name-qualified) and supporting joint invariants over the composite. - Channel kit (
channel_actions,send) — model a message channel as a building block: reliable / lossy / duplicating, unordered so reordering is explored for free. - Zero-drift FSM bridge (
transition_actions,status_field_actions,terminal_states,declared_states) — build a model straight from an allowed-transitions table so it can't diverge from the code. - Conformance (
generate_traces,replay) — generate traces from the model and replay them against the real implementation to check it refines the model. - Graph export (
to_dot) — Graphviz DOT of the reachable state graph, with counterexample highlighting; plusexplorefor the raw reachable graph. - CLI (
python -m musil <model.py>) — check a model module from the command line. - PEP 695 generics,
py.typed, zero runtime dependencies, Python 3.12+.
Known limitations¶
- Explicit-state: for bounded models (use the
max_statescap; results flagtruncated). - No partial-order reduction yet; heavy interleaving can be expensive.
- Liveness is fairness-based, not full LTL (
<>P/[]<>Punder weak/strong fairness).