API reference¶
Everything below is re-exported from the top-level musil package, e.g. from musil import Model, check.
Core — check, Model, Action¶
The model definition and the safety/deadlock/reachability checker.
musil.core ¶
The explicit-state core: a model is (initial states, guarded actions, invariants), and check
does a breadth-first sweep of every reachable state, reporting the first invariant violation or
deadlock with the shortest trace that reaches it.
States are plain immutable, hashable Python values — a frozen dataclass is the idiomatic choice.
Actions are pure: enabled(state) decides whether the action can fire, apply(state) returns
the resulting NEW state. Because the sweep fires every enabled action from every state,
modelling several concurrent actors is just handing it all their actions — the interleaving is
explored for free. That's the whole point: the bugs that matter are races, and BFS over
interleavings finds them.
This module is dependency-free and knows nothing about any particular domain.
Action
dataclass
¶
A guarded, atomic transition. enabled decides if it can fire in a state; apply returns
the resulting new state (states are immutable, so it never mutates). Keep both pure — the checker
calls them many times and assumes no side effects.
Source code in src/musil/core.py
Step
dataclass
¶
Model
dataclass
¶
What to check: where to start, what can happen, and what must always hold.
init is one state or several. actions are the (possibly multi-actor) transitions.
invariants map a name → predicate that must hold in every reachable state. terminal
marks states allowed to have no enabled action (an absorbing sink like deleted); any other
dead end is reported as a deadlock.
Source code in src/musil/core.py
Result
dataclass
¶
The outcome. ok is True when the whole reachable space satisfied every invariant with no
deadlock. Otherwise kind is "invariant" or "deadlock", invariant names the broken
one, and trace is the shortest path from an initial state to the offending state.
Source code in src/musil/core.py
Graph
dataclass
¶
The full reachable state graph: initial states, every reachable state, and the labelled
transitions (edges[s] = the (action_name, next_state) pairs out of s). Built by
explore and consumed by liveness checking and graph export. truncated flags hitting the
max_states cap.
Source code in src/musil/core.py
format_trace ¶
Render a counterexample as Init: <state> then → <action> <state> per step.
Source code in src/musil/core.py
check ¶
Breadth-first sweep of every reachable state. Returns the first invariant violation or
deadlock found (shortest trace, because BFS visits by increasing depth), else OK once the whole
reachable space is exhausted. max_states is a runaway guard for unbounded models.
Source code in src/musil/core.py
explore ¶
Build the whole reachable state graph (no invariant checks). The shared exploration primitive
behind liveness and graph export; check stays separate so it can early-exit with the shortest
counterexample.
Source code in src/musil/core.py
Liveness — check_liveness¶
Eventually (<>P) and always-eventually ([]<>P) under weak and strong fairness.
musil.liveness ¶
Liveness: does the system eventually reach a goal, or can it loop forever short of it?
check_liveness(model, goal=P) proves the temporal property "eventually P" (<>P): on every run
from an initial state, P holds at some point. It fails in exactly two ways, and we report whichever
we find with a lasso counterexample (a stem into a recurring set, plus the loop):
- P unreachable — a reachable state from which P can never be reached (a dead end or a trap SCC with no exit to P). No scheduling fixes this.
- fair cycle — a reachable cycle of ¬P states that the system can follow forever. Whether
this counts depends on FAIRNESS: under weak fairness of an action set, an action that is
continuously enabled along the cycle must eventually be taken — so a cycle that starves a
continuously-enabled fair action is not a real counterexample (the fair scheduler would break
it). Pass
fair=[...]to rule those out; withfair=()you get strict (no-fairness) liveness.
Method: explore the reachable graph, restrict to ¬P states reachable through ¬P from a ¬P initial state, then (1) flag any from which P is unreachable, (2) Tarjan-SCC the rest and report the first weakly-fair non-trivial SCC. Dependency-free; for finite (bounded) models.
LivenessResult
dataclass
¶
Outcome of an "eventually P" check. ok True means every run reaches P. Otherwise kind
is "p-unreachable" or "fair-cycle"; stem is the shortest path from an initial state
to the offending state, and cycle is the recurring loop (empty for a P-unreachable sink).
Source code in src/musil/liveness.py
check_liveness ¶
check_liveness(
model: Model[S],
*,
goal: Callable[[S], bool],
fair: Iterable[str] = (),
fair_strong: Iterable[str] = (),
everywhere: bool = False,
goal_name: str = "goal",
max_states: int = 1000000,
) -> LivenessResult[S]
Prove a liveness property and return a lasso counterexample if it fails.
Default (everywhere=False) checks <>P — eventually P on every run from an initial
state. everywhere=True checks []<>P — always eventually P, i.e. from EVERY reachable
state P is still eventually reached (the system always re-converges; no reachable stale cycle).
Use everywhere=True for convergence/recurrence properties whose model may start already
satisfying P.
Fairness prunes cycles a fair scheduler would break. fair = weak fairness (an action
continuously enabled along a cycle must eventually be taken). fair_strong = strong fairness
(an action enabled infinitely often — i.e. in any state of the cycle — must be taken); strong
is the stronger assumption and rules out more cycles, so it can force a "flickering" action that
weak fairness cannot.
Source code in src/musil/liveness.py
Composition — compose¶
The interleaved product of independent component models.
musil.compose ¶
Parallel composition of component models (I/O-automata style).
Specify each component independently as a Model (states + named actions), then compose them into
one Model whose state is the tuple of component states and whose actions are the interleaving of all
components' actions, each lifted to act on its own slice. This is the "specify per component, compose"
discipline from the methodology note (docs/specs/formal-modeling-distributed-systems.md): the joint
behaviour is every interleaving of the parts, which is exactly where concurrency bugs live.
Component actions are name-qualified ("<component>:<action>") and component invariants are lifted
(checked against that component's slice). Pass extra invariants to compose for joint properties
that span components. The composite is terminal only when every component is terminal.
Composite
dataclass
¶
A joint state: one component state per name, kept sorted by name so it hashes deterministically.
Read a component's slice with state["name"].
Source code in src/musil/compose.py
compose ¶
compose(
components: Mapping[str, Model[Any]],
*,
invariants: Mapping[str, Invariant[Composite]]
| None = None,
) -> Model[Composite]
Interleaving product of components (name -> Model). The result's reachable states are every
interleaving of the parts; component invariants are lifted ("<name>:<inv>") and invariants
adds joint properties over the whole composite.
Source code in src/musil/compose.py
Channels — channel_actions, send¶
Model a message channel (reliable / lossy / duplicating; unordered, so reordering is free).
musil.channels ¶
Generic message-channel modeling kit -- the "model the network explicitly" principle as a reusable building block (methodology doc section 3).
A channel's in-flight messages are held as a frozenset somewhere in your model's state. This kit
builds the channel's actions over that state: deliver a message (apply it to the receiver), and --
depending on the channel's reliability -- drop it (loss) or keep it after delivery (duplication).
Because the in-flight set is unordered and deliver may fire for ANY in-flight message, reordering
is modelled for free; lossy adds message loss; duplicating allows redelivery. (A set dedups a
re-send of the same message -- which matches an idempotent re-push. FIFO/ordered channels would need
a sequence instead; the realistic network is unordered, which is what this models.)
Domain-free: messages are any hashable values the caller supplies. Use send inside a sender's
action to inject a message, and channel_actions to add the channel's own steps to the model.
send ¶
send(
get: Callable[[S], frozenset[Message]],
put: Callable[[S, frozenset[Message]], S],
) -> Callable[[S, Message], S]
Return a helper send(state, msg) that injects msg into the channel's in-flight set
(idempotent: re-sending an in-flight message is a no-op, like an idempotent re-push).
Source code in src/musil/channels.py
channel_actions ¶
channel_actions(
*,
messages: Iterable[Message],
get: Callable[[S], frozenset[Message]],
put: Callable[[S, frozenset[Message]], S],
deliver: Callable[[S, Message], S],
lossy: bool = False,
duplicating: bool = False,
name: str = "ch",
) -> list[Action[S]]
The channel's steps over the caller's state. get/put read/write the in-flight set;
deliver(state, msg) applies a delivered message to the receiver. messages is the (bounded)
universe of possible messages. lossy adds drop steps; duplicating lets a message be
delivered more than once. Delivery order is unconstrained (reordering).
Source code in src/musil/channels.py
FSM bridge — transition_actions, status_field_actions¶
Build a model straight from an allowed-transitions table, so it can't drift from the code.
musil.fsm ¶
Build a model straight from a transition table — {from_state: {to_states}} — instead of
hand-writing one Action per edge. This is the zero-drift bridge: point it at the very table your
code enforces (e.g. a state machine's allowed-transitions map) and the model can never disagree
with the code, because it is the code's table. No separate spec file to keep in sync.
Generic over the state type: you supply get (read the status out of a state) and with_status
(return a new state with the status replaced), so the table can drive any state shape — typically a
frozen dataclass with a status field plus whatever other variables your invariants need.
transition_actions ¶
transition_actions(
table: Table,
*,
get: Callable[[S], str],
with_status: Callable[[S, str], S],
name: Callable[[str, str], str] = lambda src, dst: (
f"{src}->{dst}"
),
) -> list[Action[S]]
One Action per declared src -> dst edge. get reads the current status from a state;
with_status returns a new state with the status set to dst (states are immutable).
Source code in src/musil/fsm.py
status_field_actions ¶
Convenience for the common case: states are frozen dataclasses whose status lives in a single
field (default status). Builds actions that read/replace that field via dataclasses.replace.
Source code in src/musil/fsm.py
declared_states ¶
terminal_states ¶
States with no outgoing transitions — absorbing sinks (a key mapped to an empty set, or a
target that's never a source). Use to build a model's terminal predicate so a legitimate
resting/sink state isn't reported as a deadlock.
Source code in src/musil/fsm.py
Conformance — generate_traces, replay¶
Generate traces from the model and replay them against the real implementation.
musil.conformance ¶
Conformance: does the real implementation behave like the model?
Model checking proves the design is sound. Conformance testing checks the code matches the design: generate action traces from the model (sequences that cover the reachable graph), then replay each against the real system through a thin adapter, asserting the implementation lands in the state the model predicted after every step. It does not prove refinement (that's theorem-proving territory), but it checks it exhaustively over the model's traces against the running code — the practical "implementation faithful to spec" bridge (see docs/specs/RESEARCH.md: P, Stateright, ioco).
Two pieces
generate_traces(model)-> action sequences (lists of Steps,<init>first) that together cover every reachable edge (or state).replay(model, traces, step_fn, project, start)-> ConformanceResult.step_fn(real, action)drives the real system;project(real)maps it back to a model state to compare. The first divergence (real projected state != model-predicted state) is reported.
Divergence
dataclass
¶
The implementation departed from the model: after action (step step_index of trace
trace_index), the model predicted expected but the implementation projected to actual.
Source code in src/musil/conformance.py
generate_traces ¶
generate_traces(
model: Model[S],
*,
cover: str = "edges",
max_traces: int = 100000,
max_states: int = 1000000,
) -> list[tuple[Step[S], ...]]
Action sequences from the reachable graph (each a tuple of Steps, <init> first). With
cover='edges' (default) the set exercises every reachable transition at least once; with
cover='states', a shortest path to each reachable state. Each trace is a shortest stem, so
replays stay short.
Source code in src/musil/conformance.py
replay ¶
replay(
traces: Iterable[Sequence[Step[S]]],
*,
step_fn: Callable[[R, str], R],
project: Callable[[R], S],
start: R,
) -> ConformanceResult[S]
Replay each trace (from generate_traces) against the real system. start is the real
initial state (its projection must equal the model's initial state); step_fn(real, action)
applies an action to the real system; project maps a real state back to a model state for
comparison. Single-initial-state models; for multiple inits, project(start) must match each
trace's initial state.
Source code in src/musil/conformance.py
Refinement — check_refinement, RefinementMonitor¶
Check that an observed run of the real system refines the model (each transition is a model edge or a
stutter) -- the oracle simulate uses to hold real code to a model-checked spec.
musil.refine ¶
Refinement: hold real code to a model by checking its observed states, not its actions.
A run of the real system produces a sequence of concrete states (snapshots). An abstraction maps
each concrete state to a model state. The run refines the model iff every consecutive pair maps
to either the same abstract state (a stutter -- the real code took an internal step the model doesn't
care about) or to an abstract transition the model actually allows. The first observation must map to
a model initial state (unless require_init=False, for checking a mid-run fragment).
This is the runtime analog of refinement in the seL4/TLA+ sense (concrete behaviors are a subset of
abstract behaviors), done by checking against the explored model rather than proving -- the
practical bridge from a model-checked design to the running implementation. musil.sim feeds a
RefinementMonitor a snapshot after every step so the simulated real code is held to the spec.
Complement of conformance (which drives the real code with model-generated action labels); here
the real code drives itself and we watch where it goes.
RefinementViolation
dataclass
¶
The real run left the model. reason is "bad-init" (first state isn't a model initial
state), "unknown-state" (mapped to a state the model can't reach), or "no-model-step" (the
abstract transition isn't a model edge). index is the offending observation's position.
Source code in src/musil/refine.py
RefinementMonitor ¶
Streaming refinement check. Construct with a model and an abstraction; feed observed concrete
states one at a time to observe, which returns a RefinementViolation the first time the
run leaves the model (and None otherwise). The model is explored once, up front, so this is
for bounded models (the explicit-state regime musil lives in).
Source code in src/musil/refine.py
check_refinement ¶
check_refinement(
model: Model[S],
observations: Iterable[C],
abstraction: Callable[[C], S],
*,
require_init: bool = True,
max_states: int = 1000000,
) -> RefinementResult[C, S]
Check a whole observed run against the model. Returns at the first divergence (with the
offending observation), else OK once the run is exhausted. abstraction maps each observed
concrete state to a model state; require_init demands the first observation be a model initial
state (set False to check a mid-run fragment).
Source code in src/musil/refine.py
Simulation — simulate, Simulator, NetworkModel, FaultSchedule¶
Deterministic simulation testing: run real event-driven node code under a controlled clock and a
fault-injecting network, reproducible from a seed, with invariants + refinement + a convergence goal
as the oracle. Beyond NetworkModel's independent per-message loss, a FaultSchedule of sustained,
heal-able faults — network Partitions and node Crashes (with optional cold restart) — drives
VOPR-style liveness testing: after the faults heal, the system must converge to its goal (a
failure to do so is reported as kind="liveness").
musil.sim ¶
Deterministic simulation: run real, event-driven node code under a controlled world.
The expensive distributed bugs hide in timing -- message loss, reordering, a timeout racing a reply, a partition healing at the wrong moment. This runs your actual node logic against a virtual network and clock where all of that is controlled and seeded, so every run is reproducible from its seed (a found bug replays exactly). It's the FoundationDB/TigerBeetle "deterministic simulation testing" technique as a pure-Python library -- bug finding, not proof.
Model: nodes are event-driven actors (on_start / on_message / on_timer) that act through a
Context (send a message, set a timer, read the virtual clock, draw seeded randomness). The
Simulator owns a discrete-event queue ordered by (virtual time, sequence); the NetworkModel
decides per-message loss / duplication / latency, so reordering and delay emerge from latency spread.
A FaultSchedule layers on sustained faults -- network partitions and node crashes that hold for a
window then heal -- so simulate can check liveness: does the system converge once they lift?
All randomness comes from one seeded RNG. musil.refine + invariants are the oracle (next slice:
simulate).
Envelope
dataclass
¶
A message in flight: a unique sequence number, source, destination, and payload.
Source code in src/musil/sim.py
NetworkModel
dataclass
¶
How the virtual network treats each message. Defaults are a reliable, in-order channel; set
loss/duplicate (probabilities in [0, 1]) and a min_latency/max_latency spread
(virtual-time units) to inject drops, duplicates, and reordering. All draws use the sim's seed.
mutate is a Byzantine fault hook: mutate(src, dst, payload) -> object | None. The
returned value replaces the payload before delivery; returning None drops the message.
Applied after loss/duplicate decisions, so it targets messages that survive the network model.
Models wire-level corruption, wrong-answer injection, or selective drops (Castro & Liskov, OSDI
1999).
Source code in src/musil/sim.py
Partition
dataclass
¶
A sustained network fault: every directed link in links ((src, dst) pairs) drops ALL
traffic while active -- the window [start, end) in virtual time -- then heals. Unlike
NetworkModel.loss (independent per message, which a retrying sender papers over), a partition
cuts the link wholesale, so the system can only recover once it lifts. This is what makes a real
liveness test: does it converge after the partition heals?
Source code in src/musil/sim.py
Crash
dataclass
¶
A sustained process fault: node is down over [start, end) -- inbound messages are
dropped, its timers don't fire, it runs no handlers -- then restarts. lose_state=True is a cold
restart (a fresh node instance that re-runs on_start); the default is a warm resume that keeps
the instance's state (it simply missed everything while down).
Source code in src/musil/sim.py
FaultSchedule
dataclass
¶
A fixed set of sustained faults for one run. heal_time() is when the last fault lifts (the
world is clean again) -- the point after which a live system must converge to its goal. Build one
explicitly, or per-seed via a Callable[[Random], FaultSchedule] so randomized schedules stay
seed-reproducible.
Source code in src/musil/sim.py
RunResult
dataclass
¶
Outcome of a run. reason is "quiescent" (queue drained), "until" (goal predicate
met), or "max-steps" (step cap hit).
Source code in src/musil/sim.py
Context ¶
A node's handle on the world, passed to every handler. Lets the node send messages, arm/cancel
timers, read the virtual clock (now), and draw seeded randomness -- never use real time or the
global random module in a node, or runs stop being reproducible.
Source code in src/musil/sim.py
set_timer ¶
Arm (or re-arm) a timer; it fires delay virtual-time units from now via on_timer.
Re-arming a live timer of the same name replaces it.
Source code in src/musil/sim.py
Node ¶
Bases: Protocol
What the simulator needs from a node: three event handlers. Subclass BaseNode and override
only the ones you need.
Source code in src/musil/sim.py
BaseNode ¶
A node with no-op handlers; override what you need. Keep all mutable state on the instance.
Source code in src/musil/sim.py
AdversarialNode ¶
Bases: BaseNode
A simulation node that injects adversarial behavior via a prioritized behavior list.
Models an external component (a K8s node, an AWS service, a Byzantine peer) that sometimes
misbehaves. Each behavior is a (guard, response) pair:
guard(src, payload) -> bool— returns True when this behavior should fire.response(ctx, src, payload) -> None— the adversarial action: send wrong data, send nothing, send duplicate responses, violate the protocol.
On each on_message, behaviors are tried in order; the first matching guard fires and
no further behaviors are tried. If no guard matches, the message is silently dropped —
the default adversarial posture (an external service that ignores requests).
All randomness in behavior callbacks must go through ctx.random() / ctx.randint()
so runs stay reproducible across seeds. This is the simulation counterpart of
EnvironmentSpec: same contract theory, different execution model.
Source code in src/musil/sim.py
Simulator ¶
Runs a fixed set of named nodes under a virtual clock and network. Construct with the node
map, a seed, and an optional NetworkModel; call run. Determinism: identical
(nodes, seed, network) always produce the identical run.
Source code in src/musil/sim.py
244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 | |
nodes
property
¶
The live node map -- the authoritative instances, which a cold restart swaps. The oracle snapshots through this so it always reads post-restart state.
random ¶
randint ¶
inject ¶
Send a message (from a node's ctx.send, or externally to kick the system). Subject to
the network model: it may be dropped, duplicated, or delayed.
Source code in src/musil/sim.py
schedule_timer ¶
Arm a timer for node that fires delay units from now (re-arming replaces it).
Source code in src/musil/sim.py
run ¶
run(
*,
max_steps: int = 10000,
until: Callable[[Simulator], bool] | None = None,
on_step: Callable[[Simulator], None] | None = None,
after_start: Callable[[Simulator], None] | None = None,
) -> RunResult
Drive the system to quiescence (queue empty), until until(sim) is true, or until
max_steps events have fired. after_start(sim) runs once after every node's
on_start (the initial world); on_step(sim) runs after each fired event -- the hooks
the oracle uses to snapshot and check the world.
Source code in src/musil/sim.py
SimFailure
dataclass
¶
A reproducible failure from simulate. kind is "invariant", "refinement",
"goal" (settled-but-goal-false with no faults), or "liveness" (under a FaultSchedule,
the system did not converge to the goal after the faults healed). detail names the broken
invariant / refinement violation / goal. world is the offending snapshot and history the
snapshots leading to it (a trace). Re-run with seeds=[seed] to reproduce exactly.
Source code in src/musil/sim.py
simulate ¶
simulate(
factory: Callable[[], Mapping[str, Node]],
*,
seeds: Iterable[int],
snapshot: Callable[[Mapping[str, Node]], W],
network: NetworkModel | None = None,
faults: FaultSchedule
| Callable[[Random], FaultSchedule]
| None = None,
invariants: Mapping[str, Callable[[W], bool]]
| None = None,
model: Model[S] | None = None,
abstraction: Callable[[W], S] | None = None,
goal: Callable[[W], bool] | None = None,
goal_name: str = "goal",
converge_within: int | None = None,
max_steps: int = 10000,
require_init: bool = True,
) -> SimReport[W]
Deterministic simulation testing of the real nodes. Runs the system once per seed under
network (independent per-message faults) and an optional faults schedule (sustained,
heal-able partitions / crashes -- a FaultSchedule or a per-seed Random -> FaultSchedule).
After every step it checks each invariant and (if a model + abstraction are given) that the
run refines the model -- safety holds during the chaos too. If a goal is given, the settled
final world must satisfy it: with a non-trivial faults schedule this is the liveness check
(did the system converge once the faults healed? -> kind="liveness"); with no faults it is the
plain convergence check (kind="goal"). converge_within (steps after the schedule's
heal_time) catches livelock early. Returns at the FIRST failing seed -- a reproducible repro
(report.failure.seed). factory builds a fresh node map per run (also reused to cold-restart
a crashed node); snapshot maps the live nodes to an immutable world value used by the oracle.
Source code in src/musil/sim.py
Open systems — check_open, EnvironmentSpec, Assumption¶
Verify a system against adversarial external environments. Each EnvironmentSpec is a contract:
the non-deterministic behaviors the environment can exhibit (eviction, crash, wrong answer), the
guarantees it commits to, and the assumptions those guarantees rest on (following seL4's
two-layer assurance model: Klein et al., SOSP 2009). check_open composes by merging environment
behaviors into the system's action set (so the BFS explores every adversarial move) and environment
guarantees into the invariants. The result surface all Assumption objects that are not yet
"verified" as residual proof obligations.
musil.env ¶
Open-system verification: check a system against adversarial external environments.
Real distributed systems do not operate in isolation. A service runs on Linux, inside K8s, talking
to AWS -- components that can crash, evict, throttle, or return wrong answers at any moment. musil's
closed-system check assumes the entire state space is under the model's control. check_open
lifts that restriction: each external component is an EnvironmentSpec -- a contract consisting of
the non-deterministic behaviors it can exhibit, the guarantees it commits to, and the
assumptions those guarantees rest on (following seL4's two-layer assurance model: Klein et al.,
SOSP 2009).
check_open composes by merging: env behaviors become additional actions (the BFS explores
every thing the environment can do), env guarantees become additional invariants. This is
Interface Automata composition (de Alfaro & Henzinger, ESEC/FSE 2001) -- the environment acts
adversarially within its contract; the system must handle all of it.
Assumptions are first-class: they are printed in the result, never silently swallowed. A guarantee that rests on an unverified assumption is weaker than one that rests on a verified proof -- the caller knows exactly what the verification is worth.
Assumption
dataclass
¶
A named proof obligation taken on faith or verified elsewhere.
status tracks where the assumption stands:
- "axiom" — taken as a foundation; not expected to be proved (e.g., hardware correctness)
- "verified" — machine-checked or formally proved (e.g., seL4 kernel proof)
- "unverified" — stated but not yet validated (the default; should be addressed)
source cites where the assumption comes from (paper, spec, URL).
predicate is an optional machine-checkable version of the assumption; None for
assumptions that cannot be expressed as a state predicate (e.g., "network eventually delivers").
Source code in src/musil/env.py
EnvironmentSpec
dataclass
¶
A contract for one external component: what it can do, what it promises, and what it assumes.
behaviors — the non-deterministic actions the environment can take. Each is an Action[S]
that fires from the same global state the system uses. The BFS in check_open explores
every possible environment action from every state, so the system is checked against the worst
possible environment (within the contract). Model each distinct failure mode -- crash, evict,
wrong-answer, timeout -- as a separate behavior.
guarantees — invariants the environment commits to hold in every reachable state. If the
environment breaks one, check_open reports it as "<env.name>:<guarantee_name>".
assumptions — named Assumption objects the guarantees rest on. Unverified assumptions
are surfaced in OpenResult.unverified_assumptions; they are never silently dropped.
Source code in src/musil/env.py
OpenResult
dataclass
¶
The outcome of check_open. Wraps the underlying Result and surfaces every
Assumption that is not yet "verified" -- these are the residual proof obligations
the caller must discharge for the verification to be unconditional.
bool(open_result) delegates to result.ok so it works in assertions.
Source code in src/musil/env.py
check_open ¶
check_open(
system: Model[S],
*envs: EnvironmentSpec[S],
max_states: int = 1000000,
) -> OpenResult[S]
Check system against adversarial external environments.
Composes the system and every environment by merging:
- all env.behaviors into the system's actions (the BFS explores every env action)
- all env.guarantees into the system's invariants (qualified as "<env.name>:<k>")
Then calls the standard check on the composed model.
check_open(system) with no environments is identical to check(system).
Unverified assumptions from all environments are collected and returned in
OpenResult.unverified_assumptions -- they are the residual proof obligations.
Source code in src/musil/env.py
Graph export — to_dot, explore¶
The reachable state graph and its Graphviz DOT rendering.
musil.graph ¶
Render a model's reachable state graph as Graphviz DOT — the same artifact FizzBee emits, but
from the Python model. to_dot(model) explores the model and returns a DOT string you can pipe to
dot -Tsvg. Pass highlight (a sequence of states, e.g. the states of a counterexample trace)
to colour that path — handy for visualising why a check failed.
to_dot ¶
to_dot(
source: Model[S] | Graph[S],
*,
highlight: Iterable[S] = (),
name: str = "musil",
max_states: int = 1000000,
) -> str
Graphviz DOT for the reachable state graph. source is a Model (explored here) or a
pre-built Graph. Initial states are drawn bold; highlight states are filled and edges between
two highlighted states are red (e.g. highlight=[s.state for s in result.trace]).