Scripts & CLI

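Scripts used by ml-lab and ml-journal. Most are stdlib-only Python; the exceptions (generate_figures.py, which needs matplotlib, and determinism-check.py, which needs metaflow) declare their dependencies through PEP 723 inline metadata.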

journal_log.py

Location: .project-log/journal_log.py (installed by /log-init)

Entry writer. Validates fields against the type schema, constructs the envelope (id, timestamp, type, project, session_id), and appends a single JSON line to journal.jsonl.
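As a rough illustration of the validate, wrap, and append steps described above, the following sketch builds an envelope with the five listed keys and writes one JSON line. The `REQUIRED_FIELDS` table, the `append_entry` name, and the flat placement of payload fields next to the envelope are assumptions made for this example; the real schema and layout live in journal_log.py and may differ.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical per-type schema; the real one is defined in journal_log.py.
REQUIRED_FIELDS = {
    "issue": {"title", "description"},
    "checkpoint": {"summary"},
}


def append_entry(journal_path, entry_type, project, session_id, fields):
    """Validate fields, wrap them in an envelope, and append one JSON line."""
    if entry_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown entry type: {entry_type!r}")
    missing = REQUIRED_FIELDS[entry_type] - set(fields)
    if missing:
        raise ValueError(f"missing fields for {entry_type!r}: {sorted(missing)}")

    # Envelope keys come last so payload fields cannot overwrite them.
    entry = {
        **fields,
        "id": uuid.uuid4().hex,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "type": entry_type,
        "project": project,
        "session_id": session_id,
    }
    # Append mode plus a single write keeps each entry on its own line.
    with Path(journal_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

Validating before the file is opened means a rejected entry never leaves a partial line in journal.jsonl.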

Usage: Called by ml-journal skills internally. Can also be invoked via uv run log_entry.py for ml-lab's investigation log.

Warning

Always use uv run log_entry.py or /log-entry for journal writes. Manual JSONL appends break schema validation and sequence monotonicity.


journal_query.py

Location: .project-log/journal_query.py (installed by /log-init)

Read-only query interface for the journal.

Operations:

# Journal health overview
python3 .project-log/journal_query.py --status

# List entries by type (with optional time filter)
python3 .project-log/journal_query.py --list <type> [--since <duration>]

# Unresolved issues
python3 .project-log/journal_query.py --unresolved-issues

# Resolved issues with linked resolutions
python3 .project-log/journal_query.py --resolved-issues

# Recent entries (with optional time filter)
python3 .project-log/journal_query.py --recent <N> [--since <duration>]

# Lookup by ID prefix
python3 .project-log/journal_query.py --entry <id_prefix>

# Latest checkpoint
python3 .project-log/journal_query.py --latest-checkpoint

Time durations: 7d (days), 4h (hours), 30m (minutes).
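The duration syntax above maps directly onto `datetime.timedelta`. The sketch below shows one way such strings could be parsed; `parse_duration` is a hypothetical helper name, and the actual parsing inside journal_query.py may be stricter or more lenient.

```python
import re
from datetime import timedelta

# Suffix letters from the documented syntax mapped to timedelta keywords.
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}
_DURATION_RE = re.compile(r"(\d+)([dhm])")


def parse_duration(text):
    """Turn a string such as '7d', '4h', or '30m' into a timedelta."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r} (expected e.g. 7d, 4h, 30m)")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

A `--since` cutoff would then be the current time minus `parse_duration(value)`.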


log_entry.py

Location: Experiment directory root (e.g., experiments/self_debate_experiment_v8/)

Investigation log writer for ml-lab experiments. Appends sequenced entries to INVESTIGATION_LOG.jsonl with step numbers and timestamps.

Usage:

uv run log_entry.py --step <N> --type <type> --description "..."
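For intuition, a sequenced append with a monotonicity guard might look like the sketch below. The field names mirror the CLI flags shown above, but the record layout, the strictly-increasing rule, and the function names `last_step` and `append_step` are assumptions for illustration, not the behavior of log_entry.py itself.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def last_step(log_path):
    """Return the step number of the final entry, or 0 for an empty log."""
    path = Path(log_path)
    if not path.exists():
        return 0
    step = 0
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                step = json.loads(line)["step"]
    return step


def append_step(log_path, step, entry_type, description):
    """Append one entry, refusing step numbers that do not increase."""
    previous = last_step(log_path)
    if step <= previous:
        raise ValueError(f"step {step} must be greater than last step {previous}")
    entry = {
        "step": step,
        "type": entry_type,
        "description": description,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```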

generate_figures.py

Location: Repository root

Generates publication-ready figures from experiment results using matplotlib. Uses PEP 723 inline metadata for dependency resolution.

Usage:

uv run generate_figures.py
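PEP 723 metadata is a specially formatted comment block at the top of the script, which uv reads to build an environment before running it. The sketch below embeds an example header and extracts it with the reference regular expression given in PEP 723. The dependency list and Python version in `EXAMPLE_SCRIPT` are assumptions; the actual header of generate_figures.py may differ.

```python
import re

# Reference regular expression from PEP 723 for locating metadata blocks.
BLOCK_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Hypothetical script header; the real dependency list may differ.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "matplotlib",
# ]
# ///
import matplotlib
'''


def read_script_metadata(source):
    """Return the TOML text of the 'script' block, or None if absent."""
    for match in BLOCK_RE.finditer(source):
        if match.group("type") == "script":
            lines = match.group("content").splitlines(keepends=True)
            # Strip the leading "# " (or bare "#") comment prefix per line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:] for line in lines
            )
    return None
```

On Python 3.11 and later, the returned text can be passed to `tomllib.loads` to obtain the dependency list as a Python object.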

flow-lint.py

Location: plugins/ml-lab/skills/pipeline-init/scripts/flow-lint.py (shipped by /pipeline-init)

Deterministic static linter for promoted Metaflow flows. Uses PEP 723 inline metadata; stdlib ast only — no project dependencies, never imports the flow under test.

Detects five mechanical anti-patterns:

| Rule ID | What it detects |
|---|---|
| merge-artifacts-module | merge_artifacts() with no include=/exclude=; an unconstrained merge raises on nn.Module/tensor artifacts |
| cwd-relative-config | Config(default=...) not anchored to __file__; CWD-relative paths break under uv run script mode |
| per-config-foreach | foreach= attribute names a per-method/config list instead of the dataset/data-axis grain |
| bare-project-import | First-party import with no preceding module-scope sys.path.insert(...) shim |
| module-global-experiment-const | Module-level numeric constant read inside a @step/@card body; belongs in Hydra config |

Reports findings as file:line [rule-id] message. Exits nonzero on any finding.
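To make the approach concrete, here is a minimal sketch of how one of these rules could be checked with the stdlib `ast` module alone, without importing the flow. It covers only the merge-artifacts-module rule; the function name `find_unconstrained_merges` and the message wording are illustrative assumptions, and the real linter's logic may be more thorough.

```python
import ast

RULE_ID = "merge-artifacts-module"


def find_unconstrained_merges(source, filename="<flow>"):
    """Return 'file:line [rule-id] message' strings for bare merges."""
    # Parsing inspects the syntax tree only; the flow is never executed.
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute):
            name = func.attr
        elif isinstance(func, ast.Name):
            name = func.id
        else:
            continue
        if name != "merge_artifacts":
            continue
        keywords = {kw.arg for kw in node.keywords}
        if not keywords & {"include", "exclude"}:
            hits.append(node.lineno)
    # ast.walk is breadth-first, so sort to report in source order.
    return [
        f"{filename}:{line} [{RULE_ID}] merge_artifacts() without include=/exclude="
        for line in sorted(hits)
    ]
```

A command-line wrapper around this would exit with a nonzero status whenever the returned list is non-empty, matching the exit behavior described above.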

Usage:

uv run flow-lint.py <flow.py> [<flow.py> ...]

Note

The script path is resolved portably from the plugin registry (installed_plugins.json → installPath), the same pattern used by derive_verdict.py.


determinism-check.py

Location: plugins/ml-lab/skills/pipeline-init/scripts/determinism-check.py (shipped by /pipeline-init)

Prove layer for promoted Metaflow flows. Verifies that a flow holds its declared experiment.determinism contract by diffing two runs' aggregate outputs (lift_results, aggregate_results) for exact equality, order-insensitive over cells. Uses PEP 723 inline metadata; requires metaflow>=2.19.

Reads the contract from run.data.determinism:

| Contract | Verification behavior |
|---|---|
| order_independent | Two runs (any --max-workers) must produce identical aggregate outputs |
| single_worker | Two runs both at --max-workers 1 must produce identical outputs |
| nondeterministic | Verification skipped (escape hatch, exits 0) |

Warning

This is not a reproduce-the-PoC check. It validates the promoted flow against itself across two runs — confirming that the pipeline's own declared contract holds.
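The comparison itself, exact equality that ignores cell order, can be sketched as below. This assumes each aggregate output is a list of JSON-serializable dicts (one per cell) and uses plain dicts in place of Metaflow run objects; `canonical_cells` and `outputs_match` are hypothetical names, and the real script's data access and diff reporting are not shown.

```python
import json

AGGREGATE_KEYS = ("lift_results", "aggregate_results")


def canonical_cells(cells):
    """Serialize each cell deterministically, then sort the serializations."""
    # sort_keys makes key order irrelevant; sorting makes cell order irrelevant.
    # Floats are serialized via repr, so values must match exactly.
    return sorted(json.dumps(cell, sort_keys=True) for cell in cells)


def outputs_match(run_a, run_b, keys=AGGREGATE_KEYS):
    """Return True when both runs hold identical cells for every key."""
    return all(
        canonical_cells(run_a[key]) == canonical_cells(run_b[key]) for key in keys
    )
```

Because no tolerance is applied, a difference in the last bit of a float counts as a mismatch, which is the point of an exact-equality contract.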

Usage:

# Compare two specific runs
uv run determinism-check.py <run_a_pathspec> <run_b_pathspec>

# Compare the latest two runs of a named flow
uv run determinism-check.py --flow <FlowName>

# Override the contract (fallback when run artifact is absent)
uv run determinism-check.py <run_a> <run_b> --contract order_independent

Hook scripts

journal-precompact.sh

Location: plugins/ml-journal/journal-precompact.sh

Auto-writes a checkpoint before /compact runs. Configure as a PreCompact hook.

journal-session-start.sh

Location: plugins/ml-journal/journal-session-start.sh

Injects the latest checkpoint as session context at session start. Configure as a SessionStart hook.

sync-plugin-cache.sh

Location: .claude/hooks/sync-plugin-cache.sh

PostToolUse hook that auto-rsyncs edits in plugins/ml-lab/ or plugins/ml-journal/ to the versioned plugin cache. Fires on any file edit within the plugin directories.