Scripts & CLI
Scripts used by ml-lab and ml-journal. The Python scripts are stdlib-only unless their section says otherwise (generate_figures.py needs matplotlib; determinism-check.py needs metaflow>=2.19).
journal_log.py
Location: .project-log/journal_log.py (installed by /log-init)
Entry writer. Validates fields against the type schema, constructs the envelope (id, timestamp, type, project, session_id), and appends a single JSON line to journal.jsonl.
Usage: Called by ml-journal skills internally. Can also be invoked via uv run log_entry.py for ml-lab's investigation log.
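The validate, wrap, and append sequence described above can be sketched roughly as follows. This is an illustration only, not the script's actual code: the SCHEMAS table, its field names, and the build_entry and append_entry helpers are hypothetical. Only the envelope keys (id, timestamp, type, project, session_id) and the one-JSON-line-per-entry format come from the description.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical per-type schemas: required payload fields for each entry type.
SCHEMAS = {
    "issue": {"title", "severity"},
    "note": {"text"},
}


def build_entry(entry_type, fields, project, session_id):
    """Validate fields against the type schema, then wrap them in an envelope."""
    if entry_type not in SCHEMAS:
        raise ValueError(f"unknown entry type: {entry_type}")
    missing = SCHEMAS[entry_type] - set(fields)
    if missing:
        raise ValueError(f"missing fields for {entry_type}: {sorted(missing)}")
    envelope = {
        "id": uuid.uuid4().hex,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "type": entry_type,
        "project": project,
        "session_id": session_id,
    }
    # Envelope keys come first; payload fields follow.
    return {**envelope, **fields}


def append_entry(journal_path, entry):
    """Append exactly one JSON line to the journal file."""
    with Path(journal_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
```

Validating before building the envelope means a rejected entry never touches the file, which is one plausible reason to route writes through a single script rather than appending by hand.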
Warning
Always use uv run log_entry.py or /log-entry for journal writes. Manual JSONL appends break schema validation and sequence monotonicity.
journal_query.py
Location: .project-log/journal_query.py (installed by /log-init)
Read-only query interface for the journal.
Operations:
```bash
# Journal health overview
python3 .project-log/journal_query.py --status

# List entries by type (with optional time filter)
python3 .project-log/journal_query.py --list <type> [--since <duration>]

# Unresolved issues
python3 .project-log/journal_query.py --unresolved-issues

# Resolved issues with linked resolutions
python3 .project-log/journal_query.py --resolved-issues

# Recent entries (with optional time filter)
python3 .project-log/journal_query.py --recent <N> [--since <duration>]

# Lookup by ID prefix
python3 .project-log/journal_query.py --entry <id_prefix>

# Latest checkpoint
python3 .project-log/journal_query.py --latest-checkpoint
```
Time durations: 7d (days), 4h (hours), 30m (minutes).
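As a rough illustration of how a duration such as 7d, 4h, or 30m could be turned into a cutoff time for --since, here is a minimal sketch. The parse_duration and cutoff helpers are hypothetical names; how journal_query.py actually parses these values is not shown in this document.

```python
import re
from datetime import datetime, timedelta, timezone

# Map each documented suffix to the matching timedelta keyword.
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}
_DURATION_RE = re.compile(r"(\d+)([dhm])")


def parse_duration(text):
    """Convert a string like '7d', '4h', or '30m' into a timedelta."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def cutoff(text, now=None):
    """Return the earliest timestamp that still falls inside the window."""
    now = now or datetime.now(timezone.utc)
    return now - parse_duration(text)
```

Entries with a timestamp at or after cutoff("7d") would then count as falling within the last seven days.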
log_entry.py
Location: Experiment directory root (e.g., experiments/self_debate_experiment_v8/)
Investigation log writer for ml-lab experiments. Appends sequenced entries to INVESTIGATION_LOG.jsonl with step numbers and timestamps.
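One way to keep step numbers sequential is to derive the next step from the entries already on disk. The sketch below assumes a step field and a timestamp field on each line; those field names and the next_step and append_step helpers are hypothetical, and the real script may track sequence differently.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def next_step(log_path):
    """Return one more than the highest step already recorded (1 if none)."""
    path = Path(log_path)
    if not path.exists():
        return 1
    last = 0
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                last = max(last, json.loads(line)["step"])
    return last + 1


def append_step(log_path, payload):
    """Append one sequenced entry and return it."""
    entry = {
        **payload,
        # Set after the payload so callers cannot override the sequence.
        "step": next_step(log_path),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Because the step is computed from the file rather than supplied by the caller, a hand-edited line with a wrong or missing step is the kind of change that would break the sequence.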
Usage:
generate_figures.py
Location: Repository root
Generates publication-ready figures from experiment results using matplotlib. Uses PEP 723 inline metadata for dependency resolution.
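PEP 723 inline metadata is a specially formatted comment block at the top of a script, which tools such as uv read to resolve dependencies before running it. The sketch below extracts that block from a script's source using the regular expression from the PEP's reference implementation. The sample header is hypothetical: only the matplotlib dependency comes from the description above, and the requires-python value is an assumption.

```python
import re

# Regular expression taken from the PEP 723 reference implementation.
PEP723_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Hypothetical header, not copied from generate_figures.py.
SAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = ["matplotlib"]
# ///
import matplotlib.pyplot as plt
'''


def read_script_metadata(source):
    """Return the TOML text of the 'script' metadata block, or None."""
    for match in PEP723_RE.finditer(source):
        if match.group("type") == "script":
            lines = match.group("content").splitlines(keepends=True)
            # Strip the leading "# " (or bare "#") from each comment line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:] for line in lines
            )
    return None


if __name__ == "__main__":
    print(read_script_metadata(SAMPLE_SCRIPT))
```

The returned text is plain TOML, so on Python 3.11 or later it can be passed to tomllib.loads to obtain the dependency list.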
Usage:
flow-lint.py
Location: plugins/ml-lab/skills/pipeline-init/scripts/flow-lint.py (shipped by /pipeline-init)
Deterministic static linter for promoted Metaflow flows. Uses PEP 723 inline metadata and only the stdlib ast module: it has no project dependencies and never imports the flow under test.
Detects five mechanical anti-patterns:
| Rule ID | What it detects |
|---|---|
| merge-artifacts-module | merge_artifacts() with no include=/exclude=; an unconstrained merge raises on nn.Module/tensor artifacts |
| cwd-relative-config | Config(default=...) not anchored to __file__; CWD-relative paths break under uv run script mode |
| per-config-foreach | foreach= attribute names a per-method/config list instead of the dataset/data-axis grain |
| bare-project-import | First-party import with no preceding module-scope sys.path.insert(...) shim |
| module-global-experiment-const | Module-level numeric constant read inside a @step/@card body; belongs in Hydra config |
Reports findings as file:line [rule-id] message. Exits nonzero on any finding.
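To show how a rule of this kind can be checked from the syntax tree alone, here is a minimal sketch of the merge-artifacts-module check. It is not flow-lint.py's implementation: the find_unconstrained_merges helper, the sample source, and the message wording are hypothetical, and only the rule ID and the file:line [rule-id] message format come from the description above.

```python
import ast

RULE_ID = "merge-artifacts-module"

# Hypothetical flow fragment, parsed as text and never imported or executed.
SAMPLE = '''
class DemoFlow:
    def join(self, inputs):
        self.merge_artifacts(inputs)
        self.merge_artifacts(inputs, exclude=["model"])
'''


def find_unconstrained_merges(source, filename="flow.py"):
    """Flag merge_artifacts() calls that pass neither include= nor exclude=."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both self.merge_artifacts(...) and a bare merge_artifacts(...).
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "merge_artifacts":
            continue
        keywords = {kw.arg for kw in node.keywords}
        if not keywords & {"include", "exclude"}:
            hits.append(node.lineno)
    return [
        f"{filename}:{line} [{RULE_ID}] merge_artifacts() without include=/exclude="
        for line in sorted(hits)
    ]


if __name__ == "__main__":
    findings = find_unconstrained_merges(SAMPLE)
    print("\n".join(findings))
    raise SystemExit(1 if findings else 0)
```

Working on the parsed tree keeps the check deterministic and free of side effects, since no code from the flow under test ever runs.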
Usage:
Note
The script path is resolved portably from the plugin registry (installed_plugins.json → installPath), following the same pattern as derive_verdict.py.
determinism-check.py
Location: plugins/ml-lab/skills/pipeline-init/scripts/determinism-check.py (shipped by /pipeline-init)
Prove layer for promoted Metaflow flows. Verifies that a flow holds its declared experiment.determinism contract by diffing two runs' aggregate outputs (lift_results, aggregate_results) for exact equality, order-insensitive over cells. Uses PEP 723 inline metadata; requires metaflow>=2.19.
Reads the contract from run.data.determinism:
| Contract | Verification behavior |
|---|---|
| order_independent | Two runs (any --max-workers) must produce identical aggregate outputs |
| single_worker | Two runs both at --max-workers 1 must produce identical outputs |
| nondeterministic | Verification skipped (escape hatch, exits 0) |
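The comparison at the heart of these contracts, exact equality that ignores cell order, can be sketched as below. This assumes each aggregate output is a list of JSON-serializable cell dictionaries, which is an assumption about the artifact shape. The aggregates_match and check helpers and their return values are hypothetical, and the sketch does not load real Metaflow runs.

```python
import json
from collections import Counter


def canonical(cell):
    """Serialize a cell with sorted keys so equal cells give equal strings."""
    return json.dumps(cell, sort_keys=True)


def aggregates_match(cells_a, cells_b):
    """Exact equality over cells, ignoring the order they were emitted in."""
    return Counter(map(canonical, cells_a)) == Counter(map(canonical, cells_b))


def check(contract, cells_a, cells_b):
    """Return 'skipped', 'pass', or 'fail' for a declared contract."""
    if contract == "nondeterministic":
        return "skipped"  # escape hatch: nothing is compared
    if contract not in {"order_independent", "single_worker"}:
        raise ValueError(f"unknown contract: {contract}")
    return "pass" if aggregates_match(cells_a, cells_b) else "fail"
```

A Counter is used instead of a set so that duplicated cells still have to appear the same number of times in both runs. Floats are compared through their exact serialized form, so no tolerance is applied.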
Warning
This is not a reproduce-the-PoC check. It validates the promoted flow against itself across two runs — confirming that the pipeline's own declared contract holds.
Usage:
```bash
# Compare two specific runs
uv run determinism-check.py <run_a_pathspec> <run_b_pathspec>

# Compare the latest two runs of a named flow
uv run determinism-check.py --flow <FlowName>

# Override the contract (fallback when run artifact is absent)
uv run determinism-check.py <run_a> <run_b> --contract order_independent
```
Hook scripts
journal-precompact.sh
Location: plugins/ml-journal/journal-precompact.sh
Auto-writes a checkpoint before /compact runs. Configure as a PreCompact hook.
journal-session-start.sh
Location: plugins/ml-journal/journal-session-start.sh
Injects the latest checkpoint as session context at session start. Configure as a SessionStart hook.
sync-plugin-cache.sh
Location: .claude/hooks/sync-plugin-cache.sh
PostToolUse hook that auto-rsyncs edits in plugins/ml-lab/ or plugins/ml-journal/ to the versioned plugin cache. Fires on any file edit within the plugin directories.
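The decision the hook makes on each edit, whether the edited file lives under one of the two plugin directories, can be illustrated with a short Python sketch. The hook itself is a shell script. The plugin_root_for helper is hypothetical, and it assumes the edited path is given relative to the repository root.

```python
from pathlib import PurePosixPath

# The two plugin directories named above, relative to the repository root.
PLUGIN_DIRS = (
    PurePosixPath("plugins/ml-lab"),
    PurePosixPath("plugins/ml-journal"),
)


def plugin_root_for(edited_path):
    """Return the plugin directory containing edited_path, or None."""
    path = PurePosixPath(edited_path)
    for root in PLUGIN_DIRS:
        # Compare whole path components, so plugins/ml-labs does not match.
        if root in path.parents:
            return root
    return None
```

A non-None result would be the directory to sync; any other edit would leave the cache untouched.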