Tooling
Four verbs, each a distinct job, each writing a self-describing run-dir:
manifest.toml (git SHA + seeds + versions) · CSVs · figures/*.png in the house
palette · a README.md callout with the headline result.
| Verb | Job | Run-dir |
|---|---|---|
| benchmark | a roster of nodes across a task grid: rank and compare, with baseline-relative statistics | bench/runs/<UTCstamp>_<shortgit>_<id>/ |
| profile | one node, in depth: the full analytic suite + a behaviour GIF per task | profile/runs/<node>/<UTCstamp>_<shortgit>_<id>/ |
| sweep | perturb parameter axes, measure signatures per cell | sweeps/<id>/ |
| ablate | a config-free preset sweep: baseline vs each registered ablation | sweeps/ablate_<node>_<task>/ |
benchmark and profile timestamp each run (<UTCstamp>_<shortgit>_<id>, so repeats
never collide). sweep and ablate instead key the run-dir on the sweep id — a
re-run with the same id resumes/overwrites in place (completed cells are skipped), which
is why sweeps/ is not timestamped.
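
The two naming schemes can be sketched in a few lines of shell. This is a minimal illustration, not the project's code: the helper names bench_run_dir and sweep_run_dir, the compact UTC timestamp format, and the run id "core" are all assumptions. Only the directory shapes come from the table above.

```shell
# Hypothetical helpers illustrating the two run-dir naming schemes.
# Only the directory shapes come from the text; names and stamp format are assumptions.

# Timestamped scheme (benchmark): every invocation gets a fresh directory.
bench_run_dir() {
  # $1 = UTC stamp, $2 = short git SHA, $3 = run id
  printf 'bench/runs/%s_%s_%s/\n' "$1" "$2" "$3"
}

# Id-keyed scheme (sweep): the same id always maps to the same directory.
sweep_run_dir() {
  # $1 = sweep id
  printf 'sweeps/%s/\n' "$1"
}

stamp=$(date -u +%Y%m%dT%H%M%SZ)                                  # assumed stamp format
shortgit=$(git rev-parse --short HEAD 2>/dev/null || echo nogit)  # falls back outside a repo

bench_run_dir "$stamp" "$shortgit" core          # differs on every call
sweep_run_dir falandays_wall_perturb             # always sweeps/falandays_wall_perturb/
```

Because the sweep path depends only on the id, a second invocation lands in the same directory, which is what makes skipping completed cells possible.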
The commands
Every command below is the real entrypoint. benchmark and profile each live in their own
directory with a local project; sweep and ablate share one runner, sweep/run.jl (ablate is a
subcommand, not a separate directory).

    # benchmark: from bench/, roster across a task grid
    cd bench && julia --project=. run.jl --neurons falandays_base,compartmental_structured --tasks wall,pong

    # profile: from profile/, one node in depth
    cd profile && julia --project=. run.jl falandays_base

    # sweep: from the repo root, a TOML config
    julia --project=. sweep/run.jl configs/sweep_falandays_wall.toml

    # ablate: same runner, a subcommand: NODE TASK, no config needed
    julia --project=. sweep/run.jl ablate falandays_base wall

Discover a sweep’s tunable axes before writing a config:

    julia --project=. sweep/run.jl --list-axes --node falandays_base --task wall

Authoring a benchmark run
bench reads bench/configs/core.toml (or --config <path>; flags like --neurons /
--tasks / --no-gifs override it). The config sets the roster (neurons = [] means
all registered variants), the task grid, n_trials, the baseline every table is scored
against, and a [prep] block that encodes the fairness rule:

    neurons = []            # empty = all registered variants
    tasks = ["wall", "tracking", "pong", "cartpole", "cartpole_swingup"]
    n_trials = 20
    baseline = "falandays_base"

    [prep]
    # falandays* default to "untrained" (seeded wiring + online plasticity);
    # compartmental* default to "trained" (untrained non-plastic weights aren't a
    # meaningful benchmark). A cell needing a trained genome that has none falls
    # back to untrained and is flagged "trained-required-but-untrained".

Source: bench/README.md; bench/configs/core.toml; bench/run.jl.
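
The fairness rule described in the [prep] comments can be read as a small decision function. The sketch below is an illustration only: resolve_prep and its yes/no genome argument are hypothetical, and rejecting unknown node families is an assumption rather than documented behaviour.

```shell
# Hypothetical sketch of the [prep] fairness rule; not the project's implementation.
# $1 = node name, $2 = "yes" if a trained genome exists for this cell, otherwise "no"
resolve_prep() {
  case "$1" in
    falandays*)     want="untrained" ;;   # seeded wiring + online plasticity
    compartmental*) want="trained" ;;     # untrained non-plastic weights are not meaningful
    *) echo "unknown node family: $1" >&2; return 1 ;;  # assumption: reject anything else
  esac

  if [ "$want" = "trained" ] && [ "$2" != "yes" ]; then
    # No trained genome available: fall back and flag the cell.
    echo "untrained trained-required-but-untrained"
  else
    echo "$want"
  fi
}

resolve_prep falandays_base no             # prints: untrained
resolve_prep compartmental_structured yes  # prints: trained
resolve_prep compartmental_structured no   # prints: untrained trained-required-but-untrained
```

The flag on the fallback case lets a reader of the results tell a genuinely untrained cell from one that should have been trained but was not.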
Sweep — “what makes it work, how it breaks”
A sweep perturbs the parameters that shape a run and records the analytic signatures per cell, so you can see performance and criticality against the knob.

    [sweep]
    id = "falandays_wall_perturb"   # names the run-dir: sweeps/falandays_wall_perturb/
    mode = "one_at_a_time"          # default: each axis varied alone (Σ of axis lengths)
                                    # "factorial" = full cartesian product
    seeds = [0, 1, 2, 3]
    max_cells = 200                 # cost guard; the preview shows cells × seeds rollouts

    [baseline]                      # the canonical setup every axis perturbs around
    node = "falandays_base"
    task = "wall"
    N = 100
    ticks = 2000

    [axes]                          # namespaced parameter path -> values
    "node.threshold_mult" = [1.5, 1.75, 2.0, 2.25, 2.5]
    "node.lrate_targ" = [0.0, 0.005, 0.01, 0.02]
    "env.lam" = [0.5, 1.0, 2.0]
    "ablation" = ["none", "freeze_plasticity", "zero_recurrent"]

    [analytics]
    measures = ["sigma_mr", "spectral_radius", "liveness"]

Axis namespaces: node.*, env.*, drive.*, task.*, ablation, seed. Each is routed
into the real simulate kwargs and validated up front (a wrong or inapplicable axis is a
clear “did you mean…” error, not a silent no-op). A failing cell records an error and the
sweep continues.
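
To see what the mode setting costs, the cell counts for the example [axes] table can be worked out by hand. The sketch below assumes the counting rule stated in the config comments (sum of axis lengths for one_at_a_time, product for factorial) and that each cell runs once per seed. It does not call the sweep runner.

```shell
# Axis lengths from the example config: threshold_mult=5, lrate_targ=4, env.lam=3, ablation=3
n_seeds=4            # seeds = [0, 1, 2, 3]

one_at_a_time=0      # each axis varied alone: sum of axis lengths
factorial=1          # full cartesian product: product of axis lengths
for n in 5 4 3 3; do
  one_at_a_time=$((one_at_a_time + n))
  factorial=$((factorial * n))
done

echo "one_at_a_time: $one_at_a_time cells, $((one_at_a_time * n_seeds)) rollouts"
echo "factorial: $factorial cells, $((factorial * n_seeds)) rollouts"
# prints:
# one_at_a_time: 15 cells, 60 rollouts
# factorial: 180 cells, 720 rollouts
```

Both counts sit under max_cells = 200 here, assuming the guard applies to cells rather than rollouts. Adding one more three-value axis would push the factorial grid to 540 cells while one_at_a_time would grow only to 18.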
Outputs: results.csv (one row per cell: axis × value × score + each measure) ·
per-axis breakdown figures (score, σ, liveness vs the knob — with and
shown together) · a README.md callout (best value / breakdown point / regime flip) ·
cells/cell_NNN/ holding each cell’s metrics.csv and manifest. Behaviour GIFs are
opt-in: add a [capture] block (group = …) to record a representative GIF per cell,
otherwise the numeric metrics are the whole output.
Source: src/run/Sweep.jl (run_sweep, ablate, sweepable_axes); CLI sweep/run.jl; examples in configs/sweep_*.toml.