Query Weights & Biases experiment history to retrieve past optimization runs, compare configurations, and inform the next optimization decision. Use this skill when asked to review past AI optimization experiments, find the best configuration tried so far, understand what has already been explored, or suggest the next experiment to run based on W&B history.
Install
Install via the SkillsCat registry:
npx skillscat add wilfred-dore/ecostral/wandb-experiment-memory
W&B Experiment Memory Skill
This skill enables a coding agent to act as an informed optimizer by reading
past experiment results from Weights & Biases before proposing the next action.
It closes the self-improvement loop: measure → log → remember → improve.
When to activate this skill
- User asks "what configurations have we already tried for model X?"
- User asks "what was the best result so far for this model?"
- User asks "what should we try next?"
- Before running a new Ecostral optimization (to avoid repeating past experiments)
- After an optimization run (to confirm W&B logging succeeded)
Prerequisites
pip install wandb
# W&B API key must be set in secrets.json or as the WANDB_API_KEY env var
Step 1: Query past runs for a model
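The client in this step needs an API key, which the prerequisites place in secrets.json or in the WANDB_API_KEY environment variable. A minimal sketch of resolving it, assuming secrets.json is a flat JSON object that stores the key under a `WANDB_API_KEY` field (a hypothetical field name; adjust it to the real file) and that the environment variable takes precedence:

```python
import json
import os
from pathlib import Path


def resolve_wandb_api_key(secrets_path="secrets.json", env=None):
    """Return the W&B API key from the environment or a secrets file.

    Assumption: the secrets file is a flat JSON object with a
    "WANDB_API_KEY" field (hypothetical name, not confirmed by the repo).
    """
    env = os.environ if env is None else env
    key = env.get("WANDB_API_KEY")
    if key:
        return key  # environment variable wins when both are set
    path = Path(secrets_path)
    if path.is_file():
        data = json.loads(path.read_text(encoding="utf-8"))
        key = data.get("WANDB_API_KEY")
        if key:
            return key
    raise RuntimeError(f"W&B API key not found in WANDB_API_KEY or {path}")
```

The returned value can then be passed as `api_key` when constructing `WandbMemory`, which keeps the key out of the source file.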
from ecostral.memory.wandb_mcp import WandbMemory

memory = WandbMemory(
    api_key="<wandb_api_key>",  # from secrets.json
    entity="wdore-personal",
)

# Fetch recent runs for a specific model
# Project name pattern: optimization-<model-slug>
# e.g. optimization-mistral-7b-instruct-v0-3
runs = memory.get_recent_runs(
    project="optimization-mistral-7b-instruct-v0-3",
    n=20,
)
for run in runs:
    print(run["precision"], run["accuracy"], run["co2_kg"], run["latency_ms"])
Step 2: Find the best configuration
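This step keeps the runs whose accuracy stays within 90% of the FP32 baseline and picks the one with the lowest CO₂. If accuracy is logged as an absolute score rather than as a ratio to the baseline, the cutoff has to be derived from the baseline run first. A minimal sketch under that assumption, using hypothetical sample records shaped like the run dicts in this skill:

```python
def best_config(runs, rel_threshold=0.9, baseline_precision="fp32"):
    """Lowest-CO2 run whose accuracy is >= rel_threshold * baseline accuracy."""
    baseline = [
        r["accuracy"]
        for r in runs
        if r.get("precision") == baseline_precision and r.get("accuracy") is not None
    ]
    if not baseline:
        return None  # no baseline logged, so a relative cutoff is undefined
    min_accuracy = rel_threshold * max(baseline)
    qualified = [
        r
        for r in runs
        if r.get("accuracy") is not None
        and r.get("co2_kg") is not None
        and r["accuracy"] >= min_accuracy
    ]
    return min(qualified, key=lambda r: r["co2_kg"], default=None)


# Hypothetical records for illustration only, not real measurements.
sample_runs = [
    {"precision": "fp32", "accuracy": 0.80, "co2_kg": 4.0e-6},
    {"precision": "fp16", "accuracy": 0.79, "co2_kg": 2.5e-6},
    {"precision": "int8", "accuracy": 0.74, "co2_kg": 1.5e-6},
    {"precision": "int4", "accuracy": 0.60, "co2_kg": 0.9e-6},
]
winner = best_config(sample_runs)
print(winner["precision"] if winner else "no qualifying run")
```

Runs with a missing accuracy or CO₂ value are skipped rather than treated as zero, so an incomplete run cannot win by accident.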
# Filter runs that meet accuracy threshold
threshold = 0.9  # 90% of the FP32 baseline; assumes accuracy is logged relative to that baseline
qualified = [r for r in runs if r.get("accuracy", 0) >= threshold]
# Sort by CO₂ (ascending = most frugal first)
best = sorted(qualified, key=lambda r: r.get("co2_kg", float("inf")))
if best:
    print(f"Best config: {best[0]['precision']} / {best[0].get('technique', '?')}")
    print(f" CO₂/inf : {best[0]['co2_kg']*1e6:.2f} mg")
    print(f" Accuracy: {best[0]['accuracy']:.4f}")
Step 3: Identify unexplored configurations
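Precision alone can be too coarse when the same precision has been tried with different techniques. A minimal sketch that tracks (precision, technique) pairs instead, assuming each run record carries the `precision` and `technique` fields used elsewhere in this skill; the technique labels below are hypothetical placeholders:

```python
from itertools import product

PRECISIONS = ["fp32", "bf16", "fp16", "int8", "int4", "fp8"]
# Hypothetical technique labels; replace with the values actually logged.
TECHNIQUES = ["baseline", "quantization", "pruning"]


def unexplored_pairs(runs, precisions=PRECISIONS, techniques=TECHNIQUES):
    """Return the (precision, technique) pairs with no run in the history."""
    tried = {(r.get("precision"), r.get("technique")) for r in runs}
    # product() keeps grid order, so the result is stable across calls.
    return [pair for pair in product(precisions, techniques) if pair not in tried]
```

Returning a list in grid order, rather than a set, keeps the output deterministic, which helps when the result is pasted into a prompt or compared between sessions.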
tried_precisions = {r["precision"] for r in runs}
all_precisions = {"fp32", "bf16", "fp16", "int8", "int4", "fp8"}
unexplored = all_precisions - tried_precisions
print(f"Not yet tried: {unexplored}")
Step 4: Summarize for a Mistral reasoning prompt
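Run histories often contain records with missing metrics, for example from crashed or interrupted runs. Formatting `None` with a numeric format spec raises a `TypeError`, and defaulting to 0 reports a missing metric as a measured zero. A minimal sketch of a per-run formatter that prints `n/a` instead, using the field names from this skill; the helper names are illustrative only:

```python
def format_metric(value, spec, unit="", scale=1.0):
    """Format a numeric metric, or return 'n/a' when it is missing."""
    if value is None:
        return "n/a"
    return f"{value * scale:{spec}}{unit}"


def summarize_run(r):
    """Build one summary line for a run record (a plain dict)."""
    return (
        f"- {r.get('precision', '?')}/{r.get('technique', '?')}: "
        f"accuracy={format_metric(r.get('accuracy'), '.3f')}, "
        f"co2={format_metric(r.get('co2_kg'), '.1f', 'mg', 1e6)}, "
        f"latency={format_metric(r.get('latency_ms'), '.0f', 'ms')}, "
        f"vram={format_metric(r.get('memory_usage_gb'), '.1f', 'GB')}"
    )
```

Joining `summarize_run(r)` over the most recent runs with newlines gives a context string in the same shape as the summary built in this step.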
# Format past runs as context for the next Mistral proposal
summary_lines = []
for r in runs[:10]:
    summary_lines.append(
        f"- {r['precision']}/{r.get('technique','?')}: "
        f"accuracy={r.get('accuracy',0):.3f}, "
        f"co2={r.get('co2_kg',0)*1e6:.1f}mg, "
        f"latency={r.get('latency_ms',0):.0f}ms, "
        f"vram={r.get('memory_usage_gb',0):.1f}GB"
    )
context = "\n".join(summary_lines)
print(context)
# → pass this string to Mistral Large as experiment history
W&B project naming convention
| Model | W&B project |
|---|---|
| mistralai/Mistral-7B-Instruct-v0.3 | optimization-mistral-7b-instruct-v0-3 |
| TinyLlama/TinyLlama-1.1B-Chat-v1.0 | optimization-tinyllama-1-1b-chat-v1-0 |
| microsoft/Phi-3.5-mini-instruct | optimization-phi-3-5-mini-instruct |
| Any model, Jetson Orin target | optimization-<slug>-jetson-orin |
Group naming: run-<date> (e.g. run-2026-03-01) — one group per optimization session.
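The naming rule can be inferred from the table rows: drop the organization prefix, lowercase the model name, and replace dots and other separators with hyphens. A minimal sketch of that inferred rule; `project_name` and `group_name` are illustrative helpers, not functions confirmed to exist in the Ecostral codebase:

```python
import re
from datetime import date


def project_name(model_id, target=None):
    """Derive the W&B project name from a Hugging Face model id.

    Rule inferred from the table above: drop the organization prefix,
    lowercase, and replace each run of non-alphanumeric characters with "-".
    """
    name = model_id.split("/")[-1].lower()
    slug = re.sub(r"[^a-z0-9]+", "-", name).strip("-")
    suffix = f"-{target}" if target else ""
    return f"optimization-{slug}{suffix}"


def group_name(day=None):
    """Group name for one optimization session, e.g. run-2026-03-01."""
    day = day or date.today()
    return f"run-{day.isoformat()}"


print(project_name("mistralai/Mistral-7B-Instruct-v0.3"))
print(project_name("microsoft/Phi-3.5-mini-instruct", target="jetson-orin"))
print(group_name(date(2026, 3, 1)))
```

Deriving the name in one place avoids typos that would silently split a model's history across two W&B projects.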
W&B MCP Server (programmatic access)
For MCP-compatible agents, the W&B MCP server exposes experiment history directly:
# The WandbMemory class in ecostral/memory/wandb_mcp.py wraps the W&B API
# and formats run history as structured context for LLM prompts.
# It is the same module used internally by the Ecostral optimization loop.
from ecostral.memory.wandb_mcp import WandbMemory
Combine with other skills
- ecostral-optimizer (this repo) → run the next optimization based on memory insights
- hugging-face-evaluation (HF official) → cross-reference W&B accuracy with HF model card evals
- hugging-face-trackio (HF official) → use if migrating experiment tracking from W&B to HF TrackIO