08 · Operations

MLOps hub

Four standalone tools sit alongside the bot. Each one runs as its own systemd service bound to the local loopback interface; click out to the UI, or query the API directly from a terminal. Native widgets in this dashboard are planned for v2.1. Until then, the link-outs are the fastest path to the data.
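As an illustration of reading an API from a terminal, the sketch below queries MLflow's REST endpoint for experiments. It assumes the tracking server listens on 127.0.0.1:5000 (the port shown on the Experiments card); the function names are hypothetical and not part of the bot's codebase.

```python
import json
import urllib.request

# Assumption: the MLflow tracking server is reachable on the loopback at port 5000.
MLFLOW_BASE = "http://127.0.0.1:5000"


def build_experiments_search(base_url: str = MLFLOW_BASE,
                             max_results: int = 20) -> urllib.request.Request:
    """Build a POST request for MLflow's experiments/search REST endpoint."""
    body = json.dumps({"max_results": max_results}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/2.0/mlflow/experiments/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_experiments(base_url: str = MLFLOW_BASE,
                      max_results: int = 20,
                      timeout: float = 5.0) -> list[dict]:
    """Send the request and return the list of experiments (empty if none)."""
    req = build_experiments_search(base_url, max_results)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload.get("experiments", [])


if __name__ == "__main__":
    # Prints one line per experiment; raises URLError if the service is down.
    for exp in fetch_experiments():
        print(exp["experiment_id"], exp["name"])
```

Splitting request construction from the network call keeps the URL and payload easy to inspect without a running server.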

Recent MLflow runs

· 20 latest

open mlflow ↗


Experiments

:5000 ↗

MLflow tracking

Every model fit (EMOS, AR1, city correlation, backtest) is logged with hyperparameters, metrics, and artifacts. Runs are cross-tagged with prefect_flow_run_id and DVC parquet hashes.
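A minimal sketch of how such cross-tagging could look. Only the prefect_flow_run_id tag name comes from the card above; the `dvc_md5.<filename>` tag scheme, the helper names, and the idea that the DVC hash equals the plain MD5 of the parquet file are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_run_tags(flow_run_id: str, parquet_paths: list[Path]) -> dict[str, str]:
    """Collect the Prefect run id plus one hash tag per parquet file."""
    tags = {"prefect_flow_run_id": flow_run_id}
    for path in parquet_paths:
        # Hypothetical tag naming; the real project may use a different key.
        tags[f"dvc_md5.{path.name}"] = file_md5(path)
    return tags


def log_fit(model_name: str, params: dict, metrics: dict[str, float],
            tags: dict[str, str]) -> None:
    """Log one model fit as an MLflow run with params, metrics, and tags."""
    import mlflow  # third-party; imported here so the helpers above stay stdlib-only

    with mlflow.start_run(run_name=model_name):
        mlflow.set_tags(tags)
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
```

With tags like these, a run can be traced back both to the flow run that produced it and to the exact data files it read.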

Tuning

:8080 ↗

Optuna studies

Hyperparameter search via TPE sampler. Each study persists to data/mlflow/optuna.db; trials surface as nested MLflow runs. Resume a study by passing the same --study-name.
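The resume behaviour described above can be sketched with Optuna's `load_if_exists=True`, which reopens a study of the same name from the SQLite storage. The `--n-trials` flag, the "minimize" direction, the toy objective, and the function names are assumptions; only `--study-name`, the TPE sampler, and the database path come from the card.

```python
import argparse


def storage_url(db_path: str = "data/mlflow/optuna.db") -> str:
    """SQLite URL for the Optuna storage (the parent directory must exist)."""
    return f"sqlite:///{db_path}"


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Run or resume an Optuna study.")
    parser.add_argument("--study-name", required=True)
    parser.add_argument("--n-trials", type=int, default=20)  # hypothetical flag
    return parser.parse_args(argv)


def run_study(study_name: str, objective, n_trials: int = 20):
    """Create or resume a study; each trial is logged as a nested MLflow run."""
    import mlflow  # third-party
    import optuna  # third-party

    study = optuna.create_study(
        study_name=study_name,
        storage=storage_url(),
        sampler=optuna.samplers.TPESampler(),
        direction="minimize",   # assumption: lower objective is better
        load_if_exists=True,    # same name + same storage resumes the study
    )

    def tracked(trial):
        with mlflow.start_run(run_name=f"trial-{trial.number}", nested=True):
            value = objective(trial)
            mlflow.log_params(trial.params)  # populated by the suggest_* calls
            mlflow.log_metric("objective", value)
            return value

    # Parent run groups all trials of this invocation.
    with mlflow.start_run(run_name=study_name):
        study.optimize(tracked, n_trials=n_trials)
    return study


if __name__ == "__main__":
    args = parse_args()
    # Toy objective for demonstration only.
    run_study(args.study_name,
              lambda t: (t.suggest_float("x", -5.0, 5.0) - 1.0) ** 2,
              args.n_trials)
```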

Flows

:4200 ↗

Prefect orchestration

Nightly flow runs (EMOS / DEB / AR1 / city-correlation / backtest). Retry history, durations, failures. APScheduler triggers the runs; Prefect tracks them.
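One way this split of responsibilities could be wired, shown as a rough sketch: an APScheduler 3.x cron job calls a Prefect flow, and Prefect records the run, retries, and duration. The single looping flow, the 02:00 schedule, the retry settings, and every name below are hypothetical; the real project may define one flow per model.

```python
# Names taken from the card above; everything else is illustrative.
NIGHTLY_FITS = ("EMOS", "DEB", "AR1", "city-correlation", "backtest")


def build_scheduler(hour: int = 2, minute: int = 0):
    """Return a scheduler whose cron job invokes a Prefect flow."""
    # Third-party imports (APScheduler 3.x API, Prefect 2+).
    from apscheduler.schedulers.blocking import BlockingScheduler
    from prefect import flow, task

    @task(retries=2, retry_delay_seconds=30)
    def fit(name: str) -> str:
        # Placeholder for the real model fit.
        return f"fitted {name}"

    @flow(name="nightly-fits")
    def nightly_fits() -> None:
        for name in NIGHTLY_FITS:
            fit(name)

    scheduler = BlockingScheduler()
    # APScheduler decides *when*; calling the flow lets Prefect record *what happened*.
    scheduler.add_job(nightly_fits, trigger="cron",
                      hour=hour, minute=minute, id="nightly-fits")
    return scheduler


if __name__ == "__main__":
    build_scheduler().start()  # blocks until interrupted
```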

Metrics

:8001 →

Scanner observability

Per-archetype scan rates, fire %, p95 durations, settle backlog, errors, build info. Parses the in-process Prometheus exporter and renders human-readable cards + tables. Raw scrape still at wb-metrics.0xfitz.dev for Grafana.
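To make the parsing step concrete, here is a small stdlib-only sketch that reads Prometheus text exposition format and derives a per-archetype fire percentage. The metric names `wb_scans_total` and `wb_fires_total` and the `archetype` label are assumptions; the real exporter's names are not given here, and this parser skips lines it does not understand rather than handling the full format.

```python
import re
from collections import defaultdict

# name{labels} value [timestamp]
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition text into (metric name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fire_pct_by_archetype(samples) -> dict[str, float]:
    """Fire percentage per archetype: 100 * fires / scans."""
    scans: dict[str, float] = defaultdict(float)
    fires: dict[str, float] = defaultdict(float)
    for name, labels, value in samples:
        archetype = labels.get("archetype")
        if archetype is None:
            continue
        if name == "wb_scans_total":      # hypothetical metric name
            scans[archetype] += value
        elif name == "wb_fires_total":    # hypothetical metric name
            fires[archetype] += value
    # Archetypes with zero scans are omitted to avoid dividing by zero.
    return {a: 100.0 * fires[a] / n for a, n in scans.items() if n > 0}
```

Usage: pass the body of a scrape to `parse_exposition`, then feed the result to `fire_pct_by_archetype` to get the numbers a card would display.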

Grafana

:3001 ↗

Time-series drill-down

Brush-zoom on scan rates, fire rate, p95 duration, service health. Auto-loaded with the weatherbot-overview dashboard (9 panels). Bring up via `docker compose -f docker-compose.monitoring.yaml up -d`, then log in as admin with the password set in GRAFANA_ADMIN_PASSWORD.

Runtime

identity · live

archetypes: 0
versions: —
history: 0 rows

Active versions

registry · live

Every archetype's currently-active STRATEGY_VERSION. Ended rows are visible in the per-strategy detail pages. A bump appears here within seconds of a service restart that picks up the new module-level constant.
