Agent Evals Dashboard

Published eval results from the Weave agent test harness. Scores, trends, and model comparisons are fetched directly from sanitized index manifests, so no raw JSON inspection is required.