Weave Blog

Implementation notes from the Weave codebase

Engineering write-ups about orchestration, evals, workflows, and the systems behind Weave.

pgermishuys · 7 min read

Agent Evals in Weave

What agent evaluations are, the main types that matter, and how Weave structures them to keep comparisons useful and honest.

It is easy to build an impressive agent demo. It is much harder to know whether the agent is actually improving. That is the role of an agent evaluation: taking a behavior you care about and turning it into something you can measure repeatedly.

Released under the MIT License.