Model Selection Guide

Weave assigns each agent a default model, but you can override these defaults via Agent Configuration. This guide gives practical advice on which models work well for each agent role when the defaults are not available to you.

Agent Tiers

Weave's agents fall into three tiers based on how much reasoning power they need:

| Tier | Agents | What They Do | What They Need |
| --- | --- | --- | --- |
| Top | Loom, Pattern, Warp | Orchestration, planning, security auditing | The strongest reasoning you can afford. These agents make decisions that cascade through the entire workflow — bad judgment here means bad results everywhere. |
| Mid | Tapestry, Weft, Shuttle | Code execution, reviewing, domain tasks | Good coding ability and comprehension. They follow structured plans and analyze code — they don't need to be brilliant, but they can't be sloppy. |
| Economy | Thread, Spindle | Codebase search, web research | Speed over depth. These agents read files and search — they don't make architectural decisions. Fast, cheap models work great here. |

Anthropic

Opus-class models (Opus 4, Opus 4.5, etc.) — use for Loom, Pattern, Warp

Opus models are the best fit for top-tier agents. They have the deepest reasoning, handle complex multi-step delegation well, and catch subtle issues in security auditing. If you can only afford Opus for one agent, prioritize Loom — it is the orchestrator and bad decisions there affect everything downstream.
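If Opus fits your budget for only one agent, that advice translates into an override like the following. This is a sketch: the `agents`/`model` shape follows Weave's Agent Configuration format, and the exact model identifiers depend on what your provider exposes.

```jsonc
{
  "agents": {
    // Spend the Opus budget where bad decisions cascade furthest: the orchestrator.
    "loom":    { "model": "anthropic/claude-opus-4" },
    // Sonnet is a workable fallback for the remaining top-tier agents.
    "pattern": { "model": "anthropic/claude-sonnet-4" },
    "warp":    { "model": "anthropic/claude-sonnet-4" }
  }
}
```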

Sonnet-class models (Sonnet 4, Sonnet 4.5, etc.) — use for Tapestry, Weft, Shuttle

Sonnet is the sweet spot for execution and review. Fast enough to not slow you down, smart enough to write good code and catch real issues in review. Sonnet also works as a fallback for top-tier agents if Opus is too expensive — you will notice some quality degradation in complex planning, but it is workable.

Haiku-class models (Haiku 4, Haiku 4.5, etc.) — use for Thread, Spindle

Haiku is purpose-built for the economy tier. Thread and Spindle do high-volume, low-complexity work — reading files, searching code, fetching docs. Haiku handles this perfectly and keeps costs low. Do not use Haiku for mid-tier or top-tier agents — it will noticeably struggle with complex code generation and nuanced reasoning.

OpenAI

GPT-4o / GPT-5 — use for Loom, Pattern, Warp

These are OpenAI's strongest general-purpose models. Good reasoning, good function calling, good at following complex system prompts. Either works well for top-tier agents.

GPT-4o-mini / GPT-4.1-mini — use for Tapestry, Weft, Shuttle, Thread, Spindle

The mini models are fast and capable enough for execution work. They handle code generation, review, and search well. For economy agents (Thread, Spindle), they are slightly overpowered but work fine — there is not a cheaper OpenAI option with reliable tool calling.

GPT-4.1 — a solid mid-tier option for Tapestry, Weft, Shuttle

GPT-4.1 has strong coding ability and a 1M token context window. It is a good fit for Tapestry when working on large codebases where you need the model to hold a lot of context. It is not a reasoning model, so keep it in the mid tier.
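For a large-codebase setup, that suggests pairing GPT-4.1's long context on the executor with a stronger reasoning model on planning. A sketch, assuming the `agents`/`model` config shape from Agent Configuration and provider-dependent model IDs:

```jsonc
{
  "agents": {
    // GPT-4.1's 1M-token context window helps Tapestry hold large codebases.
    "tapestry": { "model": "openai/gpt-4.1" },
    // Keep a reasoning-strong model on planning, where depth matters more than context.
    "pattern":  { "model": "openai/gpt-5" }
  }
}
```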

o3 / o4-mini Reasoning Models

The o-series models are powerful reasoners but behave differently — they use internal chain-of-thought, can be slower, and do not always stream well. They can work for Pattern and Warp (where deep thinking helps), but test before committing. They are not ideal for Loom, which needs to be responsive and make quick delegation decisions.
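If you do test the o-series, a split like the following keeps the orchestrator responsive while giving the deep thinkers their reasoning time. A sketch only — model identifiers are assumptions and may differ in your provider catalog:

```jsonc
{
  "agents": {
    // Loom needs quick delegation decisions: a responsive general-purpose model.
    "loom":    { "model": "openai/gpt-4o" },
    // Pattern and Warp benefit from deep internal chain-of-thought; latency matters less.
    "pattern": { "model": "openai/o3" },
    "warp":    { "model": "openai/o3" }
  }
}
```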

Per-Agent Guidance

Loom (Orchestrator) — needs the best you have

Loom decides what to delegate, to whom, and in what order. It reads your request, breaks it into tasks, picks the right specialist, and writes their instructions. If Loom misunderstands your intent or makes a poor delegation choice, everything downstream suffers. This is not the place to save money.

Best: Opus-class, GPT-5 | Acceptable: Sonnet-class, GPT-4o | Avoid: Haiku, mini models

Pattern (Planner) — deep reasoning, architectural thinking

Pattern analyzes your codebase and produces detailed implementation plans. It needs to understand file dependencies, anticipate edge cases, and order tasks correctly. Weak models produce plans that look reasonable but fall apart during execution.

Best: Opus-class, GPT-5 | Acceptable: Sonnet-class, GPT-4o, o3 | Avoid: Haiku, mini models

Warp (Security Auditor) — skeptical, deep analysis

Warp looks for security vulnerabilities and spec violations. It needs to understand OAuth flows, JWT validation, CORS policies, and subtle injection vectors. Security auditing is one of the hardest tasks for a model — cheap models miss real issues and flag false positives.

Best: Opus-class, o3, GPT-5 | Acceptable: Sonnet-class, GPT-4o | Avoid: Haiku, mini models

Tapestry (Executor) — reliable code generation

Tapestry writes actual code, following plans step by step. It needs to generate correct, idiomatic code and verify its own work. It does not need to be creative — it needs to be reliable.

Best: Sonnet-class, GPT-4o | Acceptable: GPT-4.1, GPT-4o-mini | Avoid: Haiku

Weft (Reviewer) — balanced critical analysis

Weft reviews code and plans. It needs to spot real problems without drowning you in nitpicks. A model that is too weak will miss issues; a model that is too strong is wasted here (though not harmful).

Best: Sonnet-class, GPT-4o | Acceptable: GPT-4.1, GPT-4o-mini | Avoid: Haiku

Thread and Spindle (Explorers) — fast and cheap

These agents search codebases and fetch docs. They do not make decisions — they find information and report back. Speed matters more than depth. This is where you save money.

Best: Haiku-class, GPT-4o-mini | Acceptable: Any mini/flash model | Overkill: Opus, GPT-5
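Nothing requires all agents to share one provider; the per-agent recommendations above can be mixed. A sketch of a cross-provider setup, assuming the `agents`/`model` config shape from Agent Configuration:

```jsonc
{
  "agents": {
    // Top tier: strongest reasoning available.
    "loom":     { "model": "anthropic/claude-opus-4" },
    "pattern":  { "model": "anthropic/claude-opus-4" },
    "warp":     { "model": "anthropic/claude-opus-4" },
    // Mid tier: reliable execution and review.
    "tapestry": { "model": "anthropic/claude-sonnet-4" },
    "weft":     { "model": "anthropic/claude-sonnet-4" },
    "shuttle":  { "model": "anthropic/claude-sonnet-4" },
    // Economy tier: any fast, cheap model with reliable tool calling.
    "thread":   { "model": "openai/gpt-4o-mini" },
    "spindle":  { "model": "openai/gpt-4o-mini" }
  }
}
```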

Example Configurations

Anthropic-Only

```jsonc
{
  "agents": {
    "loom":     { "model": "anthropic/claude-opus-4" },
    "pattern":  { "model": "anthropic/claude-opus-4" },
    "warp":     { "model": "anthropic/claude-opus-4" },
    "tapestry": { "model": "anthropic/claude-sonnet-4" },
    "weft":     { "model": "anthropic/claude-sonnet-4" },
    "shuttle":  { "model": "anthropic/claude-sonnet-4" },
    "thread":   { "model": "anthropic/claude-haiku-4" },
    "spindle":  { "model": "anthropic/claude-haiku-4" }
  }
}
```

OpenAI-Only

```jsonc
{
  "agents": {
    "loom":     { "model": "openai/gpt-5" },
    "pattern":  { "model": "openai/gpt-5" },
    "warp":     { "model": "openai/gpt-5" },
    "tapestry": { "model": "openai/gpt-4o" },
    "weft":     { "model": "openai/gpt-4o" },
    "shuttle":  { "model": "openai/gpt-4o" },
    "thread":   { "model": "openai/gpt-4o-mini" },
    "spindle":  { "model": "openai/gpt-4o-mini" }
  }
}
```

Budget-Conscious (Sonnet Everywhere)

If Opus is too expensive, Sonnet-class models work across all tiers. You will notice weaker planning and security auditing, but it is a viable setup:

```jsonc
{
  "agents": {
    "loom":     { "model": "anthropic/claude-sonnet-4" },
    "pattern":  { "model": "anthropic/claude-sonnet-4" },
    "warp":     { "model": "anthropic/claude-sonnet-4" },
    "tapestry": { "model": "anthropic/claude-sonnet-4" },
    "weft":     { "model": "anthropic/claude-sonnet-4" },
    "shuttle":  { "model": "anthropic/claude-sonnet-4" },
    "thread":   { "model": "anthropic/claude-haiku-4" },
    "spindle":  { "model": "anthropic/claude-haiku-4" }
  }
}
```

Start with Defaults

If you are using GitHub Copilot, the built-in defaults already map top/mid/economy tiers to the right models. Only override if the defaults are not available to you or you want to use a different provider.
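Overrides can also be partial. For example, upgrading only the security auditor while leaving every other agent on its default — a sketch, assuming (as with the other examples in this guide) that agents omitted from the config keep their default models:

```jsonc
{
  "agents": {
    // Only Warp is overridden; all other agents keep their built-in defaults.
    "warp": { "model": "anthropic/claude-opus-4" }
  }
}
```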

Downgrade Impact

| Agent | Risk of Using a Weaker Model |
| --- | --- |
| Loom | High — poor delegation decisions cascade through everything |
| Pattern | High — plans miss edge cases, bad task ordering |
| Warp | High — missed security vulnerabilities |
| Tapestry | Medium — more code errors, less idiomatic output |
| Weft | Medium — missed issues or false positives in review |
| Shuttle | Medium — depends on task complexity |
| Thread | Low — search and read; fast models work great |
| Spindle | Low — web research; fast models work great |

Model Resolution

For full details on how Weave resolves models (config overrides, fallback chains, category models), see the Model Resolution Chain section in Agent Configuration.

Released under the MIT License.