Model Selection Guide

Weave assigns each agent a default model, but you can override them via Agent Configuration. This guide gives practical advice on which models work well for each agent role when the defaults are not available to you.

Agent Tiers

Weave's agents fall into three tiers based on how much reasoning power they need:

Tier	Agents	What They Do	What They Need
Top	Loom, Pattern, Warp	User-facing orchestration, planning, security auditing	The strongest reasoning you can afford. These agents make decisions that cascade through the entire workflow — bad judgment here means bad results everywhere.
Mid	Tapestry, Weft, Shuttle	Plan coordination, reviewing, specialist work	Good coding ability and comprehension. These agents coordinate plan execution, implement delegated tasks, review changes, and verify results — they don't need to be brilliant, but they can't be sloppy.
Economy	Thread, Spindle	Codebase search, web research	Speed over depth. These agents read files and search — they don't make architectural decisions. Fast, cheap models work great here.

Anthropic

Opus-class models (Opus 4, Opus 4.5, etc.) — use for Loom, Pattern, Warp

Opus models are the best fit for top-tier agents. They have the deepest reasoning, handle complex multi-step delegation well, and catch subtle issues in security auditing. If you can only afford Opus for one agent, prioritize Loom — it is the primary orchestrator and user-facing router, so bad decisions there affect everything downstream.

Sonnet-class models (Sonnet 4, Sonnet 4.5, etc.) — use for Tapestry, Weft, Shuttle

Sonnet is the sweet spot for execution and review. Fast enough to not slow you down, smart enough to write good code and catch real issues in review. Sonnet also works as a fallback for top-tier agents if Opus is too expensive — you will notice some quality degradation in complex planning, but it is workable.

Haiku-class models (Haiku 4, Haiku 4.5, etc.) — use for Thread, Spindle

Haiku is purpose-built for the economy tier. Thread and Spindle do high-volume, low-complexity work — reading files, searching code, fetching docs. Haiku handles this perfectly and keeps costs low. Do not use Haiku for mid-tier or top-tier agents — it will noticeably struggle with complex code generation and nuanced reasoning.

OpenAI

GPT-4o / GPT-5 — use for Loom, Pattern, Warp

These are OpenAI's strongest general-purpose models. Good reasoning, good function calling, good at following complex system prompts. Either works well for top-tier agents.

GPT-4o-mini / GPT-4.1-mini — use for Tapestry, Weft, Shuttle, Thread, Spindle

The mini models are fast and capable enough for execution work. They handle code generation, review, and search well. For economy agents (Thread, Spindle), they are slightly overpowered but work fine — there is not a cheaper OpenAI option with reliable tool calling.

GPT-4.1 — a solid mid-tier option for Tapestry, Weft, Shuttle

GPT-4.1 has strong coding ability and a 1M token context window. It is a good fit for Tapestry when working on large codebases where you need the model to hold a lot of context. It is not a reasoning model, so keep it in the mid tier.

o3 / o4-mini Reasoning Models

The o-series models are powerful reasoners but behave differently — they use internal chain-of-thought, can be slower, and do not always stream well. They can work for Pattern and Warp (where deep thinking helps), but test before committing. They are not ideal for Loom, which needs to be responsive and make quick delegation decisions.

Per-Agent Guidance

Loom (Primary Orchestrator) — needs the best you have

Loom is your main interface to Weave. It decides what to delegate, to whom, and in what order. It reads your request, breaks it into tasks, picks the right specialist, and writes their instructions. If Loom misunderstands your intent or makes a poor delegation choice, everything downstream suffers. This is not the place to save money.

Best: Opus-class, GPT-5 | Acceptable: Sonnet-class, GPT-4o | Avoid: Haiku, mini models

Pattern (Planner) — deep reasoning, architectural thinking

Pattern analyzes your codebase and produces detailed implementation plans. It needs to understand file dependencies, anticipate edge cases, and order tasks correctly. Weak models produce plans that look reasonable but fall apart during execution.

Best: Opus-class, GPT-5 | Acceptable: Sonnet-class, GPT-4o, o3 | Avoid: Haiku, mini models

Warp (Security Auditor) — skeptical, deep analysis

Warp looks for security vulnerabilities and spec violations. It needs to understand OAuth flows, JWT validation, CORS policies, and subtle injection vectors. Security auditing is one of the hardest tasks for a model — cheap models miss real issues and flag false positives.

Best: Opus-class, o3, GPT-5 | Acceptable: Sonnet-class, GPT-4o | Avoid: Haiku, mini models

Tapestry (Plan Execution Coordinator) — reliable `/start-work` execution

Tapestry coordinates /start-work plan execution step by step. It reads the plan, delegates each ready task to Shuttle, verifies the results, and tracks progress. It does not need to be highly creative — it needs to be reliable.

Best: Sonnet-class, GPT-4o | Acceptable: GPT-4.1, GPT-4o-mini | Avoid: Haiku

Weft (Reviewer) — balanced critical analysis

Weft reviews code and plans. It needs to spot real problems without drowning you in nitpicks. A model that is too weak will miss issues; a model that is too strong is wasted here (though not harmful).

Best: Sonnet-class, GPT-4o | Acceptable: GPT-4.1, GPT-4o-mini | Avoid: Haiku

Thread and Spindle (Explorers) — fast and cheap

These agents search codebases and fetch docs. They do not make decisions — they find information and report back. Speed matters more than depth. This is where you save money.

Best: Haiku-class, GPT-4o-mini | Acceptable: Any mini/flash model | Overkill: Opus, GPT-5

Example Configurations

Anthropic-Only

jsonc

{
  "agents": {
    "loom":     { "model": "anthropic/claude-opus-4" },
    "pattern":  { "model": "anthropic/claude-opus-4" },
    "warp":     { "model": "anthropic/claude-opus-4" },
    "tapestry": { "model": "anthropic/claude-sonnet-4" },
    "weft":     { "model": "anthropic/claude-sonnet-4" },
    "shuttle":  { "model": "anthropic/claude-sonnet-4" },
    "thread":   { "model": "anthropic/claude-haiku-4" },
    "spindle":  { "model": "anthropic/claude-haiku-4" }
  }
}

OpenAI-Only

jsonc

{
  "agents": {
    "loom":     { "model": "openai/gpt-5" },
    "pattern":  { "model": "openai/gpt-5" },
    "warp":     { "model": "openai/gpt-5" },
    "tapestry": { "model": "openai/gpt-4o" },
    "weft":     { "model": "openai/gpt-4o" },
    "shuttle":  { "model": "openai/gpt-4o" },
    "thread":   { "model": "openai/gpt-4o-mini" },
    "spindle":  { "model": "openai/gpt-4o-mini" }
  }
}

Budget-Conscious (Sonnet Everywhere)

If Opus is too expensive, Sonnet-class models work across all tiers. You will notice weaker planning and security auditing, but it is a viable setup:

jsonc

{
  "agents": {
    "loom":     { "model": "anthropic/claude-sonnet-4" },
    "pattern":  { "model": "anthropic/claude-sonnet-4" },
    "warp":     { "model": "anthropic/claude-sonnet-4" },
    "tapestry": { "model": "anthropic/claude-sonnet-4" },
    "weft":     { "model": "anthropic/claude-sonnet-4" },
    "shuttle":  { "model": "anthropic/claude-sonnet-4" },
    "thread":   { "model": "anthropic/claude-haiku-4" },
    "spindle":  { "model": "anthropic/claude-haiku-4" }
  }
}

Start with Defaults

If you are using GitHub Copilot, the built-in defaults already map top/mid/economy tiers to the right models. Only override if the defaults are not available to you or you want to use a different provider.

Downgrade Impact

Agent	Risk of Using a Weaker Model
Loom	High — poor delegation decisions cascade through everything
Pattern	High — plans miss edge cases, bad task ordering
Warp	High — missed security vulnerabilities
Tapestry	Medium — weaker coordination, poorer verification, more execution drift
Weft	Medium — missed issues or false positives in review
Shuttle	Medium — depends on task complexity
Thread	Low — search and read; fast models work great
Spindle	Low — web research; fast models work great

Model Resolution

For full details on how Weave resolves models (config overrides, fallback chains, category models), see the Model Resolution Chain section in Agent Configuration.

Model Selection Guide ​

Agent Tiers ​

Anthropic ​

OpenAI ​

Per-Agent Guidance ​

Loom (Primary Orchestrator) — needs the best you have ​

Pattern (Planner) — deep reasoning, architectural thinking ​

Warp (Security Auditor) — skeptical, deep analysis ​

Tapestry (Plan Execution Coordinator) — reliable /start-work execution ​

Weft (Reviewer) — balanced critical analysis ​

Thread and Spindle (Explorers) — fast and cheap ​

Example Configurations ​

Anthropic-Only ​

OpenAI-Only ​

Budget-Conscious (Sonnet Everywhere) ​

Downgrade Impact ​

Model Selection Guide

Agent Tiers

Anthropic

OpenAI

Per-Agent Guidance

Loom (Primary Orchestrator) — needs the best you have

Pattern (Planner) — deep reasoning, architectural thinking

Warp (Security Auditor) — skeptical, deep analysis

Tapestry (Plan Execution Coordinator) — reliable `/start-work` execution

Weft (Reviewer) — balanced critical analysis

Thread and Spindle (Explorers) — fast and cheap

Example Configurations

Anthropic-Only

OpenAI-Only

Budget-Conscious (Sonnet Everywhere)

Downgrade Impact