Switchboard: Optimizing AI Agent Model Routing Costs - Openclaw Skills
Author: Internet
2026-03-30
What is Switchboard?
Switchboard is a routing layer designed to make AI agents more cost-efficient. Since roughly 80% of agent tasks are routine operations such as file reads or formatting, the skill prevents burning top-tier models on simple work. It uses OpenRouter to provide access to multiple model tiers, from zero-cost options to high-reasoning frontier models.
Integrating this logic into Openclaw Skills lets developers keep performance high through a strategic model-tier hierarchy while cutting monthly API spend by up to 10x. Expensive reasoning capacity is reserved for complex debugging and architecture decisions, while trivial tasks go to faster, cheaper alternatives.
Download: https://github.com/openclaw/skills/tree/main/skills/gigabit-eth/router
Installation & Download
1. ClawHub CLI
The fastest way to install the skill directly from source.
npx clawhub@latest install router
2. Manual Installation
Copy the skill folder to one of the following locations:
- Global: ~/.openclaw/skills/
- Workspace: /skills/
Priority: workspace > local > built-in
3. Prompt Installation
Copy this prompt into OpenClaw to install automatically:
Please install router using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).
Switchboard Use Cases
- Decide which model to use for a given sub-task based on cost and reasoning requirements.
- Spawn sub-agents for high-throughput background tasks such as URL fetching or status checks.
- Automatically escalate to a premium reasoning model when a standard model fails a task.
- Scale agent operations without letting API costs grow exponentially.
- The skill analyzes task complexity using classification signals such as debug, summarize, or read.
- It applies hard constraints such as tool-support requirements and estimated token length to filter the available models.
- The decision algorithm picks the cheapest model in the appropriate tier (free, cheap, mid, or premium) that meets the task criteria.
- Tasks execute in the main session or a spawned sub-agent, calling the selected model through the OpenRouter integration inside Openclaw Skills.
Switchboard Configuration Guide
To use Switchboard, you must configure an OpenRouter API key. Add it to your config file:
# Edit ~/.openclaw/openclaw.json
{
"openrouter_api_key": "sk-or-v1-..."
}
You can also set the default session model in your config.yml file:
# config.yml
model: anthropic/claude-sonnet-4
Switchboard Data Schema & Classification
Switchboard organizes models into four tiers by pricing and capability for use in Openclaw Skills:
| Tier | Category | Price Range | Primary Use Case |
|---|---|---|---|
| Tier 0 | Free | $0.00 | Non-critical background tasks |
| Tier 1 | Cheap | $0.02 - $0.50/M | Routine agent loops and file operations |
| Tier 2 | Mid | $1.00 - $5.00/M | General interactive work and code generation |
| Tier 3 | Premium | $5.00+/M | Complex reasoning and deep debugging |
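The tier boundaries in the table above can be encoded as a simple lookup; a minimal sketch in Python (the price ranges come from the table, while the helper name is illustrative, not part of the skill):

```python
# Input-price boundaries ($/M tokens) for each tier, from the table above.
TIERS = [
    ("free",    0.00, 0.00),           # Tier 0: non-critical background tasks
    ("cheap",   0.02, 0.50),           # Tier 1: routine loops, file ops
    ("mid",     1.00, 5.00),           # Tier 2: general work, code generation
    ("premium", 5.00, float("inf")),   # Tier 3: complex reasoning
]

def tier_for_price(input_price_per_m: float) -> str:
    """Return the tier whose input-price range contains the given $/M price."""
    for name, lo, hi in TIERS:
        if lo <= input_price_per_m <= hi:
            return name
    return "unclassified"  # falls in a gap between tiers (e.g. $0.75/M)
```

For example, `tier_for_price(0.07)` classifies Qwen3 Coder Next as "cheap", while `tier_for_price(15.0)` classifies o1 as "premium".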
name: switchboard
description: >
Cost-optimize AI agent operations by routing tasks to appropriate models based on complexity.
Use this skill when: (1) deciding which model to use for a task, (2) spawning sub-agents,
(3) considering cost efficiency, (4) the current model feels like overkill for the task.
Triggers: "model routing", "cost optimization", "which model", "too expensive", "spawn agent",
"cheap model", "expensive", "tier 1", "tier 2", "tier 3".
Switchboard
Route tasks to the cheapest model that can handle them. Most agent work is routine.
Prerequisites
This skill requires an OpenRouter API key for model routing. Add it to your OpenClaw user config:
// ~/.openclaw/openclaw.json
{
"openrouter_api_key": "sk-or-v1-..."
}
Without this key, /model switching and sessions_spawn with non-default models will fail. Get a key at openrouter.ai/keys.
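A quick sanity check of the key before any routing happens can save a failed session; a minimal sketch (the helper name and the `sk-or-` prefix check are illustrative assumptions, not part of the skill):

```python
import json
from pathlib import Path

def load_openrouter_key(config_path: str = "~/.openclaw/openclaw.json") -> str:
    """Read the OpenRouter key from the OpenClaw user config and sanity-check it."""
    path = Path(config_path).expanduser()
    if not path.exists():
        raise FileNotFoundError(f"Config not found: {path}")
    config = json.loads(path.read_text())
    key = config.get("openrouter_api_key", "")
    if not key.startswith("sk-or-"):
        raise ValueError("openrouter_api_key missing or malformed "
                         "(OpenRouter keys start with 'sk-or-')")
    return key
```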
Privacy Note: Some models listed in this skill (e.g., Aurora Alpha, Free Router) may log prompts and completions for provider training. Do not route sensitive data (API keys, passwords, private PII) through free or unmoderated models. Review model privacy policies at openrouter.ai/docs before use.
Core Principle
80% of agent tasks are janitorial. File reads, status checks, formatting, simple Q&A. These don't need expensive models. Reserve premium models for problems that actually require deep reasoning.
Model Tiers
For OpenRouter-specific pricing and models, see references/openrouter-models.md.
Tier 0: Free
| Model | Context | Tools | Best For |
|---|---|---|---|
| Aurora Alpha | 128K | ? | Zero-cost reasoning, cloaked community model |
| Free Router | 200K | ? | Auto-routes to best available free model |
| Step 3.5 Flash (free) | 256K | ? | Long-context reasoning at zero cost |
Free models have rate limits and variable availability. Good for non-critical background tasks.
Tier 1: Cheap ($0.02-0.50/M tokens)
| Model | Input | Output | Context | Tools | Best For |
|---|---|---|---|---|---|
| Qwen3 Coder Next | $0.07 | $0.30 | 262K | ? | Agentic coding, MoE 80B/3B active |
| Gemini 2.0 Flash Lite | $0.07 | $0.30 | 1M | ? | High volume, massive context |
| Gemini 2.0 Flash | $0.10 | $0.40 | 1M | ? | General routine with long context |
| GPT-4o-mini | $0.15 | $0.60 | 128K | ? | Quick responses, reliable tool use |
| DeepSeek Chat | $0.30 | $1.20 | 164K | ? | General routine work |
| Claude 3 Haiku | $0.25 | $1.25 | 200K | ? | Fast tool use, structured output |
| Kimi K2.5 | $0.45 | $2.20 | 262K | ? | Multimodal, visual coding, agentic |
Tier 2: Mid ($1-5/M tokens)
| Model | Input | Output | Context | Tools | Best For |
|---|---|---|---|---|---|
| o3-mini | $1.10 | $4.40 | 200K | ? | Reasoning on a budget |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M | ? | Long context, large codebase work |
| GPT-4o | $2.50 | $10.00 | 128K | ? | Multimodal tasks |
| Claude Sonnet | $3.00 | $15.00 | 1M | ? | Balanced performance, agentic |
Tier 3: Premium ($5+/M tokens)
| Model | Input | Output | Context | Tools | Best For |
|---|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 1M | ? | Complex reasoning, deep context |
| o1 | $15.00 | $60.00 | 200K | ? | Multi-step reasoning |
| GPT-4.5 | $75.00 | $150.00 | 128K | ? | Frontier tasks |
Prices as of Feb 2026. Check provider docs for current rates. Context = max context window. Tools = function calling support.
Task Classification
Before executing any task, classify it:
ROUTINE → Use Tier 1
Characteristics:
- Single-step operations
- Clear, unambiguous instructions
- No judgment required
- Deterministic output expected
Examples:
- File read/write operations
- Status checks and health monitoring
- Simple lookups (time, weather, definitions)
- Formatting and restructuring text
- List operations (filter, sort, transform)
- API calls with known parameters
- Heartbeat and cron tasks
- URL fetching and basic parsing
MODERATE → Use Tier 2
Characteristics:
- Multi-step but well-defined
- Some synthesis required
- Standard patterns apply
- Quality matters but isn't critical
Examples:
- Code generation (standard patterns)
- Summarization and synthesis
- Draft writing (emails, docs, messages)
- Data analysis and transformation
- Multi-file operations
- Tool orchestration
- Code review (non-security)
- Search and research tasks
COMPLEX → Use Tier 3
Characteristics:
- Novel problem solving required
- Multiple valid approaches
- Nuanced judgment calls
- High stakes or irreversible
- Previous attempts failed
Examples:
- Multi-step debugging
- Architecture and design decisions
- Security-sensitive code review
- Tasks where cheaper model already failed
- Ambiguous requirements needing interpretation
- Long-context reasoning (>50K tokens)
- Creative work requiring originality
- Adversarial or edge-case handling
Decision Algorithm
function selectModel(task):
# Rule 1: Escalation override
if task.previousAttemptFailed:
return nextTierUp(task.previousModel)
# Rule 2: Hard constraints (filter before cost)
candidates = ALL_MODELS
if task.requiresToolUse:
candidates = candidates.filter(m => m.supportsTools)
if task.estimatedTokens > 128_000:
candidates = candidates.filter(m => m.contextWindow >= task.estimatedTokens)
if task.requiresMultimodal:
candidates = candidates.filter(m => m.supportsImages)
# Rule 3: Latency constraint
if task.isRealTime or task.inAgentLoop:
candidates = candidates.filter(m => m.latencyTier <= "fast")
# Rule 4: Complexity classification
if task.hasSignal("debug", "architect", "design", "security"):
return cheapestIn(candidates, TIER_3)
if task.hasSignal("summarize", "analyze", "refactor"):
return cheapestIn(candidates, TIER_2)
complexity = classifyTask(task)
if complexity == ROUTINE:
return cheapestIn(candidates, TIER_1)
elif complexity == MODERATE:
return cheapestIn(candidates, TIER_2)
else:
return cheapestIn(candidates, TIER_3)
Note: "write", "read", and "code" alone are poor routing signals: "write a file" is Tier 1 work, not Tier 2. Classify based on the task structure, not individual keywords.
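The decision algorithm above can be rendered as runnable Python; a sketch with an illustrative four-model catalog (the real list and pricing live in references/openrouter-models.md; the signal sets mirror Rule 4, and the latency rule is omitted for brevity):

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    tier: int            # 0=free, 1=cheap, 2=mid, 3=premium
    input_price: float   # $/M tokens
    context: int         # max context window, tokens
    tools: bool          # function-calling support

# Illustrative catalog, one model per tier.
MODELS = [
    Model("openrouter/free",           0, 0.00,   200_000, False),
    Model("deepseek/deepseek-chat",    1, 0.30,   164_000, True),
    Model("anthropic/claude-sonnet-4", 2, 3.00, 1_000_000, True),
    Model("anthropic/claude-opus-4",   3, 5.00, 1_000_000, True),
]

TIER_3_SIGNALS = {"debug", "architect", "design", "security"}
TIER_2_SIGNALS = {"summarize", "analyze", "refactor"}

def select_model(task_text: str, *, needs_tools: bool = False,
                 est_tokens: int = 0, failed_on_tier: int = -1) -> Model:
    # Rule 2: hard constraints, filtered before cost.
    candidates = [m for m in MODELS
                  if (not needs_tools or m.tools)
                  and m.context >= est_tokens]
    words = set(task_text.lower().split())
    if failed_on_tier >= 0:
        tier = min(failed_on_tier + 1, 3)   # Rule 1: escalation override
    elif words & TIER_3_SIGNALS:
        tier = 3                            # Rule 4: complexity signals
    elif words & TIER_2_SIGNALS:
        tier = 2
    else:
        tier = 1                            # routine by default
    # Pick the cheapest candidate at the chosen tier, escalating if none fits.
    for t in range(tier, 4):
        at_tier = [m for m in candidates if m.tier == t]
        if at_tier:
            return min(at_tier, key=lambda m: m.input_price)
    raise LookupError("no model satisfies the constraints")
```

For example, "read the file" routes to the Tier 1 model, "debug this crash" jumps to Tier 3, and a retry after a Tier 1 failure escalates to Tier 2.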
Latency Considerations
Cost isn't the only axis. For real-time agent loops, latency matters:
| Tier | Typical TTFT | Throughput | Use When |
|---|---|---|---|
| Free | 1-5s | Variable | Background tasks, not time-sensitive |
| Tier 1 | 200-800ms | 50-100 tok/s | Agent loops, real-time pipelines |
| Tier 2 | 500ms-2s | 30-80 tok/s | Interactive sessions, async work |
| Tier 3 | 1-10s | 10-40 tok/s | One-shot complex tasks, async only |
TTFT = Time To First Token. Reasoning models (o1, o3-mini) have high TTFT due to thinking time but are worth it for hard problems.
Rule of thumb: If the agent is waiting in a loop for a response before the next action, use Tier 1. If the task is fire-and-forget, cost matters more than speed.
Behavioral Rules
For Main Session
- Default to Tier 2 for interactive work
- Suggest downgrade when doing routine work: "This is routine - I can handle this on a cheaper model or spawn a sub-agent."
- Request upgrade when stuck: "This needs more reasoning power. Switching to [premium model]."
For Sub-Agents
- Default to Tier 1 unless task is clearly moderate+
- Batch similar tasks to amortize overhead
- Report failures back to parent for escalation
- Check context window limits before dispatching — don't send 200K tokens to a 32K model
For Automated Tasks
- Heartbeats/monitoring → Always Tier 1 (or Free if available)
- Scheduled reports → Tier 1 or 2 based on complexity
- Alert responses → Start Tier 2, escalate if needed
- Background data fetching → Free tier when non-critical
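The context-window rule for sub-agents ("don't send 200K tokens to a 32K model") can be enforced with a cheap pre-dispatch check; a sketch assuming the common ~4 characters/token heuristic (the window sizes are copied from the tier tables above and the function names are illustrative):

```python
# Illustrative context windows, taken from the tier tables above.
CONTEXT_WINDOWS = {
    "deepseek/deepseek-chat":    164_000,
    "qwen/qwen3-coder-next":     262_000,
    "anthropic/claude-sonnet-4": 1_000_000,
}

def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English text."""
    return len(text) // 4

def fits_context(model: str, payload: str, reply_budget: int = 4_000) -> bool:
    """Check a sub-agent payload fits the target model before dispatching."""
    window = CONTEXT_WINDOWS.get(model, 32_000)  # assume small if unknown
    return estimate_tokens(payload) + reply_budget <= window
```

Running the check before `sessions_spawn` turns a silent truncation into an explicit escalation to a longer-context model.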
Communication Patterns
When suggesting model changes, use clear language:
Downgrade suggestion:
"This looks like routine file work. Want me to spawn a sub-agent on DeepSeek for this? Same result, fraction of the cost."
Upgrade request:
"I'm hitting the limits of what I can figure out here. This needs Opus-level reasoning. Switching up."
Explaining hierarchy:
"I'm running the heavy analysis on Sonnet while sub-agents fetch the data on DeepSeek. Keeps costs down without sacrificing quality where it matters."
Cost Impact
Assuming 100K tokens/day average usage:
| Strategy | Monthly Cost | Notes |
|---|---|---|
| Pure Opus 4.6 | ~$75 | Maximum capability, lower than old Opus |
| Pure Sonnet | ~$45 | Good default for most work |
| Pure DeepSeek | ~$9 | Cheap but limited on hard problems |
| Pure Qwen3 Coder | ~$2 | Cheapest viable for coding agents |
| Hierarchy (80/15/5) | ~$12 | Best of all worlds |
| With Free tier (85/10/4/1) | ~$8 | Aggressive optimization |
The 80/15/5 split:
- 80% routine tasks on Tier 1 (~$4)
- 15% moderate tasks on Tier 2 (~$5)
- 5% complex tasks on Tier 3 (~$3)
Result: 6-10x cost reduction vs pure premium, with equivalent quality on complex tasks.
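The 80/15/5 arithmetic above can be reproduced directly; a sketch assuming 100K tokens/day and rough blended $/M rates per tier (the rates are illustrative midpoints of each tier's input/output pricing, not quoted prices):

```python
MONTHLY_TOKENS_M = 100_000 * 30 / 1e6  # 3M tokens per month

# Illustrative blended (input+output mix) $/M rates for each tier.
BLENDED_RATE = {1: 1.65, 2: 11.0, 3: 20.0}
SPLIT = {1: 0.80, 2: 0.15, 3: 0.05}

monthly_cost = sum(SPLIT[t] * MONTHLY_TOKENS_M * BLENDED_RATE[t] for t in SPLIT)
print(f"${monthly_cost:.2f}/month")  # lands near the ~$12 hierarchy row above
```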
OpenClaw Integration
Session Model Switching
# config.yml - set your default session model
model: anthropic/claude-sonnet-4
# Mid-session, switch down for routine work
/model deepseek/deepseek-chat
# Switch up when you hit a wall
/model anthropic/claude-opus-4
Spawning Sub-Agents
# Batch routine tasks on cheap models
sessions_spawn:
  task: "Fetch and parse these 50 URLs"
  model: deepseek/deepseek-chat
# Use Qwen3 Coder for file-heavy agent work
sessions_spawn:
  task: "Refactor these test files to use the new helper"
  model: qwen/qwen3-coder-next
# Free tier for non-critical background jobs
sessions_spawn:
  task: "Check health of all endpoints and log status"
  model: openrouter/free
Recommended OpenClaw Defaults
| Task Type | Model | Why |
|---|---|---|
| Main interactive session | claude-sonnet-4 | Best balance of quality and cost |
| File ops, fetches, formatting | deepseek/deepseek-chat | Cheap, reliable |
| Agentic coding sub-tasks | qwen/qwen3-coder-next | $0.07/M, 262K context, tool use |
| Background monitoring | openrouter/free | Zero cost |
| Stuck / complex debugging | anthropic/claude-opus-4 | Escalate only when needed |
Anti-Patterns
DON'T:
- Leave your session on Opus when the task is clearly routine; `/model deepseek` exists for a reason
- Spawn sub-agents without specifying a model; they inherit the session model, which is usually Tier 2
- Use Tier 3 for `sessions_spawn` tasks like file parsing, URL fetching, or status checks
- Forget context window limits: spawning a 200K-token task on a 32K model will silently truncate
- Run recurring or scheduled tasks on anything above Tier 1
DO:
- Set `model: anthropic/claude-sonnet-4` as your `config.yml` default; a good baseline
- Always set an explicit `model` field in `sessions_spawn`; default to `deepseek/deepseek-chat` or `qwen/qwen3-coder-next`
- `/model` switch down the moment you realize the current task is janitorial
- `/model` switch up the moment you're stuck; don't waste tokens retrying on a weak model
- Use `openrouter/free` for fire-and-forget background checks
Extending This Skill
Optimize your switchboard over time:
- Track your actual spend: review your OpenRouter dashboard weekly to see which models are burning tokens
- Add your own routing signals: if your workflow has domain terms (e.g., "settlement", "pricing", "vault"), map them to tiers
- Tune the 80/15/5 split: if you find yourself escalating more than 5% of tasks, your classification may be too aggressive
- Pin model versions: when a cheap model works well for you, pin the version (e.g., `deepseek/deepseek-chat-v3.1`) so provider updates don't break your flow
- Set OpenRouter budget alerts: catch runaway premium usage before it compounds