Multi-LLM: Intelligent Local Model Switching - Openclaw Skills

Author: Internet

2026-04-16

AI Tutorials

What Is Multi-LLM Intelligent Switching?

Multi-LLM is a model-management utility designed to optimize how developers interact with large language models. The system defaults to Claude Opus 4.5 for high-complexity tasks, but it provides a dedicated trigger that activates local model selection. This lets Openclaw Skills users leverage local hardware for specific domains such as coding, math, and translation, reducing reliance on external APIs and improving privacy.

By integrating this skill, you get a dynamic execution environment that understands the context of each request. Whether you are doing a large-scale refactor or a simple text summary, Multi-LLM ensures the right model is used for the right job, backed by a fallback mechanism that stays reliable even when a specific local model is missing.

Download: https://github.com/openclaw/skills/tree/main/skills/leohan123123/mlti-llm-fallback

Installation & Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install mlti-llm-fallback

2. Manual Installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt-Based Installation

Copy this prompt into OpenClaw to install automatically.

Please install mlti-llm-fallback for me using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).

Multi-LLM Intelligent Switching: Use Cases

  • Cut API token consumption by offloading standard coding tasks to local models such as Qwen2.5-Coder.
  • Run privacy-sensitive reasoning and logical analysis locally with DeepSeek-R1.
  • Speed up translation and summarization workflows with lightweight local Chinese models.
  • Keep workflows running through an automated fallback chain when the primary model is unavailable in the Openclaw Skills environment.

Multi-LLM Intelligent Switching: How It Works

  1. The system watches user input for the trigger command multi llm.
  2. Without the trigger, the request is routed to the default high-performance model, Claude Opus 4.5.
  3. When the trigger is detected, the detection logic scans the prompt for keywords related to code, reasoning, or language.
  4. The skill maps the detected task type to a prioritized list of local Ollama models.
  5. If the preferred local model is not found (e.g., DeepSeek-R1 for a reasoning task), the system walks the fallback chain to find the next best available local alternative.
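
The trigger check at the top of this flow can be sketched in shell. This is an illustrative helper (`route_request` is a hypothetical name, not the skill's actual script): route to the default cloud model unless the message contains the "multi llm" trigger.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the routing decision: without the "multi llm"
# trigger, everything goes to the default model; with it, control is
# handed to local model selection.
route_request() {
  local input="$1"
  if [[ "$input" == *"multi llm"* ]]; then
    echo "local-selection"   # hand off to task detection + Ollama models
  else
    echo "claude-opus-4.5"   # default high-performance model
  fi
}

route_request "Help me write a Python function"            # -> claude-opus-4.5
route_request "multi llm Help me write a Python function"  # -> local-selection
```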

Multi-LLM Intelligent Switching: Configuration Guide

To start using this skill in your Openclaw Skills setup, make sure Ollama is installed and that the required models have been pulled:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama service
ollama serve

# Pull the recommended model set
ollama pull qwen2.5-coder:32b
ollama pull deepseek-r1:70b
ollama pull glm4:9b
ollama pull qwen3:32b

Verify your local environment with ollama list to make sure every model is ready for the switching logic.

Multi-LLM Intelligent Switching: Data Architecture & Taxonomy

The skill organizes model selection by task category and a predefined model tier:

| Task Category | Preferred Model | Size | Fallback Priority |
|---|---|---|---|
| Code | qwen2.5-coder:32b | 19GB | qwen2.5-coder:14b -> qwen3:32b |
| Reasoning | deepseek-r1:70b | 42GB | deepseek-r1:32b -> qwen3:32b |
| Chinese | glm4:9b | 5.5GB | qwen3:8b -> qwen3:32b |
| General | qwen3:32b | 20GB | qwen3:14b -> qwen3:8b |

The internal logic is handled by select-model.sh and fallback-demo.sh in the scripts directory.

name: multi-llm
description: Multi-LLM intelligent switching. Use command 'multi llm' to activate local model selection based on task type. Default uses Claude Opus 4.5.
trigger: multi llm
version: 1.1.0
author: leohan123123
tags: llm, ollama, local-model, fallback, multi-model

Multi-LLM - Intelligent Model Switching

Trigger Command: multi llm

Default Behavior: Always use Claude Opus 4.5 (the strongest model). Local model selection is activated only when the message contains the multi llm command.

What's New in v1.1.0

  • Renamed trigger from mlti llm to multi llm (clearer naming)
  • Enhanced model existence checking with fallback chain
  • Added detailed usage examples and troubleshooting
  • Improved task detection patterns

Usage

Default Mode (without command)

Help me write a Python function -> Uses Claude Opus 4.5
Analyze this code -> Uses Claude Opus 4.5

Multi-Model Mode (with command)

multi llm Help me write a Python function -> Selects qwen2.5-coder:32b
multi llm Analyze this math proof -> Selects deepseek-r1:70b
multi llm Translate to Chinese -> Selects glm4:9b

Command Format

| Command | Description |
|---|---|
| multi llm | Activate intelligent model selection |
| multi llm coding | Force coding model |
| multi llm reasoning | Force reasoning model |
| multi llm chinese | Force Chinese model |
| multi llm general | Force general model |
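
Parsing these force commands can be sketched as follows. This is an assumed behavior (the real parsing lives in select-model.sh and may differ): the first word after "multi llm" is treated as a force category if it matches a known one, otherwise selection falls back to keyword detection.

```shell
#!/usr/bin/env bash
# Hypothetical force-command parser: peel off the text after the trigger
# and check whether its first word names a category.
parse_force() {
  local rest="${1#*multi llm }"   # text after the trigger
  local first="${rest%% *}"       # first word after the trigger
  case "$first" in
    coding|reasoning|chinese|general) echo "$first" ;;
    *) echo "auto" ;;             # no force word: use keyword detection
  esac
}

parse_force "multi llm coding fix this bug"  # -> coding
parse_force "multi llm translate this text"  # -> auto
```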

Model Mapping

Primary Model (Default): github-copilot/claude-opus-4.5

Local Models (when multi llm triggered):

| Task Type | Model | Size | Best For |
|---|---|---|---|
| Coding | qwen2.5-coder:32b | 19GB | Code generation, debugging, refactoring |
| Reasoning | deepseek-r1:70b | 42GB | Math, logic, complex analysis |
| Chinese | glm4:9b | 5.5GB | Translation, summaries, quick tasks |
| General | qwen3:32b | 20GB | General purpose, fallback |

Fallback Chain

If the selected model is unavailable, the system tries alternatives:

Coding:    qwen2.5-coder:32b -> qwen2.5-coder:14b -> qwen3:32b
Reasoning: deepseek-r1:70b -> deepseek-r1:32b -> qwen3:32b
Chinese:   glm4:9b -> qwen3:8b -> qwen3:32b
General:   qwen3:32b -> qwen3:14b -> qwen3:8b
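
Walking one of these chains can be sketched in shell. This is a minimal illustration, not the shipped fallback-demo.sh: it returns the first model of the chain present in the local inventory. Here the inventory is passed in explicitly; the real skill would read it from `ollama list`.

```shell
#!/usr/bin/env bash
# Hypothetical fallback walker: $1 is the space-separated list of locally
# available models, the remaining arguments are the chain in priority order.
pick_model() {
  local available="$1"; shift
  local model
  for model in "$@"; do
    if [[ " $available " == *" $model "* ]]; then
      echo "$model"
      return
    fi
  done
  echo "none"   # nothing in the chain is available locally
}

# Coding chain when only the 14b variant and qwen3:32b are pulled locally:
pick_model "qwen2.5-coder:14b qwen3:32b" \
  qwen2.5-coder:32b qwen2.5-coder:14b qwen3:32b
# -> qwen2.5-coder:14b
```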

Detection Logic

User Input
    |
    v
Contains "multi llm"?
    |
    +-- No -> Use Claude Opus 4.5 (default)
    |
    +-- Yes -> Task Type Detection
                |
        +-------+-------+-------+
        v       v       v       v
      Coding  Reasoning Chinese General
        |       |       |       |
        v       v       v       v
    qwen2.5  deepseek  glm4   qwen3
    coder    r1:70b    :9b    :32b

Task Detection Keywords

| Category | Keywords (EN) | Keywords (CN) |
|---|---|---|
| Coding | code, debug, function, script, api, bug, refactor, python, java, javascript | 代码, 编程, 函数, 调试, 重构 |
| Reasoning | analysis, proof, logic, math, solve, algorithm, evaluate | 推理, 分析, 证明, 逻辑, 数学, 计算, 算法 |
| Chinese | translate, summary | 翻译, 总结, 摘要, 简单, 快速 |
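
A keyword matcher in this spirit can be sketched with a shell `case` statement. This is a hypothetical reduction using a subset of the keywords above (the real patterns live in select-model.sh and may differ); categories are checked in priority order: coding, then reasoning, then Chinese.

```shell
#!/usr/bin/env bash
# Illustrative task detector: lowercase the prompt, then glob-match it
# against a subset of the keyword lists from the table above.
detect_task() {
  local prompt
  prompt=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$prompt" in
    *code*|*debug*|*function*|*script*|*python*|*refactor*|*代码*|*编程*|*调试*)
      echo "coding" ;;
    *proof*|*prove*|*logic*|*math*|*algorithm*|*推理*|*证明*|*数学*)
      echo "reasoning" ;;
    *translate*|*summary*|*翻译*|*总结*|*摘要*)
      echo "chinese" ;;
    *)
      echo "general" ;;
  esac
}

detect_task "Write a Python function"   # -> coding
detect_task "Analyze this math proof"   # -> reasoning
detect_task "把这段话翻译成英文"          # -> chinese
```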

Examples

Example 1: Coding Task

# Input
multi llm Write a Python function to calculate fibonacci

# Output
Selected: qwen2.5-coder:32b
Reason: Detected coding task (keywords: python, function)

Example 2: Math Analysis

# Input
multi llm reasoning Prove that sqrt(2) is irrational

# Output
Selected: deepseek-r1:70b
Reason: Force command 'reasoning' used

Example 3: Quick Translation

# Input
multi llm 把这段话翻译成英文

# Output
Selected: glm4:9b
Reason: Detected Chinese lightweight task (keywords: 翻译)

Example 4: Default (No trigger)

# Input
Write a REST API with authentication

# Output
Selected: claude-opus-4.5
Reason: Default model (no 'multi llm' trigger)

Prerequisites

  1. Ollama must be installed and running:
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama service
ollama serve

# Pull required models
ollama pull qwen2.5-coder:32b
ollama pull deepseek-r1:70b
ollama pull glm4:9b
ollama pull qwen3:32b
  2. Check available models:
ollama list

Troubleshooting

Model not found

# Check if model exists
ollama list | grep "qwen2.5-coder"

# Pull missing model
ollama pull qwen2.5-coder:32b

Ollama not running

# Check service status
curl -s http://localhost:11434/api/tags

# Start Ollama
ollama serve &

Slow response

  • Large models (70b) require significant RAM/VRAM
  • Consider using smaller variants: deepseek-r1:32b instead of 70b

Wrong model selected

  • Use force commands: multi llm coding, multi llm reasoning
  • Check if keywords match your task type

Files in This Skill

multi-llm/
├── SKILL.md              # This documentation
└── scripts/
    ├── select-model.sh   # Model selection logic
    └── fallback-demo.sh  # Interactive demo script

Integration

With OpenCode/ClaudeCode

The trigger multi llm is detected in your message. Simply prefix your request:

multi llm [your request here]

Programmatic Usage

# Get recommended model for a task
./scripts/select-model.sh "multi llm write a sorting algorithm"
# Output: qwen2.5-coder:32b

# Demo with actual model call
./scripts/fallback-demo.sh --force-local "explain recursion"

Author

  • GitHub: @leohan123123

License

MIT
