Token Saver: A Model-Aware Context Optimization Tool for OpenClaw Skills

Author: Internet

2026-03-20

AI Tutorials

What is Token Saver?

Token Saver is a utility designed to maximize the efficiency of AI agent interactions by minimizing token overhead. Every time an agent communicates, it transmits various workspace files that occupy the context window; this skill manages that data intelligently to prevent context bloat and rising API costs. It stands out among OpenClaw Skills because it is model-aware: it adapts its optimization strategy to the specific context window of the model in use, whether Claude, GPT-4o, or Gemini.

The skill provides a multi-tier efficiency scheme, ranging from file-specific compression algorithms to dynamic chat compaction. By distinguishing personality-driven files (such as SOUL.md) from data-heavy files (such as MEMORY.md), it applies the appropriate density level to each, trimming unnecessary token usage while keeping core reasoning ability intact. For developers who want to scale AI workflows without a matching rise in spend, it is an essential tool.

Download: https://github.com/openclaw/skills/tree/main/skills/rubenaquispe/token-saver

Installation & Download

1. ClawHub CLI

The fastest way to install the skill directly from source.

npx clawhub@latest install token-saver

2. Manual installation

Copy the skill folder to one of the following locations:

  • Global: ~/.openclaw/skills/
  • Workspace: /skills/

Priority: workspace > local > built-in
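The lookup order above can be sketched as a simple resolver. This is an illustrative sketch, not OpenClaw's actual loader: the function name `resolve_skill` and the `builtins` parameter are assumptions; only the two install paths and the workspace-first priority come from the docs.

```python
from pathlib import Path

def resolve_skill(name, workspace_root, home=Path.home(), builtins=()):
    """Return the install path for a skill, workspace copy winning over global."""
    candidates = [
        Path(workspace_root) / "skills" / name,      # workspace (highest priority)
        Path(home) / ".openclaw" / "skills" / name,  # global install
    ]
    for path in candidates:
        if path.is_dir():
            return path
    if name in builtins:                             # built-in fallback
        return Path("<built-in>") / name
    raise FileNotFoundError(f"skill not found: {name}")
```

A workspace copy of token-saver therefore shadows a global one with the same name, which is what lets you override a shared skill per project.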

3. Prompt-based installation

Copy this prompt into OpenClaw to install automatically:

Please install token-saver for me using ClawHub. If ClawHub is not installed, install it first (npm i -g clawhub).

Token Saver Use Cases

  • Scale AI agent workflows while keeping API spend within a fixed budget.
  • Prevent context window overflow when working with large project files or long chat histories.
  • Normalize memory and user data density across different AI models.
  • Audit context usage with a live dashboard and identify high-token-consumption files.
How Token Saver Works

  1. The skill identifies the active AI model through a robust fallback chain, checking runtime flags, environment variables, and local config files.
  2. It runs a token audit of the current workspace, measuring the impact of files such as SOUL.md, USER.md, and AGENTS.md.
  3. Based on the model's specific context window (e.g. 200K for Claude, 1M for Gemini), it suggests or applies a dynamic compaction preset.
  4. Files are optimized with targeted compression; MEMORY.md, for example, is converted to a dense key:value format while PROJECT.md's structure is preserved.
  5. The tool maintains a model registry for accurate cost and usage-percentage calculation, giving users a comprehensive optimization dashboard.
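Steps 2 and 3 above amount to a token audit plus a threshold derived from the model's context window. A minimal sketch, assuming a rough chars/4 token estimate (the skill's real tokenizer is not documented here) and the 60% "balanced" figure from the preset table later in this document:

```python
def audit_workspace(files, context_window, preset_pct=0.60):
    """Estimate per-file tokens, total usage, and a compaction threshold."""
    tokens = {name: len(text) // 4 for name, text in files.items()}  # crude heuristic
    total = sum(tokens.values())
    return {
        "per_file": tokens,
        "total": total,
        "usage_pct": round(100 * total / context_window, 1),
        "compact_at": int(context_window * preset_pct),
    }

report = audit_workspace(
    {"SOUL.md": "x" * 8000, "MEMORY.md": "y" * 40000},
    context_window=200_000,  # e.g. Claude's 200K window
)
```

For this toy workspace the report shows 12K estimated tokens (6% of a 200K window) and a balanced compaction threshold of 120K.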

Token Saver Configuration Guide

To add the tool to your environment, use the standard OpenClaw Skills install command:

clawhub install token-saver --registry "https://www.clawhub.ai"

Token Saver Data Architecture & Taxonomy

Token Saver organizes its optimization data and safety backups as follows:

| Component | Description |
| --- | --- |
| scripts/models.json | Comprehensive registry of context limits and pricing for 24+ AI models. |
| *.backup | Automatic file snapshots created before any compression or compaction. |
| openclaw.json | Local config file for model detection and persistent efficiency settings. |
| MEMORY.md | Data-heavy file optimized into an ultra-dense key:value format. |
The skill's manifest metadata:

name: token-saver
version: 3.0.0
description: "Reduce OpenClaw AI costs with model-aware optimization. Features dynamic compaction presets based on your model's context window, intelligent file compression, and robust model detection with fallback. Supports Claude, GPT-4, Gemini, DeepSeek, and more."

Token Saver v3

Did you know? Every API call sends your workspace files (SOUL.md, USER.md, MEMORY.md, AGENTS.md, etc.) along with your message. These files count toward your context window, slowing responses and costing real money on every message.

Token Saver v3 is model-aware — it knows your model's context window and adapts recommendations accordingly. Using Gemini's 1M context? Presets scale up. On GPT-4o's 128K? Presets adjust down.

What's New in v3

| Feature | v2 | v3 |
| --- | --- | --- |
| Compaction presets | Fixed (80K/120K/160K) | Dynamic (% of model's context) |
| Model detection | Fragile, env-only | Robust fallback chain |
| Context windows | Not tracked | Full registry (9 models) |
| Model info | Hardcoded pricing | JSON registry, easy updates |
| Already-optimized files | Re-compressed | Smart bypass |

Commands

| Command | What it does |
| --- | --- |
| /optimize | Full dashboard — files, models, context usage % |
| /optimize tokens | Compress workspace files (auto-backup) |
| /optimize compaction | Chat compaction control (model-aware) |
| /optimize compaction balanced | Apply balanced preset (60% of context) |
| /optimize compaction 120 | Custom threshold (compact at 120K) |
| /optimize models | Detailed model audit with registry |
| /optimize revert | Restore backups, disable persistent mode |

Features

Model-Aware Dashboard

Shows current model, context window, and usage percentage:

Model: Claude Opus 4.5 (200K context)
   Detected: openclaw.json

Context Usage: [████████░░░░░░░░░░░░] 42% (84K/200K)
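The usage bar in the dashboard above is straightforward to reproduce. A minimal sketch, assuming a 20-cell bar where filled cells represent consumed context (the function name and cell glyphs are illustrative, not the skill's code):

```python
def usage_bar(used, window, width=20):
    """Render a context-usage bar like the dashboard shown above."""
    pct = used / window
    filled = round(width * pct)  # filled cells out of `width`
    return f"[{'█' * filled}{'░' * (width - filled)}] {pct:.0%} ({used // 1000}K/{window // 1000}K)"

print(usage_bar(84_000, 200_000))
# → [████████░░░░░░░░░░░░] 42% (84K/200K)
```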

Workspace File Compression

Scans all .md files, shows token count and potential savings. Smart bypass skips already-optimized files.

File-aware compression:

  • SOUL.md — Light compression, keeps personality language
  • AGENTS.md — Medium compression, dense instructions
  • USER.md / MEMORY.md — Heavy compression, key:value format
  • PROJECTS.md — No compression (user structure preserved)
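The file-aware policy above is essentially a filename-to-level mapping. A sketch under assumptions: the level names mirror the bullets, but the `plan_compression` helper and the "medium" default for unlisted files are illustrative choices, not the skill's documented behavior.

```python
# Compression level per workspace file, per the list above.
COMPRESSION_POLICY = {
    "SOUL.md": "light",      # keeps personality language
    "AGENTS.md": "medium",   # dense instructions
    "USER.md": "heavy",      # key:value facts
    "MEMORY.md": "heavy",    # key:value data
    "PROJECTS.md": "none",   # user structure preserved
}

def plan_compression(filename):
    """Look up the compression level; assume 'medium' for unknown files."""
    return COMPRESSION_POLICY.get(filename, "medium")
```

Keeping the policy in one table makes the per-file behavior easy to audit and extend.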

Dynamic Compaction Presets

Presets adapt to your model's context window:

| Preset | % of Context | Claude 200K | GPT-4o 128K | Gemini 1M |
| --- | --- | --- | --- | --- |
| Aggressive | 40% | 80K | 51K | 400K |
| Balanced | 60% | 120K | 77K | 600K |
| Conservative | 80% | 160K | 102K | 800K |
| Off | 95% | 190K | 122K | 950K |
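The whole preset table reduces to one formula: threshold = context window × preset percentage. A minimal sketch (function and dict names are illustrative; the percentages are from the table):

```python
PRESETS = {"aggressive": 0.40, "balanced": 0.60, "conservative": 0.80, "off": 0.95}

def compaction_threshold(context_window, preset):
    """Compaction trigger point as a fraction of the model's context window."""
    return int(context_window * PRESETS[preset])

compaction_threshold(200_000, "balanced")      # → 120000, Claude's 120K cell
compaction_threshold(128_000, "aggressive")    # → 51200, GPT-4o's "51K" cell
```

This is why v3's presets scale with the model rather than using v2's fixed 80K/120K/160K thresholds.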

Model Registry

24+ models with context windows, pricing, and aliases:

  • Claude: Opus 4.6 (1M), Opus 4.5, Sonnet 4.5, Sonnet 4, Haiku 4.5, Haiku 3.5 (200K)
  • OpenAI: GPT-5.2, GPT-5.1, GPT-5-mini, GPT-5-nano (256K), GPT-4.1, GPT-4o (128K), o1, o3, o4-mini
  • Gemini: 3 Pro (2M), 2.5 Pro, 2.0 Flash (1M)
  • Others: DeepSeek V3 (64K), Kimi K2.5 (128K), Llama 3.3 70B, Mistral Large
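A registry entry in scripts/models.json might look like the following. This is a hypothetical sketch of the schema: the field names (`context_window`, `pricing`, `aliases`) and the price values are illustrative placeholders, not the file's actual format.

```json
{
  "claude-opus-4.5": {
    "context_window": 200000,
    "pricing": { "input_per_mtok": 5.0, "output_per_mtok": 25.0 },
    "aliases": ["opus-4.5", "opus"]
  }
}
```

Keeping this data in JSON rather than hardcoded is what makes new models "easy to add", as noted below.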

Robust Model Detection

Detection priority:

  1. Runtime injection (--model=...)
  2. Environment variables (SKILL_MODEL, OPENCLAW_MODEL)
  3. Config file (~/.openclaw/openclaw.json)
  4. File inference (TOOLS.md, MEMORY.md mentions)
  5. Fallback: Claude Sonnet 4 (safe default)
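The five-step chain above can be sketched as a sequence of early returns. The flag name, environment variables, and config path come from the list; the function shape and config key `"model"` are assumptions, and step 4 (file inference) is omitted for brevity:

```python
import json
from pathlib import Path

def detect_model(argv, env, config_path=Path.home() / ".openclaw" / "openclaw.json"):
    """Walk the detection chain, returning the first source that answers."""
    for arg in argv:                                   # 1. runtime injection
        if arg.startswith("--model="):
            return arg.split("=", 1)[1]
    for var in ("SKILL_MODEL", "OPENCLAW_MODEL"):      # 2. environment variables
        if env.get(var):
            return env[var]
    try:                                               # 3. config file
        model = json.loads(config_path.read_text()).get("model")
        if model:
            return model
    except (OSError, json.JSONDecodeError):
        pass
    # 4. file inference (scanning TOOLS.md / MEMORY.md mentions) omitted here
    return "claude-sonnet-4"                           # 5. safe default
```

Because each step only yields when it actually finds a value, a missing or corrupt config file degrades gracefully to the safe default instead of crashing.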

Unknown model handling:

  • Strict version matching — opus-6.5 won't fuzzy-match to opus-4.5
  • Unknown models get safe defaults (200K context) + warning
  • Easy to add new models to scripts/models.json
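Strict matching means a model string resolves only on an exact registry key, never a fuzzy neighbor. A minimal sketch, assuming a tiny inline registry and a `("unknown", 200_000)` safe-default tuple (both illustrative):

```python
REGISTRY = {"opus-4.5": 200_000, "sonnet-4.5": 200_000}
SAFE_DEFAULT = ("unknown", 200_000)  # safe default: 200K context + warning upstream

def lookup(model):
    """Exact-match the registry; unknown versions fall back, never fuzzy-match."""
    key = model.lower().strip()
    if key in REGISTRY:
        return key, REGISTRY[key]
    return SAFE_DEFAULT  # e.g. "opus-6.5" does NOT map to "opus-4.5"
```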

Persistent Mode

Adds writing guidance to AGENTS.md for continued token efficiency:

| File | Writing Style |
| --- | --- |
| SOUL.md | Evocative, personality-shaping |
| AGENTS.md | Dense instructions, symbols OK |
| USER.md | Key:value facts |
| MEMORY.md | Ultra-dense data |

Safety

  • Auto-backup — All modified files get .backup extension
  • Integrity > Size — Never sacrifices meaning for smaller tokens
  • Smart bypass — Skips already-optimized files
  • Revert anytime — /optimize revert restores everything
  • No external calls — All analysis runs locally

Installation

clawhub install token-saver --registry "https://www.clawhub.ai"

Version History

  • 3.0.0 — Model registry, dynamic presets, robust detection, smart bypass
  • 2.0.1 — Chat compaction, file-aware compression, persistent mode
  • 1.0.0 — Initial release