Peer Review: A Local LLM Critique Layer for AI Agents - Openclaw Skills

Author: Internet

2026-03-29

AI Tutorials

What is Peer Review?

The Peer Review skill is a key quality-assurance layer for AI-driven workflows. Using local LLM inference via Ollama, it implements a fan-out architecture in which multiple models (such as Mistral and Llama 3.1) independently critique the output of a cloud model such as Claude. This lets developers using Openclaw Skills catch hallucinations, logical inconsistencies, and overconfident claims before they reach production.

The skill is designed for demanding scenarios where accuracy is critical. Rather than offering a single second opinion, it orchestrates a swarm of local reviewers and aggregates their findings through consensus logic. By providing a transparent, local, and cost-effective way to validate complex technical analysis and creative content, it significantly improves the reliability of agent output.

Download: https://github.com/openclaw/skills/tree/main/skills/staybased/peer-review

Installation & Download

1. ClawHub CLI

The fastest way to install the skill directly from source.

npx clawhub@latest install peer-review

2. Manual installation

Copy the skill folder to one of the following locations:

  • Global mode: ~/.openclaw/skills/
  • Workspace: /skills/

Resolution priority: workspace > local > built-in

3. Prompt-based installation

Copy this prompt into OpenClaw to install the skill automatically.

Please install peer-review for me using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).

Peer Review use cases

  • Validating high-stakes financial trade analyses or technical reports generated by cloud agents.
  • Reviewing agent output quality in automated content pipelines before publication.
  • Benchmarking local models' accuracy and reasoning against cloud models.
  • Detecting fabricated citations or nonexistent sources in long-form generated text.
  • Providing a safety net for autonomous agents performing multi-step logical reasoning.

How Peer Review works

  1. A primary cloud model (such as Claude) produces the initial analysis or response.
  2. The skill triggers a fan-out, sending the input text to three different local models: Drift (Mistral 7B), Pip (TinyLlama), and Lume (Llama 3.1).
  3. Each model plays a skeptical reviewer, analyzing the text for factual, logical, and structural errors according to a specialized prompt.
  4. Critiques are returned as structured JSON identifying specific quotes, issue categories, and confidence levels.
  5. An aggregation script synthesizes the results, deduplicating flags and weighting them by model confidence to determine consensus.
  6. A final report is produced with a recommendation to publish, revise, or flag for human review.
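Concretely, the fan-out in step 2 is a plain HTTP request per model to Ollama's local REST API (default address http://localhost:11434). Below is a minimal sketch of one such call with an illustrative input text; the actual skill script may build its prompt and payload differently:

```shell
#!/usr/bin/env bash
# Sketch of a single fan-out call: build an Ollama /api/generate request with jq.
set -euo pipefail

MODEL="mistral:7b"
TEXT="Bitcoin launched in 2010 and will definitely hit 1M."

# Wrap the text in the skeptical-reviewer prompt and request a non-streamed,
# JSON-constrained response so the aggregator gets structured output.
PAYLOAD=$(jq -n --arg model "$MODEL" --arg text "$TEXT" '{
  model: $model,
  prompt: ("You are a skeptical reviewer. Analyze the following text for errors.\nTEXT:\n---\n" + $text + "\n---"),
  stream: false,
  format: "json"
}')
echo "$PAYLOAD"

# With Ollama running locally, the critique would be fetched like this:
# curl -s http://localhost:11434/api/generate -d "$PAYLOAD" | jq -r '.response'
```

Requesting format: "json" nudges the model toward emitting valid JSON, which keeps the downstream aggregation step simple.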

Peer Review configuration guide

To integrate the skill into your Openclaw Skills library, make sure Ollama is installed and the required models have been pulled locally.

# Pull the required local models via Ollama
ollama pull mistral:7b
ollama pull tinyllama:1.1b
ollama pull llama3.1:8b

# Make sure the dependencies are available
sudo apt-get install jq curl

# Run peer review on a single document
bash scripts/peer-review.sh <input_file> [output_dir]

Peer Review data schema & taxonomy

The Peer Review skill generates structured metadata for every critique to support automated decision-making.

Attribute Type Description
category string Classification (factual, logical, missing, overconfidence, hallucinated_source)
quote string The specific excerpt of the flagged source text
issue string A detailed explanation of the identified error or concern
confidence integer The model's certainty score, from 0 to 100

All review logs and performance metrics are stored in the experiments/peer-review-results/ directory for long-term tracking of the true positive rate (TPR).
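For illustration, here is one critique conforming to this schema, filtered with jq to keep only high-confidence flags; the 80-point cutoff is an arbitrary example, not a threshold defined by the skill:

```shell
# A sample critique record in the schema above, filtered with jq.
# The confidence cutoff (80) is illustrative only.
CRITIQUE='{"issues":[
  {"category":"factual","quote":"Bitcoin launched in 2010","issue":"Bitcoin launched in 2009","confidence":95},
  {"category":"overconfidence","quote":"will definitely","issue":"Certainty without justification","confidence":60}
]}'

# Keep only flags with confidence >= 80.
echo "$CRITIQUE" | jq '[.issues[] | select(.confidence >= 80)]'
```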

name: peer-review
description: |
  Multi-model peer review layer using local LLMs via Ollama to catch errors in cloud model output.
  Fan-out critiques to 2-3 local models, aggregate flags, synthesize consensus.

  Use when: validating trade analyses, reviewing agent output quality, testing local model accuracy,
  checking any high-stakes Claude output before publishing or acting on it.

  Don't use when: simple fact-checking (just search the web), tasks that don't benefit from
  multi-model consensus, time-critical decisions where 60s latency is unacceptable,
  reviewing trivial or low-stakes content.

  Negative examples:
  - "Check if this date is correct" → No. Just web search it.
  - "Review my grocery list" → No. Not worth multi-model inference.
  - "I need this answer in 5 seconds" → No. Peer review adds 30-60s latency.

  Edge cases:
  - Short text (<50 words) → Models may not find meaningful issues. Consider skipping.
  - Highly technical domain → Local models may lack domain knowledge. Weight flags lower.
  - Creative writing → Factual review doesn't apply well. Use only for logical consistency.
version: "1.0"

Peer Review — Local LLM Critique Layer

Hypothesis: Local LLMs can catch ≥30% of real errors in cloud output with <50% false positive rate.


Architecture

Cloud Model (Claude) produces analysis
        │
        ▼
┌────────────────────────┐
│   Peer Review Fan-Out  │
├────────────────────────┤
│  Drift (Mistral 7B)    │──▶ Critique A
│  Pip (TinyLlama 1.1B)  │──▶ Critique B
│  Lume (Llama 3.1 8B)   │──▶ Critique C
└────────────────────────┘
        │
        ▼
  Aggregator (consensus logic)
        │
        ▼
  Final: original + flagged issues

Swarm Bot Roles

Bot Model Role Strengths
Drift Mistral 7B Methodical analyst Structured reasoning, catches logical gaps
Pip TinyLlama 1.1B Fast checker Quick sanity checks, low latency
Lume Llama 3.1 8B Deep thinker Nuanced analysis, catches subtle issues

Scripts

Script Purpose
scripts/peer-review.sh Send single input to all models, collect critiques
scripts/peer-review-batch.sh Run peer review across a corpus of samples
scripts/seed-test-corpus.sh Generate seeded error corpus for testing

Usage

# Single file review
bash scripts/peer-review.sh <input_file> [output_dir]

# Batch review
bash scripts/peer-review-batch.sh <corpus_dir> [results_dir]

# Generate test corpus
bash scripts/seed-test-corpus.sh [count] [output_dir]

Scripts live at workspace/scripts/ — not bundled in skill to avoid duplication.
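For testing, a seeded corpus only needs sample texts paired with ground-truth manifests. A hypothetical sketch of the kind of artifact seed-test-corpus.sh could emit — the file names and manifest fields here are illustrative, not the script's actual format:

```shell
# Illustrative sketch of a seeded test corpus: each sample pairs text containing
# planted errors with a manifest recording the ground truth for TPR scoring.
OUT_DIR="${1:-/tmp/peer-review-corpus}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/sample-001.txt" <<'EOF'
Bitcoin launched in 2010 and, according to a 2024 Reuters report, will definitely replace gold.
EOF

# Ground-truth manifest: which errors were planted, and in which category.
jq -n '{sample: "sample-001.txt", seeded_errors: [
  {category: "factual", quote: "Bitcoin launched in 2010"},
  {category: "hallucinated_source", quote: "according to a 2024 Reuters report"},
  {category: "overconfidence", quote: "will definitely replace gold"}
]}' > "$OUT_DIR/sample-001.manifest.json"

echo "Seeded $(jq '.seeded_errors | length' "$OUT_DIR/sample-001.manifest.json") errors into $OUT_DIR/sample-001.txt"
```

Scoring a run then reduces to diffing each model's flags against the manifest.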


Critique Prompt Template

You are a skeptical reviewer. Analyze the following text for errors.

For each issue found, output JSON:
{"category": "factual|logical|missing|overconfidence|hallucinated_source",
 "quote": "...", "issue": "...", "confidence": 0-100}

If no issues found, output: {"issues": []}

TEXT:
---
{cloud_output}
---
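One way to fill the {cloud_output} placeholder from a file without shell-quoting pitfalls is to let jq perform the substitution; this is a sketch, not necessarily how peer-review.sh builds its prompt:

```shell
# Sketch: substitute the cloud model's output into the critique prompt template.
# jq's --rawfile avoids escaping problems with arbitrary text.
TEMPLATE='You are a skeptical reviewer. Analyze the following text for errors.

TEXT:
---
{cloud_output}
---'

INPUT_FILE="${1:-}"
if [ -z "$INPUT_FILE" ]; then
  # Demo fallback when no input file is given.
  INPUT_FILE=$(mktemp)
  echo "Bitcoin launched in 2010." > "$INPUT_FILE"
fi

PROMPT=$(jq -rn --arg tpl "$TEMPLATE" --rawfile text "$INPUT_FILE" \
  '$tpl | sub("\\{cloud_output\\}"; $text)')
echo "$PROMPT"
```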

Error Categories

Category Description Example
factual Wrong numbers, dates, names "Bitcoin launched in 2010"
logical Non-sequiturs, unsupported conclusions "X is rising, therefore Y will fall"
missing Important context omitted Ignoring a major counterargument
overconfidence Certainty without justification "This will definitely happen" on 55% event
hallucinated_source Citing nonexistent sources "According to a 2024 Reuters report..."

Discord Workflow

  1. Post analysis to #the-deep (or #swarm-lab)
  2. Drift, Pip, and Lume respond with independent critiques
  3. Celeste synthesizes: deduplicates flags, weights by model confidence
  4. If consensus (≥2 models agree) → flag is high-confidence
  5. Final output posted with recommendation: publish | revise | flag_for_human
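Steps 3–4 can be approximated with a single jq reduction over the three critique files: flags quoting the same excerpt are grouped, and any group backed by two or more models is promoted to high confidence. A sketch, assuming each bot's critique is saved as a separate JSON file in the schema above (the file names are illustrative):

```shell
# Sketch of consensus aggregation: flags quoting the same excerpt from >= 2
# models become high-confidence. File names are illustrative.
DIR=$(mktemp -d)
echo '{"issues":[{"category":"factual","quote":"launched in 2010","issue":"wrong year","confidence":90}]}' > "$DIR/drift.json"
echo '{"issues":[{"category":"factual","quote":"launched in 2010","issue":"should be 2009","confidence":70}]}' > "$DIR/lume.json"
echo '{"issues":[]}' > "$DIR/pip.json"

CONSENSUS=$(jq -s '
  # Pool all flags, group by quoted excerpt, and count agreeing models.
  [.[].issues[]] | group_by(.quote) | map({
    quote: .[0].quote,
    categories: (map(.category) | unique),
    models_agreeing: length,
    mean_confidence: (map(.confidence) | add / length),
    high_confidence: (length >= 2)
  })' "$DIR"/*.json)
echo "$CONSENSUS"
```

Weighting by model (e.g. trusting Lume's flags more than Pip's) would replace the plain count with a per-model weight table, but the grouping step stays the same.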

Success Criteria

Outcome TPR FPR Decision
Strong pass ≥50% <30% Ship as default layer
Pass ≥30% <50% Ship as opt-in layer
Marginal 20–30% 50–70% Iterate on prompts, retest
Fail <20% >70% Abandon approach

Scoring Rules

  • Flag = true positive if it identifies a real error (even if explanation is imperfect)
  • Flag = false positive if flagged content is actually correct
  • Duplicate flags across models count once for TPR but inform consensus metrics
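Under one plausible reading of the thresholds above (TPR = distinct real errors caught / errors seeded, FPR = false flags / total flags), scoring reduces to simple arithmetic; the counts below are made up for illustration:

```shell
# Worked example of the scoring arithmetic (all counts are made up).
SEEDED=10   # errors planted in the test corpus
TP=4        # flags that identify a real seeded error (duplicates counted once)
FP=3        # flags on content that is actually correct

TPR=$(( 100 * TP / SEEDED ))      # 40% -> lands in the "Pass" band (>= 30%)
FPR=$(( 100 * FP / (TP + FP) ))   # 42% -> under the 50% "Pass" ceiling
echo "TPR=${TPR}% FPR=${FPR}%"
```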

Dependencies

  • Ollama running locally with models pulled: mistral:7b, tinyllama:1.1b, llama3.1:8b
  • jq and curl installed
  • Results stored in experiments/peer-review-results/

Integration

When peer review passes validation:

  • Package as Reef API endpoint: POST /review
  • Agents call before publishing any analysis
  • Configurable: model selection, consensus threshold, categories
  • Log all reviews to #reef-logs with TPR tracking
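The endpoint does not exist yet, so a request to it can only be sketched; the host, path, and every field name below are assumptions about a future API, not an implemented contract:

```shell
# Hypothetical request shape for the future Reef POST /review endpoint,
# exercising the configurable knobs listed above. All names are illustrative.
REVIEW_REQ=$(jq -n '{
  text: "Bitcoin launched in 2010.",
  models: ["mistral:7b", "llama3.1:8b"],
  consensus_threshold: 2,
  categories: ["factual", "hallucinated_source"]
}')
echo "$REVIEW_REQ"

# Once the endpoint exists, the call might look like:
# curl -s -X POST http://localhost:8080/review \
#   -H "Content-Type: application/json" -d "$REVIEW_REQ"
```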