Calibre Metadata Apply: Automatically Update Your E-book Library - Openclaw Skills

Author: Internet

2026-03-29


What is Calibre Metadata Apply?

The Calibre Metadata Apply skill is a specialized utility designed for the Openclaw Skills ecosystem that manages and updates e-book metadata with high precision. It uses the calibredb CLI to interact with a Calibre Content server, letting users make controlled edits to titles, authors, series, and other fields. The tool is particularly effective for large libraries where metadata consistency and accuracy are essential.

By leveraging the framework provided by Openclaw Skills, the tool ensures that metadata updates are not only automated but also verified. It follows a multi-stage workflow that includes target confirmation, evidence gathering from file contents, and a mandatory dry-run phase. This approach minimizes the risk of database corruption and ensures that only user-approved changes are applied to the remote library.

Download: https://github.com/openclaw/skills/tree/main/skills/nextaltair/calibre-metadata-apply

Installation & Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install calibre-metadata-apply

2. Manual installation

Copy the skill folder to one of the following locations:

  • Global mode: ~/.openclaw/skills/
  • Workspace: /skills/

Priority: workspace > local > built-in
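The priority order (workspace > local > built-in) behaves like a first-match path lookup. A minimal sketch in Node, where `resolveSkill`, the injected `exists` predicate, and the built-in location are all illustrative assumptions rather than OpenClaw internals:

```javascript
// Return the first existing candidate path for a skill, honoring
// workspace > local (global) > built-in priority. The `exists` predicate
// is injected so the lookup logic can be tested without a real filesystem.
function resolveSkill(name, exists) {
  const home = process.env.HOME ?? '~';
  const candidates = [
    `skills/${name}`,                    // workspace
    `${home}/.openclaw/skills/${name}`,  // local / global install
    `/usr/lib/openclaw/skills/${name}`,  // built-in (hypothetical path)
  ];
  return candidates.find(exists) ?? null;
}
```

For example, a skill present both in the workspace and globally resolves to the workspace copy.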

3. Prompt-based installation

Copy this prompt into OpenClaw to install automatically.

Please install calibre-metadata-apply for me using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).

Calibre Metadata Apply Use Cases

  • Batch-update missing metadata fields in existing Calibre records, such as series information, tags, or publisher.
  • Synchronize e-book metadata with web-based evidence sources to make the library easier to search.
  • Automatically clean up author and title sort fields using language-specific reading scripts.
  • Extract metadata from the first or last page of a PDF document when standard metadata is missing.
  • Manage remote Calibre libraries through a secure, environment-variable-driven authentication process.

How Calibre Metadata Apply Works

  1. Target identification: the skill performs a read-only lookup to narrow the candidate books and requires user confirmation of the target IDs.
  2. Metadata synthesis: it gathers evidence from file extraction (PDF/EPUB) and web sources to propose metadata improvements.
  3. Proposal approval: the user reviews the proposal table and can approve, reject, or edit specific fields before any change is staged.
  4. Dry-run validation: a mandatory dry-run via calibredb validates the generated commands and previews the result.
  5. Remote apply: after explicit final approval, the skill applies the updates to the Calibre Content server and reports the final values.

Calibre Metadata Apply Configuration Guide

To use this skill in an Openclaw Skills environment, make sure calibredb and node are on your PATH.

  1. Install the dependent skill:

# make sure subagent-spawn-command-builder is active

  2. Configure your environment variables for secure access:

export CALIBRE_PASSWORD='your_secret_password'
export CALIBRE_USERNAME='your_username'

  3. Run a dry-run to test connectivity:

cat changes.jsonl | node skills/calibre-metadata-apply/scripts/calibredb_apply.mjs --with-library "http://HOST:PORT/#LIB_ID" --password-env CALIBRE_PASSWORD
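changes.jsonl is newline-delimited JSON, one change record per line. The exact record schema is defined by calibredb_apply.mjs and is not documented here, so the record below is a hypothetical illustration built from the field names listed under "Supported fields":

```json
{"id": 123, "title": "Example Title", "authors": "Jane Doe & John Roe", "series": "Example Series", "series_index": 2, "tags": ["fiction"], "pubdate": "2021-05-01"}
```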

Calibre Metadata Apply Data Architecture & Taxonomy

The skill manages state and configuration through a structured file layout to ensure persistence across sessions.

Data type  | Location                                        | Purpose
Run state  | skills/calibre-metadata-apply/state/runs.json   | Tracks active metadata update jobs and subagent tasks.
Auth cache | ~/.config/calibre-metadata-apply/auth.json      | Stores optional session authentication data.
Config     | ~/.config/calibre-metadata-apply/config.json    | Persists user preferences, such as reading_script for non-Latin characters.

name: calibre-metadata-apply
description: Apply metadata updates to existing Calibre books via calibredb over a Content server. Use for controlled metadata edits after target IDs are confirmed by a read-only lookup.
metadata: {"openclaw":{"requires":{"bins":["node","calibredb"],"env":["CALIBRE_PASSWORD"]},"optionalBins":["pdffonts"],"optionalEnv":["CALIBRE_USERNAME"],"primaryEnv":"CALIBRE_PASSWORD","dependsOnSkills":["subagent-spawn-command-builder"],"localWrites":["skills/calibre-metadata-apply/state/runs.json","~/.config/calibre-metadata-apply/auth.json","~/.config/calibre-metadata-apply/config.json"],"modifiesRemoteData":["calibre:metadata"]}}

calibre-metadata-apply

A skill for updating metadata of existing Calibre books.

Requirements

  • calibredb must be available on PATH in the runtime environment
  • subagent-spawn-command-builder installed (for spawn payload generation)
  • pdffonts is optional/recommended for PDF evidence checks
  • Reachable Calibre Content server URL
    • http://HOST:PORT/#LIBRARY_ID
  • If authentication is enabled, prefer /home/altair/.openclaw/.env:
    • CALIBRE_USERNAME=
    • CALIBRE_PASSWORD=
  • Pass --password-env CALIBRE_PASSWORD (username auto-loads from env)
  • You can still override explicitly with --username.
  • Optional auth cache: --save-auth (default file: ~/.config/calibre-metadata-apply/auth.json)

Supported fields

Direct fields (set_metadata --field)

  • title
  • title_sort
  • authors (string with & or array)
  • author_sort
  • series
  • series_index
  • tags (string or array)
  • publisher
  • pubdate (YYYY-MM-DD)
  • languages
  • comments

Helper fields

  • comments_html (OC marker block upsert)
  • analysis (auto-generates analysis HTML for comments)
  • analysis_tags (adds tags)
  • tags_merge (default true)
  • tags_remove (remove specific tags after merge)
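The merge/remove behavior of the tag helpers above can be sketched as follows. This mirrors the documented semantics (merge by default, then remove specific tags), but `applyTagHelpers` is a hypothetical name, not the actual implementation in calibredb_apply.mjs:

```javascript
// Merge proposed tags into the existing ones (tags_merge: true by default),
// de-duplicate, then drop any tags listed in tags_remove.
function applyTagHelpers(existing, proposed, { tagsMerge = true, tagsRemove = [] } = {}) {
  const base = tagsMerge ? [...existing, ...proposed] : [...proposed];
  const deduped = [...new Set(base)];
  return deduped.filter(tag => !tagsRemove.includes(tag));
}
```

With tags_merge set to false the proposed tags replace the existing ones outright.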

Required execution flow

A. Target confirmation (mandatory)

  1. Run read-only lookup to narrow candidates
  2. Show id,title,authors,series,series_index
  3. Get user confirmation for final target IDs
  4. Build JSONL using only confirmed IDs
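Step 4 above (build JSONL using only confirmed IDs) is a simple filter. A sketch in Node, assuming one proposal object per book; `buildPayload` is a hypothetical helper, not a script shipped with the skill:

```javascript
// Keep only proposals whose book id the user explicitly confirmed;
// everything else is excluded from the apply payload.
function buildPayload(proposals, confirmedIds) {
  const confirmed = new Set(confirmedIds);
  return proposals.filter(p => confirmed.has(p.id));
}
```

Serializing the result one object per line yields the changes.jsonl consumed by calibredb_apply.mjs.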

B. Proposal synthesis (when metadata is missing)

  1. Collect evidence from file extraction + web sources
  2. Show one merged proposal table with:
    • candidate, source, confidence (high|medium|low)
    • title_sort_candidate, author_sort_candidate
  3. Get user decision:
    • approve all
    • approve only: <ids>
    • reject: <ids>
    • edit: <field>=<value>
  4. Apply only approved/finalized fields
  5. If confidence is low or sources conflict, keep fields empty
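Steps 4–5 above amount to a per-field confidence/conflict gate. A minimal sketch, where `finalizeProposal` and the candidate shape are assumptions made for illustration:

```javascript
// For each field, apply a value only when all sources agree on one value and
// the leading candidate's confidence is not 'low'; otherwise leave the field empty.
function finalizeProposal(candidates) {
  const finalized = {};
  for (const [field, entries] of Object.entries(candidates)) {
    const distinct = new Set(entries.map(e => e.value));
    const conflict = distinct.size > 1;
    if (!conflict && entries[0].confidence !== 'low') {
      finalized[field] = entries[0].value;
    }
  }
  return finalized;
}
```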

C. Apply

  1. Run dry-run first (mandatory)
  2. Run --apply only after explicit user approval
  3. Re-read and report final values

Analysis worker policy

  • Use subagent-spawn-command-builder to generate sessions_spawn payload for heavy candidate generation
    • task is required.
    • Profile should include model/thinking/timeout/cleanup for this workflow.
  • Use lightweight subagent model for analysis (avoid main heavy model)
  • Keep final decisions + dry-run/apply in main

Data flow disclosure

  • Local execution:
    • Build calibredb set_metadata commands from JSONL.
    • Read/write local state files (state/runs.json) and optional auth/config files under ~/.config/calibre-metadata-apply/.
  • Subagent execution (optional for heavy candidate generation):
    • Uses sessions_spawn via subagent-spawn-command-builder.
    • Text/metadata sent to subagent can reach model endpoints configured by runtime profile.
  • Remote write:
    • calibredb set_metadata updates metadata on the target Calibre Content server.

Security rules:

  • Do not use --save-plain-password unless explicitly instructed by the user.
  • Prefer env-based password (--password-env CALIBRE_PASSWORD) over inline --password.
  • If user does not want external model/subagent processing, keep flow local and skip subagent orchestration.

Long-run turn-split policy (library-wide)

For library-wide heavy processing, always use turn-split execution.

Unknown-document recovery flow (M3)

Batch sizing rule:

  • Keep each unknown-document batch small enough to show full row-by-row results in chat (no representative sampling).
  • If unresolved items remain, stop and wait for explicit user instruction to start the next batch.

User intervention checkpoints (fixed)

  1. Light pass (metadata-only)

    • Always run this stage by default (no extra user instruction required)
    • Analyze existing metadata only (no file content read)
    • Present a table to user:
      • current file/title
      • recommended title/metadata
      • confidence/evidence summary
    • Stop and wait for user instruction before any deeper stage
  2. On user request: page-1 pass

    • Read only the first page and refine proposals
    • Report delta from light pass
  3. If still uncertain: deep pass

    • Read first 5 pages + last 5 pages
    • Add web evidence search
    • Produce finalized proposal with confidence + rationale
  4. Approval gate

    • Show detailed findings and request explicit approval before apply

Pending and unsupported handling

  • Use pending-review tag for unresolved/hold items.
  • If document is unresolved in current flow, do not force metadata guesses.
    • Tag with pending-review and keep for follow-up investigation.

Diff report format (for unknown batch runs)

Return full results (not samples):

  • execution summary (target/changed/pending/skipped/error)
  • full changed list with id + key before/after fields
  • full pending list with id + reason
  • full error list with id + error summary
  • confidence must be expressed as high|medium|low
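The execution summary can be tallied from per-item results. A sketch assuming each result carries a status of changed, pending, skipped, or error; the `summarize` helper is illustrative, not part of the skill:

```javascript
// Count results into the execution-summary shape: target/changed/pending/skipped/error.
function summarize(results) {
  const summary = { target: results.length, changed: 0, pending: 0, skipped: 0, error: 0 };
  for (const r of results) {
    summary[r.status] += 1;
  }
  return summary;
}
```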

Runtime artifact policy

  • Keep run-state and temporary artifacts only while a run is active.
  • On successful completion, remove per-run state/artifacts.
  • On failure, keep minimal artifacts only for retry/debug, then clean up after resolution.
  • Use lightweight subagent for all analysis stages
  • Keep apply decisions in main session
  • Persist run state for each stage in state/runs.json

Turn 1 (start)

  1. Main defines scope
  2. Main generates spawn payload via subagent-spawn-command-builder (profile example: calibre-meta), then calls sessions_spawn
  3. Save run_id/session_key/task via scripts/run_state.mjs upsert
  4. Immediately tell the user this is a subagent job and state the execution model used for analysis
  5. Reply with "analysis started" and keep normal chat responsive

Turn 2 (completion)

  1. Receive subagent completion notice
  2. Save result JSON
  3. Complete state handling via scripts/handle_completion.mjs --run-id ... --result-json ...
  4. Return summarized proposal (apply only when needed)

Run state file:

  • state/runs.json
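The schema of state/runs.json is internal to run_state.mjs; a hypothetical entry, based on the run_id/session_key/task values saved in Turn 1 (the status field is an additional assumption), might look like:

```json
{
  "runs": [
    {
      "run_id": "run-001",
      "session_key": "subagent-session-key",
      "task": "metadata light pass for unknown batch 1",
      "status": "running"
    }
  ]
}
```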

PDF extraction policy

  1. Try ebook-convert first
  2. If empty/failed, fallback to pdftotext
  3. If both fail, switch to web-evidence-first mode
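The three-step fallback can be written as an ordered chain of extractors. A sketch with the extractors injected as plain functions; the real skill shells out to ebook-convert and pdftotext, and this simplified version only shows the control flow:

```javascript
// Try each extractor in order; an extractor either returns text or throws.
// If every extractor fails or yields empty text, signal web-evidence-first mode.
function extractText(file, extractors) {
  for (const extract of extractors) {
    try {
      const text = extract(file);
      if (text && text.trim()) return { mode: 'file', text };
    } catch {
      // extractor failed; fall through to the next one
    }
  }
  return { mode: 'web-evidence-first', text: '' };
}
```

In practice the first extractor would wrap ebook-convert and the second pdftotext.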

Sort reading policy

  • Use user-configured reading_script for Japanese/non-Latin sort fields
    • katakana / hiragana / latin
  • Ask once on first use, then persist and reuse
  • Default policy is full reading (no truncation)
  • Config path: ~/.config/calibre-metadata-apply/config.json
    • key: reading_script
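The persisted preference is a single key. Assuming plain JSON at the config path above, a saved choice might look like:

```json
{
  "reading_script": "katakana"
}
```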

Usage

Dry-run:

cat changes.jsonl | node skills/calibre-metadata-apply/scripts/calibredb_apply.mjs \
  --with-library "http://127.0.0.1:8080/#MyLibrary" \
  --password-env CALIBRE_PASSWORD \
  --lang ja

Apply:

cat changes.jsonl | node skills/calibre-metadata-apply/scripts/calibredb_apply.mjs \
  --with-library "http://127.0.0.1:8080/#MyLibrary" \
  --password-env CALIBRE_PASSWORD \
  --apply

Do not

  • Do not run direct --apply using ambiguous title matches only
  • Do not include unconfirmed IDs in apply payload
  • Do not auto-fill low-confidence candidates without explicit confirmation