Bilibili AI Video Summarization and Transcription - Openclaw Skills

Author: Internet

2026-03-31

AI Tutorials

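What is bili-summary?

bili-summary is an automation skill that turns Bilibili video content into usable text. It combines yt-dlp for media retrieval with faster-whisper for speech-to-text, giving a complete pipeline for analyzing Bilibili content in research or content-consumption workflows built on Openclaw Skills.

The skill handles several kinds of video and includes a fallback: when official subtitles are unavailable, it transcribes the audio automatically. Once it has the text, it calls the Google Gemini 2.5 Flash model to produce a structured summary with chapter breakdowns and key insights, which makes long videos easier to digest within the Openclaw Skills ecosystem.

Download: https://github.com/openclaw/skills/tree/main/skills/lava-chen/bili-summary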

Installation and Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install bili-summary

2. Manual Installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in
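
The priority order above can be read as a first-match lookup across an ordered list of directories. The following is a minimal sketch of that idea, not OpenClaw's actual resolver: the function name `resolve_skill` is hypothetical, and it assumes a skill folder is recognized by the `SKILL.md` file it contains.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_skill(name: str, search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the first directory that contains the named skill.

    search_dirs must be ordered from highest to lowest priority,
    for example workspace first, then local, then built-in.
    """
    for base in search_dirs:
        candidate = base / name
        # Assumption: a skill folder is identified by its SKILL.md file.
        if (candidate / "SKILL.md").is_file():
            return candidate
    return None
```

Passing the workspace directory first means a workspace copy of bili-summary shadows a copy installed globally.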

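3. Prompt Installation

Copy this prompt into OpenClaw to install the skill automatically.

Please use Clawhub to install bili-summary for me. If Clawhub is not installed yet, install it first (npm i -g clawhub).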

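bili-summary Use Cases

  • Summarize long Bilibili lectures or tutorials into concise notes.
  • Extract subtitles from Bilibili videos for translation or archiving.
  • Transcribe the audio of Bilibili videos that lack official closed captions.
  • Download high-quality audio or video files with Openclaw Skills for offline reference.

How bili-summary Works

  1. The skill first uses yt-dlp to fetch video metadata, including title, duration, and uploader.
  2. It tries to download official CC subtitles directly from Bilibili.
  3. If no subtitles are found, it downloads the audio stream and generates a local transcript with the faster-whisper tiny model.
  4. The resulting text is sent to the Gemini 2.5 Flash API, which returns a summary organized by chapter and key points.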
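
The subtitle-first, Whisper-second order in steps 2 and 3 can be sketched as a small function. This is an illustration under stated assumptions, not the script's real code: `get_transcript`, `fetch_subtitles`, and `transcribe_audio` are hypothetical names, and the two callables stand in for whatever the script does to reach Bilibili and faster-whisper.

```python
from typing import Callable, Optional, Tuple


def get_transcript(
    url: str,
    fetch_subtitles: Callable[[str], Optional[str]],
    transcribe_audio: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (text, source), preferring official subtitles over Whisper."""
    text = fetch_subtitles(url)
    if text and text.strip():
        return text, "subtitles"
    # No usable CC subtitles: fall back to local speech-to-text.
    return transcribe_audio(url), "whisper"
```

Injecting the two steps as callables keeps the fallback decision testable without network access or a Whisper model.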

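bili-summary Configuration Guide

To start using this skill in your Openclaw Skills environment, follow these setup steps:

  1. Install the required Python packages:
pip install yt-dlp faster-whisper
  2. Get a Google Gemini API key from Google AI Studio.
  3. Set the environment variable that authorizes AI summarization:
export GEMINI_API_KEY="your-api-key-here"
  4. Run the full summarization workflow with the following syntax:
uv run scripts/bili-summary.py "VIDEO_URL" --action summary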

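bili-summary Data Architecture and Classification

The skill writes its output to a temporary directory in the workspace. The table below describes the files it uses:

| File name | Purpose |
|---|---|
| audio.m4a | Extracted audio track used for transcription (temporary). |
| subtitle.txt | Raw text transcript or retrieved subtitles. |
| summary.txt | Final structured AI summary. |
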
name: bili-summary
description: "Bilibili video download, subtitle extraction, and AI summarization tool. Supports video info, audio download, Whisper transcription, and Gemini-powered detailed summaries. Use when: user asks to download/summarize B站 video, extract subtitles, or transcribe bilibili video."
metadata:
  {
    "openclaw": {
      "emoji": "??",
      "requires": { "bins": ["yt-dlp", "python"] },
      "install": [
        {
          "id": "pip",
          "kind": "pip",
          "package": "yt-dlp faster-whisper",
          "bins": ["yt-dlp"],
          "label": "Install yt-dlp and faster-whisper"
        }
      ]
    }
  }

bili-summary

Bilibili (B站) video download, subtitle extraction, and AI summarization tool.

When to Use

USE this skill when:

  • "download Bilibili video"
  • "extract B站 subtitles"
  • "summarize B站 video"
  • "B站视频总结"
  • "bilibili transcription"
  • Any Bilibili video URL

Features

  • Get video info (title, duration, uploader)
  • Download B站 video (video + audio)
  • Extract subtitles (B站 CC subtitles)
  • Audio extraction + Whisper transcription (when no subtitles)
  • Gemini AI detailed summary (chapters + key content + key insights + conclusion)

Prerequisites

1. Install Dependencies

# Using miniconda3 (recommended)
~/miniconda3/bin/pip install yt-dlp faster-whisper

# Or using system Python (may require sudo)
pip install yt-dlp faster-whisper

2. Get Gemini API Key

This skill uses Google Gemini 2.5 Flash for AI summarization.

Steps:

  1. Visit https://aistudio.google.com/app/apikey
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the generated key

Pricing: Gemini 2.5 Flash has a generous free tier (listed here as 15 RPM, 1M TPM; limits change, so confirm the current values in Google AI Studio)

3. Set Environment Variable

# Add to your ~/.bashrc or ~/.zshrc for permanent setup
echo 'export GEMINI_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

# Or set temporarily for current session
export GEMINI_API_KEY="your-api-key-here"
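
On the Python side, reading the key could look like the sketch below. It is an illustration only: `load_gemini_key` is a hypothetical helper, and the error text simply mirrors the "GEMINI_API_KEY not found" message from the Troubleshooting section.

```python
import os
from typing import Mapping, Optional


def load_gemini_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY not found; export it before running the summary action"
        )
    return key
```

Failing before any download starts avoids spending time on audio and transcription only to stop at the summary step.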

Quick Start

# Download audio, transcribe, and summarize in one command
uv run {baseDir}/scripts/bili-summary.py "https://www.bilibili.com/video/BV1xx411c7mu" --action summary

Other Actions

# Get video info only
uv run {baseDir}/scripts/bili-summary.py "URL" --action info

# Download subtitle (if available)
uv run {baseDir}/scripts/bili-summary.py "URL" --action subtitle

# Download and transcribe audio only
uv run {baseDir}/scripts/bili-summary.py "URL" --action transcribe

# Download full video
uv run {baseDir}/scripts/bili-summary.py "URL" --action video

Options

| Option | Description | Default |
|---|---|---|
| url | Bilibili video URL (BV ID or full link) | required |
| --action | Operation: info/subtitle/transcribe/video/summary | summary |
| --output | Output directory | ~/openclaw/workspace/coding-agent/temp/bili-summary |
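
A command line with these three options could be declared with `argparse` roughly as follows. This is a sketch, not the script's actual parser: `normalize_url` and `build_parser` are hypothetical names, and the BV pattern assumes an ID is "BV" followed by 10 alphanumeric characters, as in the example ID `BV1xx411c7mu`.

```python
import argparse
import re

ACTIONS = ["info", "subtitle", "transcribe", "video", "summary"]
# Assumption: a BV ID is "BV" followed by 10 alphanumeric characters.
BV_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def normalize_url(value: str) -> str:
    """Turn a bare BV ID or a full link into a canonical video URL."""
    match = BV_PATTERN.search(value)
    if match is None:
        raise ValueError(f"no BV ID found in {value!r}")
    return f"https://www.bilibili.com/video/{match.group(0)}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="bili-summary.py")
    # argparse reports a ValueError raised by the type function as a usage error.
    parser.add_argument("url", type=normalize_url, help="BV ID or full Bilibili link")
    parser.add_argument("--action", choices=ACTIONS, default="summary")
    parser.add_argument(
        "--output", default="~/openclaw/workspace/coding-agent/temp/bili-summary"
    )
    return parser
```

Note that this normalization drops any query string, so a link that selects a specific part of a multi-part video would lose that selection.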

Output Files

Default output: ~/openclaw/workspace/coding-agent/temp/bili-summary/

temp/bili-summary/
├── audio.m4a       # Downloaded audio (deleted after summary)
├── subtitle.txt    # Transcribed text (deleted after summary)
└── summary.txt    # AI summary content

Workflow

summary (full workflow)

  1. Get video info - yt-dlp fetches title, duration, uploader
  2. Try B站 subtitles - Call Bilibili API for CC subtitles
  3. Fallback to Whisper - If no subtitles, download audio + faster-whisper (tiny model) transcription
  4. AI Summary - Call Gemini 2.5 Flash API for detailed summary
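
Step 1 can be reproduced with yt-dlp's Python API. The sketch below assumes the script embeds yt-dlp through `yt_dlp.YoutubeDL` rather than calling the command-line binary, which the source does not confirm; `build_ydl_opts`, `summarize_info`, and `fetch_info` are hypothetical names.

```python
from typing import Any, Dict


def build_ydl_opts() -> Dict[str, Any]:
    """Options for a metadata-only yt-dlp call (no media is downloaded)."""
    return {"quiet": True, "no_warnings": True, "skip_download": True}


def summarize_info(info: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields the workflow mentions: title, duration, uploader."""
    return {key: info.get(key) for key in ("title", "duration", "uploader")}


def fetch_info(url: str) -> Dict[str, Any]:
    # Imported lazily so the helpers above work without yt-dlp installed.
    import yt_dlp

    with yt_dlp.YoutubeDL(build_ydl_opts()) as ydl:
        info = ydl.extract_info(url, download=False)
    return summarize_info(info)
```

Calling `fetch_info` requires network access and the yt-dlp package; the two helper functions are pure and can be checked offline.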

Time Estimate

| Step | Time |
|---|---|
| Audio download | ~15s |
| Whisper transcription (tiny) | ~25s |
| Gemini summary | ~5s |
| Total | ~45s |

API Configuration

  • Model: gemini-2.5-flash
  • Endpoint: https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent
  • Free Tier: 15 requests/minute, 1M tokens/minute
  • Sign up: https://aistudio.google.com/app/apikey
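
A request to the endpoint listed above could be built with the standard library as sketched here. The prompt wording and the names `build_request` and `extract_text` are assumptions for illustration; the request body and the `x-goog-api-key` header follow the Gemini `generateContent` REST format, and the script's real prompt is not shown in the source.

```python
import json
import urllib.request

ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1/models/"
    "gemini-2.5-flash:generateContent"
)


def build_request(api_key: str, transcript: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a transcript."""
    # Assumption: this prompt is illustrative, not the one the script uses.
    prompt = (
        "Summarize this video transcript into chapters, key content, "
        "key insights, and a conclusion:\n\n" + transcript
    )
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_text(response: dict) -> str:
    """Join the text parts of the first candidate in a parsed response."""
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```

To send the request, pass it to `urllib.request.urlopen`, parse the response body with `json.load`, and hand the resulting dictionary to `extract_text`.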

Alternative APIs (not implemented)

If you want to use other LLMs:

  • OpenAI GPT-4o - https://api.openai.com/v1/chat/completions
  • Anthropic Claude - https://api.anthropic.com/v1/messages
  • MiniMax - https://api.minimax.chat/v1/text/chatcompletion_v2

Note: Current implementation only supports Gemini. PRs welcome for other providers.

First-Time Setup Guide

Step 1: Install Python dependencies

# Check if miniconda3 exists
ls ~/miniconda3/bin/python

# Install dependencies
~/miniconda3/bin/pip install yt-dlp faster-whisper

# Or use uv
uv pip install yt-dlp faster-whisper

Step 2: Get API Key

  1. Go to https://aistudio.google.com/app/apikey
  2. Create new API key
  3. Copy the key

Step 3: Test the setup

# Set API key
export GEMINI_API_KEY="your-key-here"

# Test with a simple video
uv run {baseDir}/scripts/bili-summary.py "https://www.bilibili.com/video/BV1xx411c7mu" --action info

If you see JSON output with video title, duration, etc., you're ready!

Step 4: Run full summarization

uv run {baseDir}/scripts/bili-summary.py "https://www.bilibili.com/video/BV1xxx" --action summary

Troubleshooting

"No module named 'yt_dlp'"

~/miniconda3/bin/pip install yt-dlp faster-whisper

"GEMINI_API_KEY not found"

# Check if environment variable is set
echo $GEMINI_API_KEY

# Set it
export GEMINI_API_KEY="your-key"

"No subtitles available"

The skill automatically falls back to Whisper transcription. This may take longer but works for any video with audio.
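
For reference, a Whisper fallback with faster-whisper might look like the following sketch. The `WhisperModel` constructor and `transcribe` call are the library's documented API, but the device and `compute_type` values are assumptions, since the script's real settings are not shown; `join_segments` and `transcribe` are hypothetical names.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate segment texts, one non-empty segment per line."""
    return "\n".join(seg.text.strip() for seg in segments if seg.text.strip())


def transcribe(audio_path: str, model_size: str = "tiny") -> str:
    # Imported lazily so join_segments can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Assumption: CPU inference with int8 quantization.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path)
    # segments is a lazy generator; decoding happens while it is consumed here.
    return join_segments(segments)
```

The tiny model favors speed over accuracy, so a larger `model_size` is worth trying when the transcript quality limits the summary.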

"API rate limit exceeded"

Wait a minute and retry, or check your API quota at https://aistudio.google.com/app/apikey
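
The "wait and retry" advice can be automated with exponential backoff. This is a generic sketch, not something the skill is known to implement: `RateLimitError` is a hypothetical exception that the caller would raise when the API answers with HTTP 429, and `with_backoff` is a hypothetical helper.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised by the caller on an HTTP 429 response."""


def with_backoff(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimitError with doubling delays."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the defaults, the delays are 2, 4, and 8 seconds, which stays well inside a per-minute quota window.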

Security Notes

  • API key is read from GEMINI_API_KEY environment variable only
  • No hardcoded API keys in source code
  • Temporary files stored in workspace temp directory
  • Warning: Audio/subtitle files are NOT auto-deleted (manual cleanup required)
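
Manual cleanup of the intermediate files can be scripted. The sketch below is a hypothetical helper, not part of the skill; it assumes the file names `audio.m4a` and `subtitle.txt` from the Output Files section and leaves `summary.txt` in place.

```python
from pathlib import Path
from typing import List

INTERMEDIATE_FILES = ("audio.m4a", "subtitle.txt")


def clean_intermediates(output_dir: str) -> List[str]:
    """Delete intermediate files and return the names that were removed."""
    removed = []
    base = Path(output_dir).expanduser()
    for name in INTERMEDIATE_FILES:
        path = base / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```

For the default location, call `clean_intermediates("~/openclaw/workspace/coding-agent/temp/bili-summary")`.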

File Structure

bili-summary/
├── SKILL.md              # This documentation
├── _meta.json            # ClawHub metadata (auto-generated)
└── scripts/
    └── bili-summary.py   # Main script

License

MIT License - Use at your own risk. Respect Bilibili's terms of service.