Gettr Transcribe + Summarize: AI Audio Processing - Openclaw Skills

Author: Internet

2026-04-16


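What is Gettr Transcribe + Summarize (MLX Whisper)?

This skill bridges GETTR social media content and usable text data. It uses MLX Whisper to transcribe audio locally on Apple Silicon hardware, which keeps the media on the user's machine and avoids cloud processing costs. As part of the Openclaw Skills ecosystem, it automates extracting media URLs from both static posts and live streams.

The tool is designed for users who need to digest long-form video quickly. It converts raw audio into a formatted transcript and a summary, and supports multiple languages and several summary lengths. Whether you are a researcher, a journalist, or a casual user, the skill simplifies working with social media broadcasts by producing structured, timestamped documents.

Download: https://github.com/openclaw/skills/tree/main/skills/kevin37li/gettr-transcribe-summarize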

Installation and download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install gettr-transcribe-summarize

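2. Manual installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt installation

Copy this prompt into OpenClaw to install the skill automatically.

Please help me install gettr-transcribe-summarize using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).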

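Gettr Transcribe + Summarize (MLX Whisper) use cases

  • Create text archives of GETTR live streams for news research.
  • Generate quick summaries of political commentary videos to save time.
  • Build timestamped outlines of long social media broadcasts for easier navigation.
  • Transcribe non-English GETTR content by passing a specific language code when running the pipeline.

How Gettr Transcribe + Summarize (MLX Whisper) works

  1. The user provides a GETTR URL, and the skill parses it to identify the unique content identifier.
  2. The skill extracts the direct video or HLS stream URL, using a dedicated Python script or, for signed URLs, browser automation.
  3. FFmpeg downloads the source media and converts the audio track to a 16kHz mono WAV file.
  4. MLX Whisper processes the local audio file and produces a VTT transcript with embedded timestamps.
  5. The transcript is analyzed and condensed into a Markdown summary at the user's preferred level of detail (short, medium, or detailed).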

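Gettr Transcribe + Summarize (MLX Whisper) configuration guide

To use this skill in your Openclaw Skills environment, make sure the required dependencies are installed on your Apple Silicon Mac:

# Install FFmpeg via Homebrew
brew install ffmpeg

# Install MLX Whisper via pip
pip install mlx-whisper

Once installed, you can trigger the pipeline with the provided bash scripts, either manually or through an agent.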

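Gettr Transcribe + Summarize (MLX Whisper) data architecture and taxonomy

The skill keeps a clear output hierarchy for each processed video, organized by its unique GETTR identifier:

File name | Format | Description
audio.wav | Audio (WAV) | Extracted 16kHz mono audio source.
audio.vtt | Text (VTT) | Raw transcript with WebVTT timestamps.
summary.md | Markdown | Final deliverable with bullets and/or a timestamped outline.

name: gettr-transcribe-summarize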
description: Download audio from a GETTR post (via HTML og:video), transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT), and summarize the transcript into bullet points and/or a timestamped outline. Use when given a GETTR post URL and asked to produce a transcript or summary.
homepage: https://gettr.com
metadata: {"clawdbot":{"emoji":"??","requires":{"bins":["mlx_whisper","ffmpeg"]},"install":[{"id":"mlx-whisper","kind":"pip","package":"mlx-whisper","bins":["mlx_whisper"],"label":"Install mlx-whisper (pip)"},{"id":"ffmpeg","kind":"brew","formula":"ffmpeg","bins":["ffmpeg"],"label":"Install ffmpeg (brew)"}]}}

Gettr Transcribe + Summarize (MLX Whisper)

Quick start

# 1. Parse the slug from the URL (just read it — no script needed)
#    https://gettr.com/post/p1abc2def  → slug = p1abc2def
#    https://gettr.com/streaming/p3xyz → slug = p3xyz

# 2. Get the video URL
#    For /post/ URLs: use the extraction script
python3 scripts/extract_gettr_og_video.py ""

#    For /streaming/ URLs: use browser automation directly (extraction script is unreliable)
#    See Step 1 below for browser automation instructions

# 3. Run download + transcription pipeline
bash scripts/run_pipeline.sh "" ""

To explicitly set the transcription language (recommended for non-English content):

bash scripts/run_pipeline.sh --language zh "" ""

Common language codes: zh (Chinese), en (English), ja (Japanese), ko (Korean), es (Spanish), fr (French), de (German), ru (Russian).

This outputs:

  • ./out/gettr-transcribe-summarize//audio.wav
  • ./out/gettr-transcribe-summarize//audio.vtt

Then proceed to Step 3 (Summarize) to generate the final deliverable.


Workflow (GETTR URL → transcript → summary)

Inputs to confirm

Ask for:

  • GETTR post URL
  • Output format: bullets only or bullets + timestamped outline
  • Summary size: short, medium (default), or detailed
  • Language (optional): if the video is non-English and auto-detection fails, ask for the language code (e.g., zh for Chinese)

Notes:

  • This skill does not handle authentication-gated GETTR posts.
  • This skill does not translate; outputs stay in the video's original language.
  • If transcription quality is poor or the output is mixed with English, re-run with an explicit --language flag.

Prereqs (local)

  • mlx_whisper installed and on PATH
  • ffmpeg installed (recommended: brew install ffmpeg)

Step 0 — Parse the slug and pick an output directory

Parse the slug directly from the GETTR URL — just read the last path segment, no script needed:

  • https://gettr.com/post/p1abc2def → slug = p1abc2def
  • https://gettr.com/streaming/p3xyz789 → slug = p3xyz789
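
As a minimal sketch of the slug rule above, the following Python reads the last path segment with the standard library. The function name `parse_gettr_url` is hypothetical and not part of the skill's bundled scripts. It also returns the URL type, since Step 1 branches on it.

```python
from urllib.parse import urlparse


def parse_gettr_url(url: str) -> tuple[str, str]:
    """Return (kind, slug), where kind is "post" or "streaming"."""
    # Hypothetical helper for illustration only; the skill says to just read the URL.
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) != 2 or segments[0] not in ("post", "streaming"):
        raise ValueError(f"not a GETTR post or streaming URL: {url!r}")
    kind, slug = segments
    return kind, slug
```

For example, `parse_gettr_url("https://gettr.com/post/p1abc2def")` returns `("post", "p1abc2def")`.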

Output directory: ./out/gettr-transcribe-summarize//

Directory structure:

  • ./out/gettr-transcribe-summarize//audio.wav
  • ./out/gettr-transcribe-summarize//audio.vtt
  • ./out/gettr-transcribe-summarize//summary.md
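
The layout above can be sketched with `pathlib`. This assumes the per-video directory is named after the slug, as Step 0 describes; the helper name `output_paths` is hypothetical.

```python
from pathlib import Path


def output_paths(slug: str, root: str = "./out/gettr-transcribe-summarize") -> dict[str, Path]:
    """Map each expected output file name to its path under the slug directory."""
    # Assumption: the slug is a single path segment, so reject separators.
    if not slug or "/" in slug:
        raise ValueError(f"invalid slug: {slug!r}")
    base = Path(root) / slug
    return {name: base / name for name in ("audio.wav", "audio.vtt", "summary.md")}
```

Calling `output_paths(slug)["summary.md"].parent.mkdir(parents=True, exist_ok=True)` would create the directory before anything is written to it.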

Step 1 — Get the video URL

The approach depends on the URL type:

For /post/ URLs — Use the extraction script

Run the extraction script to get the video URL from the post HTML:

python3 scripts/extract_gettr_og_video.py ""

This prints the best candidate video URL (often an HLS .m3u8) to stdout.

If extraction fails, ask the user to provide the .m3u8/MP4 URL directly (common if the post is private/gated or the HTML is dynamic).
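
For orientation, reading an `og:video` meta tag from already-fetched HTML can be done with the standard library alone. This is an illustrative sketch, not the contents of scripts/extract_gettr_og_video.py: the class and function names are hypothetical, and the real script also handles fetching, retries, and candidate selection.

```python
from html.parser import HTMLParser
from typing import Optional


class OgVideoParser(HTMLParser):
    """Collect the content of every <meta property="og:video"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.candidates: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:video" and attributes.get("content"):
            self.candidates.append(attributes["content"])


def extract_og_video(html: str) -> Optional[str]:
    """Return the first og:video URL in the HTML, or None if there is none."""
    parser = OgVideoParser()
    parser.feed(html)
    return parser.candidates[0] if parser.candidates else None
```

A `None` result corresponds to the failure case above, where the user is asked for the .m3u8/MP4 URL directly.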

For /streaming/ URLs — Use browser automation directly

Do not use the extraction script for streaming URLs. The og:video URL from static HTML extraction is unreliable for streaming content — it either fails outright or the download stalls and fails near the end.

Instead, use browser automation to get a fresh, dynamically-signed URL:

  1. Open the GETTR streaming URL and wait for the page to fully load (JavaScript must execute)
  2. Extract the og:video meta tag content from the rendered DOM:
    document.querySelector('meta[property="og:video"]').getAttribute('content')
    
  3. Use that fresh URL for the pipeline in Step 2

If browser automation is not available or fails, see references/troubleshooting.md for how to guide the user to manually extract the fresh URL from their browser.

Step 2 — Run the pipeline (download + transcribe)

Feed the extracted video URL and slug into the pipeline:

bash scripts/run_pipeline.sh "" ""

To explicitly set the language (recommended when auto-detection fails):

bash scripts/run_pipeline.sh --language zh "" ""

The pipeline does two things:

  1. Downloads audio as 16kHz mono WAV via ffmpeg
  2. Transcribes with MLX Whisper, outputting VTT with timestamps
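
The two stages could be expressed as the command lines below, built here in Python without running them. This is a sketch, not the contents of scripts/run_pipeline.sh: this document does not show the options the bundled scripts pass, so the flags used here (`-vn`, `-ac 1`, `-ar 16000`, `--output-format vtt`, `--output-dir`) are assumptions chosen to match the described output.

```python
from typing import Optional


def ffmpeg_command(video_url: str, wav_path: str) -> list[str]:
    """Build an ffmpeg call that writes 16kHz mono 16-bit PCM WAV audio."""
    return [
        "ffmpeg", "-y",          # overwrite an existing output file
        "-i", video_url,         # HLS .m3u8 or MP4 source
        "-vn",                   # drop the video stream
        "-ac", "1",              # mono
        "-ar", "16000",          # 16kHz sample rate
        "-c:a", "pcm_s16le",     # uncompressed WAV audio
        wav_path,
    ]


def whisper_command(wav_path: str, out_dir: str, model: str,
                    language: Optional[str] = None) -> list[str]:
    """Build an mlx_whisper call that writes a VTT transcript into out_dir."""
    cmd = ["mlx_whisper", wav_path, "--model", model,
           "--output-format", "vtt", "--output-dir", out_dir]
    if language:
        # Assumed to mirror the pipeline's --language option.
        cmd += ["--language", language]
    return cmd
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with signed URLs that contain `&` and `?`.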

If the pipeline fails with HTTP 412 (stale signed URL)

This error occurs with /streaming/ URLs when the signed URL has expired. If browser automation returned a stale URL, re-run it to get a fresh URL, then retry the pipeline.

If browser automation is not available or fails, see references/troubleshooting.md for how to guide the user to manually extract the fresh URL from their browser.

Notes:

  • By default, language is auto-detected. For non-English content where detection fails, use --language.
  • If too slow or memory-heavy, try smaller models: mlx-community/whisper-medium or mlx-community/whisper-small.
  • If quality is poor, try the full model: mlx-community/whisper-large-v3 (slower but more accurate).
  • If --word-timestamps causes issues, the pipeline retries automatically without it.

Step 3 — Summarize

Write the final deliverable to ./out/gettr-transcribe-summarize//summary.md.

Pick a summary size (user-selectable):

  • Short: 5–8 bullets; (if outline) 4–6 sections
  • Medium (default): 8–20 bullets; (if outline) 6–15 sections
  • Detailed: 20–40 bullets; (if outline) 15–30 sections
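
If the size limits need to be checked programmatically, they fit in a small lookup table. The sketch below only restates the ranges listed above; the names `SUMMARY_SIZES` and `size_limits` are hypothetical.

```python
# (min, max) counts per summary size, copied from the list above.
SUMMARY_SIZES = {
    "short": {"bullets": (5, 8), "sections": (4, 6)},
    "medium": {"bullets": (8, 20), "sections": (6, 15)},
    "detailed": {"bullets": (20, 40), "sections": (15, 30)},
}


def size_limits(size: str = "medium") -> dict[str, tuple[int, int]]:
    """Return the bullet and outline-section ranges for a summary size."""
    try:
        return SUMMARY_SIZES[size.lower()]
    except KeyError:
        raise ValueError(f"unknown summary size: {size!r}") from None
```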

Include:

  • Bullets (per size above)
  • Optional timestamped outline (per size above)

Timestamped outline format (default heading style):

[00:00 - 02:15] Section heading
- 1–3 sub-bullets

When building the outline from VTT cues:

  • Group adjacent cues into coherent sections.
  • Use the start time of the first cue and end time of the last cue in the section.
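
The timestamp bookkeeping can be sketched as follows. Deciding what counts as a coherent section is an editorial judgment, so this example substitutes a mechanical rule (a fixed time window, 120 seconds by default) purely for illustration. All function names and the window size are assumptions, not part of the skill.

```python
import re

# Matches "MM:SS.mmm --> MM:SS.mmm" and "HH:MM:SS.mmm --> HH:MM:SS.mmm" timing lines.
CUE_RE = re.compile(r"((?:\d+:)?\d{2}:\d{2})\.\d{3}\s+-->\s+((?:\d+:)?\d{2}:\d{2})\.\d{3}")


def to_seconds(stamp: str) -> int:
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total


def parse_cues(vtt_text: str) -> list[tuple[int, int, str]]:
    """Return (start_seconds, end_seconds, text) for each cue."""
    cues, current = [], None
    for line in vtt_text.splitlines():
        match = CUE_RE.match(line.strip())
        if match:
            current = [to_seconds(match.group(1)), to_seconds(match.group(2)), []]
            cues.append(current)
        elif not line.strip():
            current = None  # a blank line ends the cue
        elif current is not None:
            current[2].append(line.strip())
    return [(start, end, " ".join(text)) for start, end, text in cues]


def group_sections(cues, window: int = 120) -> list[dict]:
    """Merge adjacent cues; a section spans first-cue start to last-cue end."""
    sections = []
    for start, end, text in cues:
        if sections and start - sections[-1]["start"] < window:
            sections[-1]["end"] = end
            sections[-1]["text"].append(text)
        else:
            sections.append({"start": start, "end": end, "text": [text]})
    return sections


def format_heading(section: dict, title: str) -> str:
    def mmss(seconds: int) -> str:
        return f"{seconds // 60:02d}:{seconds % 60:02d}"
    return f"[{mmss(section['start'])} - {mmss(section['end'])}] {title}"
```

The section title and sub-bullets still come from reading the grouped text; the code only supplies the time range for each heading.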

Bundled scripts

  • scripts/run_pipeline.sh: download + transcription pipeline (takes a video URL and slug)
  • scripts/extract_gettr_og_video.py: fetch GETTR HTML and extract the og:video URL (with retry/backoff)
  • scripts/download_audio.sh: download/extract audio from HLS or MP4 URL to 16kHz mono WAV

Error handling

  • Non-video posts: The extraction script detects image/text posts and provides a helpful error message.
  • Network errors: Automatic retry with exponential backoff (up to 3 attempts).
  • No audio track: The download script validates output and reports if the source has no audio.
  • HTTP 412 errors: These occur with /streaming/ URLs when the signed URL has expired. Re-run browser automation to get a fresh URL (see Step 1); if that fails, see references/troubleshooting.md.
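
The retry behavior described for network errors might look like the sketch below. Only the three-attempt limit and the exponential backoff come from the list above. The function name, the 1-second base delay, and the choice of `OSError` as the retryable error are assumptions for illustration.

```python
import time


def retry_with_backoff(fn, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(), retrying on OSError with delays of base_delay * 2**attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

An expired signed URL is not a transient failure, so an HTTP 412 should not go through this loop. It needs a fresh URL, as described above.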

Troubleshooting

See references/troubleshooting.md for detailed solutions to common issues including:

  • HTTP 412 errors (stale signed URLs)
  • Extraction failures
  • Download errors
  • Transcription quality issues
