VibeSurf: Advanced Browser Automation and Web Scraping - Openclaw Skills

Author: Internet

2026-04-18


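What Is VibeSurf Browser Automation?

VibeSurf is a browser automation skill designed to serve as the primary interface between AI agents and the live web. By exposing a structured API for controlling a real browser, it lets agents do everything from simple content fetching to complex, multi-step automation workflows. For developers building with Openclaw Skills, it is a foundational tool: agents can interact with websites the way a person would, with the speed and precision of an automated system.

The skill is built on a delegation model: the main entry point routes each request to a specialized reference guide for tasks such as financial data extraction, social media interaction, or precise DOM manipulation. Whether you are building a research assistant or an automated testing bot, VibeSurf provides the infrastructure needed to handle the complexity of modern web pages.

Download: https://github.com/openclaw/skills/tree/main/skills/vvincent1234/vibesurf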

Installation and Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install vibesurf

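2. Manual Installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt-Based Installation

Copy this prompt into OpenClaw to install the skill automatically.

Please help me install vibesurf using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).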

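VibeSurf Browser Automation: Use Cases

  • Automatically extract prices, product listings, or tabular data from e-commerce and news sites.
  • Carry out complex web-based tasks, such as filling in multi-page forms or navigating portals that require authentication.
  • Run real-time, AI-driven web searches to supplement LLM knowledge with up-to-date information.
  • Generate visual documentation through automated screenshots and page summaries.
  • Integrate browser-based processes with external tools such as Gmail, GitHub, and Slack through Composio and MCP.

VibeSurf Browser Automation: How It Works

  1. Health verification: The agent first checks the VibeSurf service status through the /health endpoint to make sure the automation engine is running.
  2. Action discovery: The agent queries the /api/tool/search endpoint to identify the actions most relevant to the user's request.
  3. Schema validation: Before executing, the agent retrieves the JSON schema for the selected action's parameters to avoid malformed requests.
  4. Task execution: The agent sends a POST request with the validated parameters to /api/tool/execute to trigger the browser action.
  5. State monitoring: For complex tasks, the agent uses get_browser_state to keep monitoring the browser state and decide on the next navigation or interaction step.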
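The first four steps above can be sketched as a single function. This is a minimal sketch under stated assumptions, not VibeSurf's own client: the `call` transport, the `choose` and `build_params` callbacks, and the function name `run_action` are all hypothetical, and the shapes of the search and schema responses are deliberately left to the caller because the source does not specify them. Step 5 (state monitoring) is omitted here.

```python
from typing import Any, Callable
from urllib.parse import urlencode

# Hypothetical transport: (method, path, json_body_or_None) -> decoded response.
Transport = Callable[[str, str, Any], Any]


def run_action(
    call: Transport,
    keyword: str,
    choose: Callable[[Any], str],
    build_params: Callable[[Any], dict],
) -> Any:
    """Health check, action discovery, schema lookup, then execution."""
    # 1. Health verification: the transport is expected to raise if the service is down.
    call("GET", "/health", None)
    # 2. Action discovery: search by keyword, then let the caller pick an action name.
    found = call("GET", "/api/tool/search?" + urlencode({"keyword": keyword}), None)
    action_name = choose(found)
    # 3. Schema validation: fetch the parameter schema and build parameters against it.
    schema = call("GET", f"/api/tool/{action_name}/params", None)
    parameters = build_params(schema)
    # 4. Task execution: POST the action name and parameters.
    return call(
        "POST",
        "/api/tool/execute",
        {"action_name": action_name, "parameters": parameters},
    )
```

Passing the transport in as a callable keeps the sequence easy to exercise with a stub before pointing it at a running service.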

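VibeSurf Browser Automation: Configuration Guide

To use VibeSurf within the Openclaw Skills ecosystem, make sure the service is running locally and its endpoint is reachable.

Set the environment variable: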

export VIBESURF_ENDPOINT="http://127.0.0.1:9335"

Verify the connection:

curl $VIBESURF_ENDPOINT/health

Once verified, you can start making API calls to search for actions or run browser tasks directly through your agent.
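When calling the API from a script rather than the shell, the same environment variable can be read in code. The helper below is a small sketch: the name `endpoint_url` and the fallback to the default address are illustrative choices, not part of VibeSurf.

```python
import os
from typing import Mapping, Optional

# Default address taken from the configuration example above.
DEFAULT_ENDPOINT = "http://127.0.0.1:9335"


def endpoint_url(path: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Join VIBESURF_ENDPOINT (or the default address) with an API path."""
    env = os.environ if env is None else env
    base = env.get("VIBESURF_ENDPOINT") or DEFAULT_ENDPOINT
    # Normalise slashes so "base/" + "/path" does not produce "//".
    return base.rstrip("/") + "/" + path.lstrip("/")
```

For example, `endpoint_url("/health")` gives the URL that the curl command above requests.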

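VibeSurf Browser Automation: Data Architecture and Taxonomy

VibeSurf organizes its functionality into a set of structured API endpoints that handle discovery, configuration, and execution.

| Component | Endpoint | Description |
| --- | --- | --- |
| Action discovery | /api/tool/search | Returns a list of available automation actions matching a keyword. |
| Parameter schema | /api/tool/{name}/params | Provides the JSON structure required by a specific action. |
| Execution engine | /api/tool/execute | The main POST endpoint for running browser commands. |
| Configuration | /api/config/* | Manages LLM profiles, MCP server integrations, and scheduling. |
| File management | /api/files/* | Handles uploads and downloads produced during a browser session. |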
---
name: vibesurf
description: Use when user asks to browse websites, automate browser tasks, fill forms, extract webpage data, search web information, or interact with external apps. This is the main entry point that delegates to detailed reference guides.
homepage: https://github.com/vibesurf-ai/VibeSurf
metadata:
  moltbot:
    requires:
      env: ["VIBESURF_ENDPOINT"]
    primaryEnv: "VIBESURF_ENDPOINT"
---

VibeSurf - Browser Automation

Control real browsers through VibeSurf. This skill delegates to detailed reference guides.

VIBESURF STATUS

Check if VibeSurf is running:

curl $VIBESURF_ENDPOINT/health
  • HTTP 200 → Proceed with vibesurf skills
  • Connection refused → Ask the user to run vibesurf (NEVER run it yourself)

Default endpoint: http://127.0.0.1:9335
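The status check can also be done programmatically. The sketch below uses only the standard library; the function name `vibesurf_status` and the return labels `"proceed"` and `"ask_user"` are hypothetical, chosen to mirror the two outcomes listed above. The `opener` argument exists only so the function can be exercised without a running service.

```python
import urllib.error
import urllib.request
from typing import Callable


def vibesurf_status(endpoint: str, opener: Callable = urllib.request.urlopen) -> str:
    """Return "proceed" on HTTP 200, otherwise "ask_user"."""
    try:
        with opener(endpoint.rstrip("/") + "/health", timeout=5) as response:
            return "proceed" if response.status == 200 else "ask_user"
    except (urllib.error.URLError, ConnectionError, TimeoutError):
        # Connection refused or unreachable: report it and let the user start
        # VibeSurf. The agent must never launch it on its own.
        return "ask_user"
```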

How to Call VibeSurf API

VibeSurf exposes three core HTTP endpoints:

1. List Available Actions

GET $VIBESURF_ENDPOINT/api/tool/search?keyword={optional_keyword}

Returns the available VibeSurf actions. Without a keyword, all actions are listed.

2. Get Action Parameters

GET $VIBESURF_ENDPOINT/api/tool/{action_name}/params

Returns JSON schema for the action's parameters.

3. Execute Action

POST $VIBESURF_ENDPOINT/api/tool/execute
Content-Type: application/json

{
  "action_name": "action_name_here",
  "parameters": {
    // action-specific parameters
  }
}

Workflow:

  1. Search for action → Get action name
  2. Get params schema → See required/optional parameters
  3. Execute → Call with parameters
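As an illustration, the three calls can be built with Python's standard `urllib.request`. The helper names below are hypothetical; only the paths, the `keyword` query parameter, and the `action_name` / `parameters` body fields come from the endpoint descriptions above. The functions construct requests without sending them.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def search_request(endpoint: str, keyword: str = "") -> urllib.request.Request:
    """GET /api/tool/search, with the optional keyword filter."""
    query = "?" + urlencode({"keyword": keyword}) if keyword else ""
    return urllib.request.Request(f"{endpoint}/api/tool/search{query}", method="GET")


def params_request(endpoint: str, action_name: str) -> urllib.request.Request:
    """GET /api/tool/{action_name}/params."""
    name = quote(action_name, safe="")
    return urllib.request.Request(f"{endpoint}/api/tool/{name}/params", method="GET")


def execute_request(endpoint: str, action_name: str, parameters: dict) -> urllib.request.Request:
    """POST /api/tool/execute with a JSON body."""
    body = json.dumps({"action_name": action_name, "parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint}/api/tool/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each returned object can be passed to `urllib.request.urlopen` once the service is confirmed to be running.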

Parameter Error Handling

ALWAYS call GET /api/tool/{action_name}/params before executing ANY action if you are unsure about parameters.
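Once the schema has been fetched, a quick local check can catch obvious mistakes before the execute call. The sketch below assumes the schema follows the usual JSON Schema layout with `properties` and `required` keys; the source only says the endpoint returns a JSON schema, so treat that shape, and the helper name `check_parameters`, as assumptions.

```python
def check_parameters(schema: dict, parameters: dict) -> list:
    """Return a list of problems; an empty list means the call looks well-formed."""
    required = schema.get("required", [])    # assumed JSON Schema key
    known = schema.get("properties", {})     # assumed JSON Schema key
    problems = [
        f"missing required parameter: {name}"
        for name in required
        if name not in parameters
    ]
    if known:
        # Only flag unknown names when the schema actually lists its properties.
        problems += [
            f"unknown parameter: {name}"
            for name in parameters
            if name not in known
        ]
    return problems
```

This does not replace the server's own validation; it only surfaces missing or misspelled names early.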


Which Reference to Read

| Task Type | Read Reference | Action Name |
| --- | --- | --- |
| AI web search | references/search.md | skill_search |
| Fetch URL content as markdown | references/fetch.md | skill_fetch |
| Extract lists/tables | references/js_code.md | skill_code |
| Extract page content | references/crawl.md | skill_crawl |
| Summarize page | references/summary.md | skill_summary |
| Stock/financial data | references/finance.md | skill_finance |
| Trending news | references/trend.md | skill_trend |
| Screenshot | references/screenshot.md | skill_screenshot |
| Precise browser control | references/browser.md | browser.* actions |
| Task-oriented automation (sub-agent) | references/browser-use.md | execute_browser_use_agent |
| Social Media Platform APIs | references/website-api.md | call_website_api |
| Pre-built workflows | references/workflows.md | execute_workflow |
| Gmail/GitHub/Slack | references/integrations.md | execute_extra_tool |
| LLM profile settings | references/config-llm.md | /api/config/llm-profiles/* |
| MCP server config | references/config-mcp.md | /api/config/mcp-profiles/* |
| VibeSurf key/workflows | references/config-vibesurf.md | /api/vibesurf/* |
| Composio key/toolkits | references/config-composio.md | /api/composio/* |
| Schedule workflows | references/config-schedule.md | /api/schedule/* |
| File upload/download | references/file.md | /api/files/* |
| Voice/ASR configuration | references/config-voice.md | /api/voices/* |

Configuration References

| Config Task | Reference | When to Use |
| --- | --- | --- |
| Add/switch LLM | references/config-llm.md | Manage AI model profiles (OpenAI, Anthropic, etc.) |
| Add MCP server | references/config-mcp.md | Configure MCP integrations for extended tools |
| VibeSurf API key | references/config-vibesurf.md | Set up API key, import/export workflows |
| Enable Gmail/GitHub/etc | references/config-composio.md | Configure Composio toolkits and OAuth |
| Schedule workflows | references/config-schedule.md | Set up cron-based workflow automation |
| Voice/ASR profiles | references/config-voice.md | Configure speech recognition profiles |

Note: After configuring Composio or MCP tools, call them as described in references/integrations.md. Tool names follow the pattern cpo.{toolkit}.{action} or mcp.{server}.{action}.
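To make the naming pattern concrete, here is a small parser sketch. The `ToolName` type, the `parse_tool_name` function, and the sample names in the usage note are hypothetical; reading the `cpo` prefix as Composio is inferred from the note above rather than stated outright.

```python
from typing import NamedTuple


class ToolName(NamedTuple):
    source: str  # "cpo" (assumed to mean Composio) or "mcp"
    group: str   # toolkit name for cpo, server name for mcp
    action: str


def parse_tool_name(name: str) -> ToolName:
    """Split "cpo.{toolkit}.{action}" or "mcp.{server}.{action}" into its parts."""
    # Split at most twice so an action containing dots stays in one piece.
    source, group, action = name.split(".", 2)
    if source not in ("cpo", "mcp"):
        raise ValueError(f"unexpected tool prefix: {source!r}")
    return ToolName(source, group, action)
```

A name such as `cpo.gmail.send_email` (a made-up example) would parse into the prefix `cpo`, the toolkit `gmail`, and the action `send_email`.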


Decision Flow

Browser/Web Task
│
├─ Need to search for information/bug/issue? → Read [references/search.md](references/search.md) [PREFERRED]
│  Examples: "Search for solutions to [bug name]", "Find latest info about [topic]"
│
├─ Need to fetch URL content directly? → Read [references/fetch.md](references/fetch.md)
│  Examples: "Fetch content from [URL]", "Get documentation at [URL]", "Read this webpage"
│
├─ Need to open website? → Read [references/browser.md](references/browser.md)
│  Examples: "Open documentation site", "Go to [URL]", "Check this page"
│
├─ Need to extract data?
│  ├─ Lists/tables/repeated items? → Read [references/js_code.md](references/js_code.md)
│  └─ Main content? → Read [references/crawl.md](references/crawl.md)
│
├─ Need summary? → Read [references/summary.md](references/summary.md)
│
├─ Stock/finance data? → Read [references/finance.md](references/finance.md)
│
├─ Trending news? → Read [references/trend.md](references/trend.md)
│
├─ Screenshot? → Read [references/screenshot.md](references/screenshot.md)
│
├─ Need precise control or step-by-step operations? → Read [references/browser.md](references/browser.md)
│  Examples: "Click the button", "Type in the field", "Scroll down"
│
├─ Complex task-oriented automation? → Read [references/browser-use.md](references/browser-use.md)
│  Examples: "Fill out this form", "Extract data from multiple pages"
│
├─ Platform API (XiaoHongShu/Youtube/etc)? → Read [references/website-api.md](references/website-api.md)
│
├─ External app (Gmail/Google Calendar/GitHub)? → Read [references/integrations.md](references/integrations.md)
│
├─ Pre-built workflow? → Read [references/workflows.md](references/workflows.md)
│
└─ Need to configure LLM/MCP/VibeSurf/Composio/Schedule/Voice? → Read config-* references
   - LLM profiles → [references/config-llm.md](references/config-llm.md)
   - MCP servers → [references/config-mcp.md](references/config-mcp.md)
   - VibeSurf key/workflows → [references/config-vibesurf.md](references/config-vibesurf.md)
   - Composio key/toolkits → [references/config-composio.md](references/config-composio.md)
   - Schedule workflows → [references/config-schedule.md](references/config-schedule.md)
   - Voice/ASR profiles → [references/config-voice.md](references/config-voice.md)

Quick Reference

| Goal | Read Reference | Action |
| --- | --- | --- |
| Search web | references/search.md | skill_search |
| Fetch URL content | references/fetch.md | skill_fetch |
| Extract prices/products | references/js_code.md | skill_code |
| Get main content | references/crawl.md | skill_crawl |
| Summarize page | references/summary.md | skill_summary |
| Stock data | references/finance.md | skill_finance |
| Hot topics | references/trend.md | skill_trend |
| Take screenshot | references/screenshot.md | skill_screenshot |
| Click/navigate/type | references/browser.md | browser.click, browser.navigate, etc. |
| Task-oriented automation | references/browser-use.md | execute_browser_use_agent |
| Social Media Platform APIs | references/website-api.md | call_website_api |
| Send email | references/integrations.md | execute_extra_tool |
| Run workflow | references/workflows.md | execute_workflow |
| Configure LLM profiles | references/config-llm.md | /api/config/llm-profiles/* |
| Configure MCP servers | references/config-mcp.md | /api/config/mcp-profiles/* |
| Configure VibeSurf key | references/config-vibesurf.md | /api/vibesurf/verify-key |
| Enable Composio toolkits | references/config-composio.md | /api/composio/toolkits |
| Schedule workflows | references/config-schedule.md | /api/schedule/* |
| Upload/Download files | references/file.md | /api/files/* |
| Configure Voice/ASR | references/config-voice.md | /api/voices/* |
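If the quick reference is needed in code, it maps naturally onto a lookup table. The sketch below copies a subset of the rows above; the `ROUTES` dictionary and the `route` helper are illustrative names, not part of the skill.

```python
# Goal -> (reference file, action name); a subset of the quick reference rows.
ROUTES = {
    "search web": ("references/search.md", "skill_search"),
    "fetch url content": ("references/fetch.md", "skill_fetch"),
    "extract prices/products": ("references/js_code.md", "skill_code"),
    "get main content": ("references/crawl.md", "skill_crawl"),
    "summarize page": ("references/summary.md", "skill_summary"),
    "stock data": ("references/finance.md", "skill_finance"),
    "hot topics": ("references/trend.md", "skill_trend"),
    "take screenshot": ("references/screenshot.md", "skill_screenshot"),
    "task-oriented automation": ("references/browser-use.md", "execute_browser_use_agent"),
    "send email": ("references/integrations.md", "execute_extra_tool"),
    "run workflow": ("references/workflows.md", "execute_workflow"),
}


def route(goal: str) -> tuple:
    """Look up the reference file and action for a goal (case-insensitive)."""
    return ROUTES[goal.strip().lower()]
```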

Common Patterns

| Request | Read Reference | Action |
| --- | --- | --- |
| "Search for X" | references/search.md | skill_search |
| "Fetch content from [URL]" | references/fetch.md | skill_fetch |
| "Extract all prices" | references/js_code.md | skill_code |
| "Summarize this page" | references/summary.md | skill_summary |
| "Stock info for AAPL" | references/finance.md | skill_finance |
| "What's trending" | references/trend.md | skill_trend |
| "Take a screenshot" | references/screenshot.md | skill_screenshot |
| "Navigate and click" | references/browser.md | browser.navigate, browser.click |
| "Fill out this form" | references/browser-use.md or references/browser.md | execute_browser_use_agent or manual browser |
| "Get XiaoHongShu posts" | references/website-api.md | call_website_api |
| "Send Gmail" | references/integrations.md | execute_extra_tool |
| "Run video download" | references/workflows.md | execute_workflow |
| "Configure LLM" | references/config-llm.md | /api/config/llm-profiles |
| "Add MCP server" | references/config-mcp.md | /api/config/mcp-profiles |
| "Set VibeSurf API key" | references/config-vibesurf.md | /api/vibesurf/verify-key |
| "Enable Gmail/GitHub" | references/config-composio.md | /api/composio/toolkits |
| "Schedule workflow" | references/config-schedule.md | /api/schedule/* |
| "Upload file" / "Download file" | references/file.md | /api/files/* |
| "Configure voice profile" / "ASR" | references/config-voice.md | /api/voices/* |
| "Speech to text" / "Transcribe audio" | references/config-voice.md | /api/voices/asr |

Error Handling

| Error | Solution |
| --- | --- |
| VibeSurf not running | Check status: curl $VIBESURF_ENDPOINT/health. If not running: inform the user to run vibesurf. NEVER run the command yourself. |
| Don't know which reference | Read the decision tables above |
| Action not found | Call GET /api/tool/search to list all actions |
| Wrong parameters | Call GET /api/tool/{action_name}/params to see the schema |
| browser-use fails or gets stuck | Fall back to references/browser.md: use get_browser_state → browser.{action} → repeat loop |
| LLM/Crawl/Summary errors | Cause: No LLM profile configured. Solution: Read references/config-llm.md to add an LLM profile first. |
| Integration tools empty/not found | Cause: Composio/MCP not configured. Solution: Read references/config-composio.md or references/config-mcp.md to enable toolkits first. |

Getting Browser State

Check Current Browser State

When the user asks about current page content or browser status (e.g., "What's on the current page?", "What tabs are open?", "What's the browser showing?"), read references/browser.md and use the get_browser_state action.

This is essential when you don't have context about what the user is currently viewing in their browser.


browser vs browser-use

Both can accomplish the same browser tasks - they're complementary:

| Approach | Best For | How It Works |
| --- | --- | --- |
| browser-use (references/browser-use.md) | Complex, long tasks | Task-oriented sub-agent: describe goal + desired output, agent figures out steps |
| browser (references/browser.md) | Precise control | Step-by-step manual control: explicit actions with full visibility |
| Hybrid | Best reliability | Try browser-use first, fall back to browser if it fails |

Fallback pattern when browser-use fails:

browser-use fails or gets stuck
→ Read references/browser.md
→ get_browser_state (inspect page)
→ browser.{action} (perform action)
→ get_browser_state (verify & plan next)
→ repeat until complete
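The fallback pattern above can be expressed as a loop. This is a sketch under stated assumptions: all four callables (`run_agent`, `get_state`, `next_action`, `perform`) are hypothetical hooks the caller would wire to the real actions, failure of the sub-agent is modelled as a raised exception, and `max_steps` is an illustrative safety cap that the source does not mention.

```python
from typing import Any, Callable


def run_with_fallback(
    run_agent: Callable[[], Any],
    get_state: Callable[[], Any],
    next_action: Callable[[Any], Any],
    perform: Callable[[Any], None],
    max_steps: int = 20,
) -> Any:
    """Try the browser-use sub-agent first, then fall back to manual steps."""
    try:
        return run_agent()  # e.g. wraps execute_browser_use_agent
    except Exception:
        pass  # Treat any failure (including a caller-side timeout) as "stuck".
    for _ in range(max_steps):
        state = get_state()          # inspect the page via get_browser_state
        action = next_action(state)  # plan the next browser.{action}
        if action is None:           # None signals that the task is complete
            return state
        perform(action)
    raise RuntimeError("fallback did not finish within max_steps")
```

Capping the loop avoids spinning forever on a page that never reaches the expected state.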

Resources

  • GitHub: https://github.com/vibesurf-ai/VibeSurf
  • Reference Docs: See references/ folder for detailed guides

API Parameter Troubleshooting

If you encounter API parameter errors when calling VibeSurf endpoints, you can visit the interactive API documentation at:

http://127.0.0.1:9335/docs

For example: http://127.0.0.1:9335/docs#/config/create_mcp_profile_api_config_mcp_profiles_post

Note: This is a fallback approach. In most cases, reading the corresponding references/*.md file (e.g., references/config-mcp.md) should provide sufficient guidance on how to use the API correctly. Only refer to the /docs endpoint when the skill documentation doesn't resolve your issue or you need to inspect specific request/response schemas.
