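Apify Ultimate Scraper: AI Web Data Scraping - Openclaw Skills
Author: Internet
2026-04-17
What is Apify Ultimate Scraper?
Apify Ultimate Scraper is a general-purpose tool built for Openclaw Skills that lets developers extract structured data from almost any platform. By integrating with the Apify Actor library, it provides a unified interface for scraping social media, search results, and travel sites. The skill acts as an intelligent layer that selects the most suitable scraping logic for your specific needs, ensuring high reliability and bypassing anti-scraping protections.
Download: https://github.com/openclaw/skills/tree/main/skills/protoss70/test-name-deniz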
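Installation and Download
1. ClawHub CLI
The fastest way to install the skill directly from the source.
npx clawhub@latest install test-name-deniz
2. Manual installation
Copy the skill folder to one of the following locations:
Global: ~/.openclaw/skills/
Workspace: /skills/
Priority: workspace > local > built-in
3. Prompt installation
Copy this prompt into OpenClaw to install the skill automatically.
Please help me install test-name-deniz using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).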
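Apify Ultimate Scraper Use Cases
- Discover influencers on Instagram and TikTok by scraping profiles and engagement metrics.
- Build lead lists from Google Maps and Facebook business pages.
- Monitor competitors by tracking Facebook ads and YouTube channel updates.
- Research market trends using Google Search and Google Trends data.
- Aggregate reviews from TripAdvisor and Booking.com for sentiment analysis.
How Apify Ultimate Scraper Works
- The AI agent analyzes the request and matches it against 55+ specialized scraping Actors.
- The system calls the mcpc tool to fetch the latest input schema for the selected Actor.
- The user states preferences for data volume and output file format.
- A Node.js script runs the scraper, passing the required environment variables and API token.
- The skill processes the results, saves them locally, and suggests follow-up Openclaw Skills steps for data enrichment.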
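Apify Ultimate Scraper Configuration Guide
First, obtain an APIFY_TOKEN and add it to your .env file. Make sure Node.js 20.6 or later is installed, then install the mcpc CLI tool:
npm install -g @apify/mcpc
Verify your configuration by confirming that the environment variables are accessible from your Openclaw Skills environment.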
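Apify Ultimate Scraper Data Schema and Taxonomy
Data organization varies by Actor, but it generally falls into the following categories:
| Data Category | Typical Fields |
|---|---|
| Social profiles | Username, bio, follower count, verification status |
| Post metrics | Like count, comment count, timestamp, URL |
| Business info | Name, address, phone, email, rating, category |
| Search data | Title, description, rank, source URL |
Depending on the user's choice, files are generated as YYYY-MM-DD_filename.csv or YYYY-MM-DD_filename.json.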
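The naming pattern above can be sketched as a small helper. This is only an illustration: the function name datedFilename is hypothetical, and it assumes the date prefix is taken in UTC, which the skill itself may or may not do.

```javascript
// Build an output filename of the form YYYY-MM-DD_name.ext.
// Assumption: the date prefix uses UTC; the actual skill may use local time.
function datedFilename(baseName, format, date = new Date()) {
  if (format !== "csv" && format !== "json") {
    throw new Error(`Unsupported format: ${format}`);
  }
  const day = date.toISOString().slice(0, 10); // first 10 chars are YYYY-MM-DD
  return `${day}_${baseName}.${format}`;
}

console.log(datedFilename("coffee_shops", "csv", new Date(Date.UTC(2026, 3, 17))));
// prints "2026-04-17_coffee_shops.csv"
```

Passing the date explicitly keeps the helper deterministic, which makes it easier to check than one that always reads the clock.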
name: apify-ultimate-scraper
description: Universal AI-powered web scraper for any platform. Scrape data from Instagram, Facebook, TikTok, YouTube, Google Maps, Google Search, Google Trends, Booking.com, and TripAdvisor. Use for lead generation, brand monitoring, competitor analysis, influencer discovery, trend research, content analytics, audience analysis, or any data extraction task.
Universal Web Scraper
AI-driven data extraction from 55+ Actors across all major platforms. This skill automatically selects the best Actor for your task.
Prerequisites
(No need to check it upfront)
- .env file with APIFY_TOKEN
- Node.js 20.6+ (for native --env-file support)
- mcpc CLI tool: npm install -g @apify/mcpc
Workflow
Copy this checklist and track progress:
Task Progress:
- [ ] Step 1: Understand user goal and select Actor
- [ ] Step 2: Fetch Actor schema via mcpc
- [ ] Step 3: Ask user preferences (format, filename)
- [ ] Step 4: Run the scraper script
- [ ] Step 5: Summarize results and offer follow-ups
Step 1: Understand User Goal and Select Actor
First, understand what the user wants to achieve. Then select the best Actor from the options below.
Instagram Actors (12)
| Actor ID | Best For |
|---|---|
| apify/instagram-profile-scraper | Profile data, follower counts, bio info |
| apify/instagram-post-scraper | Individual post details, engagement metrics |
| apify/instagram-comment-scraper | Comment extraction, sentiment analysis |
| apify/instagram-hashtag-scraper | Hashtag content, trending topics |
| apify/instagram-hashtag-stats | Hashtag performance metrics |
| apify/instagram-reel-scraper | Reels content and metrics |
| apify/instagram-search-scraper | Search users, places, hashtags |
| apify/instagram-tagged-scraper | Posts tagged with specific accounts |
| apify/instagram-followers-count-scraper | Follower count tracking |
| apify/instagram-scraper | Comprehensive Instagram data |
| apify/instagram-api-scraper | API-based Instagram access |
| apify/export-instagram-comments-posts | Bulk comment/post export |
Facebook Actors (14)
| Actor ID | Best For |
|---|---|
| apify/facebook-pages-scraper | Page data, metrics, contact info |
| apify/facebook-page-contact-information | Emails, phones, addresses from pages |
| apify/facebook-posts-scraper | Post content and engagement |
| apify/facebook-comments-scraper | Comment extraction |
| apify/facebook-likes-scraper | Reaction analysis |
| apify/facebook-reviews-scraper | Page reviews |
| apify/facebook-groups-scraper | Group content and members |
| apify/facebook-events-scraper | Event data |
| apify/facebook-ads-scraper | Ad creative and targeting |
| apify/facebook-search-scraper | Search results |
| apify/facebook-reels-scraper | Reels content |
| apify/facebook-photos-scraper | Photo extraction |
| apify/facebook-marketplace-scraper | Marketplace listings |
| apify/facebook-followers-following-scraper | Follower/following lists |
TikTok Actors (14)
| Actor ID | Best For |
|---|---|
| clockworks/tiktok-scraper | Comprehensive TikTok data |
| clockworks/free-tiktok-scraper | Free TikTok extraction |
| clockworks/tiktok-profile-scraper | Profile data |
| clockworks/tiktok-video-scraper | Video details and metrics |
| clockworks/tiktok-comments-scraper | Comment extraction |
| clockworks/tiktok-followers-scraper | Follower lists |
| clockworks/tiktok-user-search-scraper | Find users by keywords |
| clockworks/tiktok-hashtag-scraper | Hashtag content |
| clockworks/tiktok-sound-scraper | Trending sounds |
| clockworks/tiktok-ads-scraper | Ad content |
| clockworks/tiktok-discover-scraper | Discover page content |
| clockworks/tiktok-explore-scraper | Explore content |
| clockworks/tiktok-trends-scraper | Trending content |
| clockworks/tiktok-live-scraper | Live stream data |
YouTube Actors (5)
| Actor ID | Best For |
|---|---|
| streamers/youtube-scraper | Video data and metrics |
| streamers/youtube-channel-scraper | Channel information |
| streamers/youtube-comments-scraper | Comment extraction |
| streamers/youtube-shorts-scraper | Shorts content |
| streamers/youtube-video-scraper-by-hashtag | Videos by hashtag |
Google Maps Actors (4)
| Actor ID | Best For |
|---|---|
| compass/crawler-google-places | Business listings, ratings, contact info |
| compass/google-maps-extractor | Detailed business data |
| compass/Google-Maps-Reviews-Scraper | Review extraction |
| poidata/google-maps-email-extractor | Email discovery from listings |
Other Actors (6)
| Actor ID | Best For |
|---|---|
| apify/google-search-scraper | Google search results |
| apify/google-trends-scraper | Google Trends data |
| voyager/booking-scraper | Booking.com hotel data |
| voyager/booking-reviews-scraper | Booking.com reviews |
| maxcopell/tripadvisor-reviews | TripAdvisor reviews |
| vdrmota/contact-info-scraper | Contact enrichment from URLs |
Actor Selection by Use Case
| Use Case | Primary Actors |
|---|---|
| Lead Generation | compass/crawler-google-places, poidata/google-maps-email-extractor, vdrmota/contact-info-scraper |
| Influencer Discovery | apify/instagram-profile-scraper, clockworks/tiktok-profile-scraper, streamers/youtube-channel-scraper |
| Brand Monitoring | apify/instagram-tagged-scraper, apify/instagram-hashtag-scraper, compass/Google-Maps-Reviews-Scraper |
| Competitor Analysis | apify/facebook-pages-scraper, apify/facebook-ads-scraper, apify/instagram-profile-scraper |
| Content Analytics | apify/instagram-post-scraper, clockworks/tiktok-scraper, streamers/youtube-scraper |
| Trend Research | apify/google-trends-scraper, clockworks/tiktok-trends-scraper, apify/instagram-hashtag-stats |
| Review Analysis | compass/Google-Maps-Reviews-Scraper, voyager/booking-reviews-scraper, maxcopell/tripadvisor-reviews |
| Audience Analysis | apify/instagram-followers-count-scraper, clockworks/tiktok-followers-scraper, apify/facebook-followers-following-scraper |
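As a rough sketch, the use-case table can be held as a lookup so an agent can resolve candidate Actors programmatically. The Actor IDs are copied from the table; the names PRIMARY_ACTORS and primaryActorsFor are hypothetical, and only three rows are shown to keep the example short.

```javascript
// Lookup from use case to primary Actors, mirroring part of the table above.
// Keys are stored lowercased so matching is case-insensitive.
const PRIMARY_ACTORS = new Map([
  ["lead generation", [
    "compass/crawler-google-places",
    "poidata/google-maps-email-extractor",
    "vdrmota/contact-info-scraper",
  ]],
  ["trend research", [
    "apify/google-trends-scraper",
    "clockworks/tiktok-trends-scraper",
    "apify/instagram-hashtag-stats",
  ]],
  ["review analysis", [
    "compass/Google-Maps-Reviews-Scraper",
    "voyager/booking-reviews-scraper",
    "maxcopell/tripadvisor-reviews",
  ]],
]);

// Return the primary Actors for a use case, or an empty array when the
// use case is not listed (the caller would then search the Apify Store).
function primaryActorsFor(useCase) {
  return PRIMARY_ACTORS.get(useCase.trim().toLowerCase()) ?? [];
}

console.log(primaryActorsFor("Lead Generation")[0]);
// prints "compass/crawler-google-places"
```

A Map is used rather than a plain object so that inputs such as "constructor" cannot accidentally match inherited properties.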
Multi-Actor Workflows
For complex tasks, chain multiple Actors:
| Workflow | Step 1 | Step 2 |
|---|---|---|
| Lead enrichment | compass/crawler-google-places → | vdrmota/contact-info-scraper |
| Influencer vetting | apify/instagram-profile-scraper → | apify/instagram-comment-scraper |
| Competitor deep-dive | apify/facebook-pages-scraper → | apify/facebook-posts-scraper |
| Local business analysis | compass/crawler-google-places → | compass/Google-Maps-Reviews-Scraper |
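Chaining two Actors mostly comes down to reshaping the first run's results into the second Actor's input. The sketch below illustrates the lead enrichment row under two loud assumptions: that Step 1 items carry a website field and that the Step 2 Actor accepts a startUrls array of { url } objects. Neither is stated in this document, so both should be checked against the schema fetched in Step 2 of the workflow. The function name buildEnrichmentInput is hypothetical.

```javascript
// Turn Step 1 results into Step 2 input for a lead enrichment chain.
// ASSUMPTION: items expose a URL under `urlField` (default "website").
// ASSUMPTION: the second Actor accepts { startUrls: [{ url }, ...] }.
function buildEnrichmentInput(places, urlField = "website") {
  const seen = new Set(); // a Set removes duplicates and keeps first-seen order
  for (const place of places) {
    const url = place[urlField];
    if (typeof url === "string" && url.trim() !== "") {
      seen.add(url.trim());
    }
  }
  return { startUrls: [...seen].map((url) => ({ url })) };
}

const sample = [
  { title: "Cafe A", website: "https://a.example" },
  { title: "Cafe B" }, // no website, skipped
  { title: "Cafe A (branch)", website: "https://a.example" }, // duplicate, skipped
];
console.log(JSON.stringify(buildEnrichmentInput(sample)));
// prints {"startUrls":[{"url":"https://a.example"}]}
```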
Can't Find a Suitable Actor?
If none of the Actors above match the user's request, search the Apify Store directly:
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com --header "Authorization: Bearer $APIFY_TOKEN" tools-call search-actors keywords:="SEARCH_KEYWORDS" limit:=10 offset:=0 category:="" | jq -r '.content[0].text'
Replace SEARCH_KEYWORDS with 1-3 simple terms (e.g., "LinkedIn profiles", "Amazon products", "Twitter").
Step 2: Fetch Actor Schema
Fetch the Actor's input schema and details dynamically using mcpc:
export $(grep APIFY_TOKEN .env | xargs) && mcpc --json mcp.apify.com --header "Authorization: Bearer $APIFY_TOKEN" tools-call fetch-actor-details actor:="ACTOR_ID" | jq -r ".content"
Replace ACTOR_ID with the selected Actor (e.g., compass/crawler-google-places).
This returns:
- Actor description and README
- Required and optional input parameters
- Output fields (if available)
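If the lookup is driven from Node.js instead of a shell one-liner, the same mcpc invocation can be expressed as an argument array, which avoids shell quoting and the export/grep step. This is a minimal sketch: the flags and tool name are taken from the command above, the function name buildFetchDetailsArgs is hypothetical, and the token is assumed to be supplied by the caller (for example from process.env.APIFY_TOKEN).

```javascript
// Build the argument list for `mcpc` equivalent to the shell command above.
// After shell quote removal, actor:="ACTOR_ID" reaches mcpc as actor:=ACTOR_ID.
function buildFetchDetailsArgs(actorId, token) {
  if (!token) {
    throw new Error("APIFY_TOKEN not found");
  }
  return [
    "--json",
    "mcp.apify.com",
    "--header",
    `Authorization: Bearer ${token}`,
    "tools-call",
    "fetch-actor-details",
    `actor:=${actorId}`,
  ];
}

// The array could then be passed to child_process.execFileSync("mcpc", args),
// and the `content` property of the parsed JSON output read, mirroring the
// jq filter in the shell version. That call is left out so the sketch runs
// without network access.
console.log(buildFetchDetailsArgs("compass/crawler-google-places", "TOKEN").join(" "));
```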
Step 3: Ask User Preferences
Before running, ask:
- Output format:
- Quick answer - Display top few results in chat (no file saved)
- CSV - Full export with all fields
- JSON - Full export in JSON format
- Number of results: Based on the nature of the use case
Step 4: Run the Script
Quick answer (display in chat, no file):
node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js \
--actor "ACTOR_ID" \
--input 'JSON_INPUT'
CSV:
node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js \
--actor "ACTOR_ID" \
--input 'JSON_INPUT' \
--output YYYY-MM-DD_OUTPUT_FILE.csv \
--format csv
JSON:
node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js \
--actor "ACTOR_ID" \
--input 'JSON_INPUT' \
--output YYYY-MM-DD_OUTPUT_FILE.json \
--format json
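The three invocations differ only in whether --output and --format are present, so they can be generated by one helper. The sketch below is an assumption-laden illustration: buildRunArgs and its option names are hypothetical, the flags are the ones shown above, and the input object is a placeholder (real field names come from the schema fetched in Step 2).

```javascript
// Build the argument list for `node` matching the invocations above.
// format "quick" means display in chat: no --output or --format is added.
function buildRunArgs({ pluginRoot, actorId, input, format = "quick", outputFile }) {
  const args = [
    "--env-file=.env",
    `${pluginRoot}/reference/scripts/run_actor.js`,
    "--actor",
    actorId,
    "--input",
    JSON.stringify(input), // serializing an object guarantees valid JSON
  ];
  if (format === "csv" || format === "json") {
    if (!outputFile) {
      throw new Error(`outputFile is required for format ${format}`);
    }
    args.push("--output", outputFile, "--format", format);
  } else if (format !== "quick") {
    throw new Error(`Unsupported format: ${format}`);
  }
  return args;
}

// Hypothetical input: `exampleField` is a placeholder, not a real Actor parameter.
const runArgs = buildRunArgs({
  pluginRoot: "/path/to/plugin",
  actorId: "ACTOR_ID",
  input: { exampleField: "value" },
  format: "csv",
  outputFile: "2026-04-17_results.csv",
});
console.log(runArgs.slice(-4).join(" "));
// prints "--output 2026-04-17_results.csv --format csv"
```

Passing the array to child_process.execFile("node", runArgs) sidesteps shell quoting of the JSON input, which is the part of the one-liners most likely to go wrong.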
Step 5: Summarize Results and Offer Follow-ups
After completion, report:
- Number of results found
- File location and name
- Key fields available
- Suggested follow-up workflows based on results:
| If User Got | Suggest Next |
|---|---|
| Business listings | Enrich with vdrmota/contact-info-scraper or get reviews |
| Influencer profiles | Analyze engagement with comment scrapers |
| Competitor pages | Deep-dive with post/ad scrapers |
| Trend data | Validate with platform-specific hashtag scrapers |
Error Handling
- APIFY_TOKEN not found - Ask user to create .env with APIFY_TOKEN=your_token
- mcpc not found - Ask user to install it: npm install -g @apify/mcpc
- Actor not found - Check Actor ID spelling
- Run FAILED - Ask user to check Apify console link in error output
- Timeout - Reduce input size or increase --timeout
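The error cases above can be sketched as a pattern-to-advice table. This assumes the error text contains the phrases listed in this section; the exact strings emitted by run_actor.js or mcpc are not shown in this document, so the patterns are illustrative, and the names ERROR_ADVICE and adviceFor are hypothetical.

```javascript
// Map known error phrases to the suggested remedy from the list above.
// ASSUMPTION: real error messages contain these phrases; adjust as needed.
const ERROR_ADVICE = [
  [/APIFY_TOKEN not found/i, "Create a .env file containing APIFY_TOKEN=your_token"],
  [/mcpc not found/i, "Install mcpc: npm install -g @apify/mcpc"],
  [/Actor not found/i, "Check the Actor ID spelling"],
  [/Run FAILED/i, "Check the Apify console link in the error output"],
  [/timeout/i, "Reduce input size or increase --timeout"],
];

// Return the first matching piece of advice, or null for unknown errors.
function adviceFor(message) {
  const match = ERROR_ADVICE.find(([pattern]) => pattern.test(message));
  return match ? match[1] : null;
}

console.log(adviceFor("Error: Actor not found: apify/instagram-scrapr"));
// prints "Check the Actor ID spelling"
```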