Jina AI Skill: Web Reading, Search, and DeepSearch



Author: Internet

2026-03-24


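What Is the Jina AI Reader and Search Skill?

The Jina AI skill for Openclaw Skills gives AI agents an interface for working with web content. Using Jina's APIs, an agent can skip complex HTML structure and retrieve clean, Markdown-formatted text from any URL, including JavaScript-heavy sites and PDF documents. The skill acts as a bridge between raw web pages and large language models, delivering data in a readable format.

Beyond page reading, this Openclaw Skills integration also includes web search and a multi-step research agent. With these tools an agent can run web searches that return LLM-optimized results, or use DeepSearch, which combines search, reading, and reasoning to answer complex queries that require synthesizing multiple sources.

Download: https://github.com/openclaw/skills/tree/main/skills/adhishthite/jina-ai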

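Installation and Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install jina-ai

2. Manual Installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in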
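The precedence rule above can be pictured as a first-match lookup over an ordered list of directories. This is only a sketch: `resolve_skill` and the idea of passing the directories explicitly are hypothetical, and the source does not show how OpenClaw actually resolves skills.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_skill(name: str, search_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the first directory in search_dirs that contains the skill.

    search_dirs must be ordered from highest to lowest priority
    (workspace first, then local, then built-in). This helper is
    hypothetical and only illustrates the precedence rule.
    """
    for base in search_dirs:
        candidate = base / name
        if candidate.is_dir():
            return candidate
    return None
```

With a workspace copy and a global copy of `jina-ai` both present, passing the workspace directory first makes the workspace copy win.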

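3. Prompt-Based Installation

Copy this prompt into OpenClaw to install the skill automatically.

Please help me install jina-ai using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).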

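Jina AI Reader and Search: Use Cases

Convert documentation pages or blog posts into clean Markdown for RAG or context injection.
Run web searches that return full page content for the agent to analyze.
Use the DeepSearch agent for in-depth, multi-source research on technical topics or current events.
Extract specific data from a web page by targeting custom CSS selectors.
Parse remote PDF files into text directly inside Openclaw Skills for document processing.

Jina AI Reader and Search: How It Works

The agent or user invokes one of the helper scripts (reader, search, or deepsearch) with a target URL or query.
The skill attaches JINA_API_KEY to the request headers to authenticate with the Jina AI service.
For URL reading, the request goes to the Reader API, which renders the page and strips non-core elements such as ads and navigation.
For search, the query is handled by the Search API, which finds the most relevant web results and parses them into Markdown.
For complex tasks, DeepSearch runs a multi-step reasoning chain that collects and synthesizes data from across the web.
The resulting clean text or structured JSON is returned to the environment for the agent to use in its workflow.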
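The first three steps above (take a URL, attach the key, call the Reader API) can be sketched in Python with only the standard library. The function names are illustrative and are not taken from the skill's own scripts; the endpoint and headers are the ones this page documents.

```python
import os
import urllib.request

READER_BASE = "https://r.jina.ai/"


def build_reader_request(
    target_url: str, api_key: str, as_json: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a Reader API request for target_url."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json" if as_json else "text/plain",
    }
    return urllib.request.Request(READER_BASE + target_url, headers=headers)


def fetch_page(target_url: str) -> str:
    """Send the request; needs network access and JINA_API_KEY in the environment."""
    request = build_reader_request(target_url, os.environ["JINA_API_KEY"])
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")
```

Separating request construction from sending keeps the header logic easy to inspect without touching the network.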

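Jina AI Reader and Search: Configuration Guide

To integrate this skill into your Openclaw Skills environment, follow these steps:

# 1. Get your API key from the Jina AI dashboard (https://jina.ai/)
# 2. Set the environment variable
export JINA_API_KEY="your_jina_api_key_here"

# 3. Test the reader script
./scripts/jina-reader.sh https://example.com

# 4. Test the search script
./scripts/jina-search.sh "latest news about Openclaw Skills"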

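Jina AI Reader and Search: Data Schema and Taxonomy

The skill organizes web data into structured formats suited to consumption by AI models. The table below describes the main outputs:

Output format	Description
Text/Markdown	Default output; a clean, condensed version of the page content.
JSON object	Includes metadata such as `title`, `url`, `content`, and `timestamp`.
DeepSearch response	An OpenAI-compatible chat completion object containing the synthesized research results.
Screenshot	A visual capture of the page, returned when the `X-Respond-With: screenshot` header is set.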
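A minimal sketch of consuming the JSON output follows. It reads the four fields named in the table; the `Page` class and `parse_reader_json` are hypothetical names, and the code assumes (without confirmation from this page) that the fields may appear either at the top level or nested under a `data` key.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    title: str
    url: str
    content: str
    timestamp: Optional[str] = None


def parse_reader_json(body: str) -> Page:
    """Parse a Reader JSON response body into a Page.

    Assumption: the fields sit at the top level or under a "data" key.
    Missing text fields fall back to an empty string.
    """
    payload = json.loads(body)
    record = payload.get("data", payload)
    return Page(
        title=record.get("title", ""),
        url=record.get("url", ""),
        content=record.get("content", ""),
        timestamp=record.get("timestamp"),
    )
```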

name: jina
description: Web reading and searching via Jina AI APIs. Fetch clean markdown from URLs (r.jina.ai), web search (s.jina.ai), or deep multi-step research (DeepSearch).
homepage: "https://github.com/adhishthite/jina-ai-skill"
metadata:
  {
    "clawdbot":
      {
        "emoji": "??",
        "requires": { "env": ["JINA_API_KEY"] },
        "primaryEnv": "JINA_API_KEY",
        "files": ["scripts/*"],
      },
  }

Jina AI — Reader, Search & DeepSearch

Web reading and search powered by Jina AI. Requires JINA_API_KEY environment variable.

Trust & Privacy: By using this skill, URLs and queries are transmitted to Jina AI (jina.ai). Only install if you trust Jina with your data.

Model Invocation: This skill may be invoked autonomously by the model without explicit user trigger (standard for integration skills). If you prefer manual-only invocation, disable model invocation in your OpenClaw skill settings.

Get your API key: https://jina.ai/ → Dashboard → API Keys

External Endpoints

This skill makes HTTP requests to the following external endpoints only:

Endpoint	URL Pattern	Purpose
Reader API	`https://r.jina.ai/{url}`	Sends URL content request to Jina for conversion to markdown
Search API	`https://s.jina.ai/{query}`	Sends search query to Jina for web search results
DeepSearch API	`https://deepsearch.jina.ai/v1/chat/completions`	Sends research question to Jina for multi-step research

No other external network calls are made by this skill.

Security & Privacy

Authentication: Only your JINA_API_KEY is transmitted to Jina's servers (via Authorization header)
Data sent: URLs and search queries you provide are sent to Jina's servers for processing
Local files: No local files are read or transmitted by this skill
Local storage: No data is stored locally beyond stdout output
Environment access: Scripts only access the JINA_API_KEY environment variable; no other env vars are read
Cookies: Cookies are not forwarded by default; the X-Set-Cookie header is available for authenticated content but is opt-in only

Endpoints

Endpoint	Base URL	Purpose
Reader	`https://r.jina.ai/{url}`	Convert any URL → clean markdown
Search	`https://s.jina.ai/{query}`	Web search with LLM-friendly results
DeepSearch	`https://deepsearch.jina.ai/v1/chat/completions`	Multi-step research agent

All endpoints accept Authorization: Bearer $JINA_API_KEY.

Reader API (`r.jina.ai`)

Fetches any URL and returns clean, LLM-friendly content. Works with web pages, PDFs, and JS-heavy sites.

Basic Usage

# Plain text output
curl -s "https://r.jina.ai/https://example.com" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: text/plain"

# JSON output (includes url, title, content, timestamp)
curl -s "https://r.jina.ai/https://example.com" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: application/json"

Or use the helper script: scripts/jina-reader.sh [--json]

Parameters (via headers or query params)

Content Control

Header	Query Param	Values	Default	Description
`X-Respond-With`	`respondWith`	`content`, `markdown`, `html`, `text`, `screenshot`, `pageshot`, `vlm`, `readerlm-v2`	`content`	Output format
`X-Retain-Images`	`retainImages`	`none`, `all`, `alt`, `all_p`, `alt_p`	`all`	Image handling
`X-Retain-Links`	`retainLinks`	`none`, `all`, `text`, `gpt-oss`	`all`	Link handling
`X-With-Generated-Alt`	`withGeneratedAlt`	`true`/`false`	`false`	Auto-caption images
`X-With-Links-Summary`	`withLinksSummary`	`true`	-	Append links section
`X-With-Images-Summary`	`withImagesSummary`	`true`/`false`	`false`	Append images section
`X-Token-Budget`	`tokenBudget`	number	-	Max tokens for response
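The content-control options can be validated before a request is sent. The sketch below maps a few of them to headers and rejects values outside the sets listed in the table; `content_headers` is a hypothetical helper, and omitting headers that equal the documented default is a design choice of this example, not documented API behavior.

```python
from typing import Dict, Optional

RESPOND_WITH = {"content", "markdown", "html", "text",
                "screenshot", "pageshot", "vlm", "readerlm-v2"}
RETAIN_IMAGES = {"none", "all", "alt", "all_p", "alt_p"}
RETAIN_LINKS = {"none", "all", "text", "gpt-oss"}


def content_headers(
    respond_with: str = "content",
    retain_images: str = "all",
    retain_links: str = "all",
    token_budget: Optional[int] = None,
) -> Dict[str, str]:
    """Return Reader headers for the options that differ from the defaults."""
    if respond_with not in RESPOND_WITH:
        raise ValueError(f"unsupported respond_with: {respond_with!r}")
    if retain_images not in RETAIN_IMAGES:
        raise ValueError(f"unsupported retain_images: {retain_images!r}")
    if retain_links not in RETAIN_LINKS:
        raise ValueError(f"unsupported retain_links: {retain_links!r}")

    headers: Dict[str, str] = {}
    if respond_with != "content":
        headers["X-Respond-With"] = respond_with
    if retain_images != "all":
        headers["X-Retain-Images"] = retain_images
    if retain_links != "all":
        headers["X-Retain-Links"] = retain_links
    if token_budget is not None:
        headers["X-Token-Budget"] = str(token_budget)
    return headers
```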

CSS Selectors

Header	Query Param	Description
`X-Target-Selector`	`targetSelector`	Only extract matching elements
`X-Wait-For-Selector`	`waitForSelector`	Wait for elements before extracting
`X-Remove-Selector`	`removeSelector`	Remove elements before extraction
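Selector headers take ordinary CSS selector lists, so several selectors can be joined with commas. A small sketch, assuming a hypothetical `selector_headers` helper that accepts each group as a list of strings:

```python
from typing import Dict, Sequence


def selector_headers(
    target: Sequence[str] = (),
    wait_for: Sequence[str] = (),
    remove: Sequence[str] = (),
) -> Dict[str, str]:
    """Build selector headers, joining each list into one CSS selector list.

    Pass lists or tuples of selectors; empty groups produce no header.
    """
    groups = (
        ("X-Target-Selector", target),
        ("X-Wait-For-Selector", wait_for),
        ("X-Remove-Selector", remove),
    )
    return {name: ", ".join(selectors) for name, selectors in groups if selectors}
```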

Browser & Network

Header	Query Param	Description
`X-Timeout`	`timeout`	Page load timeout (1-180s)
`X-Respond-Timing`	`respondTiming`	When page is "ready" (`html`, `network-idle`, etc.)
`X-No-Cache`	`noCache`	Bypass cached content
`X-Proxy`	`proxy`	Country code or `auto` for proxy
`X-Set-Cookie`	`setCookies`	Forward cookies for authenticated content

Common Patterns

# Extract main content, remove navigation elements
curl -s "https://r.jina.ai/https://example.com/article" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "X-Retain-Images: none" \
  -H "X-Remove-Selector: nav, footer, .sidebar, .ads" \
  -H "Accept: text/plain"

# Extract specific section
curl -s "https://r.jina.ai/https://example.com" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "X-Target-Selector: article.main-content"

# Parse a PDF
curl -s "https://r.jina.ai/https://example.com/paper.pdf" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: text/plain"

# Wait for dynamic content
curl -s "https://r.jina.ai/https://spa-app.com" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "X-Wait-For-Selector: .loaded-content" \
  -H "X-Respond-Timing: network-idle"

Search API (`s.jina.ai`)

Web search returning LLM-friendly results with full page content.

Basic Usage

# Plain text
curl -s "https://s.jina.ai/your+search+query" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: text/plain"

# JSON
curl -s "https://s.jina.ai/your+search+query" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: application/json"

Or use the helper script: scripts/jina-search.sh "" [--json]

Search Parameters

Param	Values	Description
`site`	domain	Limit to specific site
`type`	`web`, `images`, `news`	Search type
`num` / `count`	0-20	Number of results
`gl`	country code	Geo-location (e.g. `us`, `in`)
`filetype`	extension	Filter by file type
`intitle`	string	Must appear in title

All Reader parameters also work on search results.
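Search URLs combine a `+`-encoded query in the path with the parameters from the table as a query string. The sketch below builds such a URL with `urllib.parse`; `build_search_url` is a hypothetical helper, and it only covers the parameters listed above.

```python
from typing import Dict, Optional
from urllib.parse import quote_plus, urlencode

SEARCH_BASE = "https://s.jina.ai/"
SEARCH_TYPES = {"web", "images", "news"}


def build_search_url(
    query: str,
    site: Optional[str] = None,
    search_type: Optional[str] = None,
    num: Optional[int] = None,
    gl: Optional[str] = None,
    filetype: Optional[str] = None,
    intitle: Optional[str] = None,
) -> str:
    """Build a Search API URL from a query and optional filters."""
    if search_type is not None and search_type not in SEARCH_TYPES:
        raise ValueError(f"unsupported search type: {search_type!r}")
    if num is not None and not 0 <= num <= 20:
        raise ValueError("num must be between 0 and 20")

    candidates = {
        "site": site, "type": search_type, "num": num,
        "gl": gl, "filetype": filetype, "intitle": intitle,
    }
    # Keep only the filters that were actually provided.
    params: Dict[str, object] = {k: v for k, v in candidates.items() if v is not None}

    url = SEARCH_BASE + quote_plus(query)
    if params:
        url += "?" + urlencode(params)
    return url
```

For example, `build_search_url("latest AI news", search_type="news", num=5)` yields the same URL as the news search pattern on this page.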

Common Patterns

# Site-scoped search
curl -s "https://s.jina.ai/OpenAI+GPT-5?site=reddit.com" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: text/plain"

# News search
curl -s "https://s.jina.ai/latest+AI+news?type=news&num=5" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Accept: application/json"

# Search for PDFs
curl -s "https://s.jina.ai/machine+learning+survey?filetype=pdf&num=5" \
  -H "Authorization: Bearer $JINA_API_KEY"

DeepSearch

Multi-step research agent that combines search + reading + reasoning. OpenAI-compatible chat completions API.

curl -s "https://deepsearch.jina.ai/v1/chat/completions" \
  -H "Authorization: Bearer $JINA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jina-deepsearch-v1",
    "messages": [{"role": "user", "content": "Your research question here"}],
    "stream": false
  }'

Or use the helper script: scripts/jina-deepsearch.sh ""

Use for complex research requiring multiple sources and reasoning chains.
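Since the endpoint is described as OpenAI-compatible, a non-streaming call can be sketched as a JSON POST whose answer is read from the first choice. The function names are illustrative; the response shape (`choices[0].message.content`) is the standard chat completion layout and is assumed here, not confirmed by this page.

```python
import json
import urllib.request

DEEPSEARCH_URL = "https://deepsearch.jina.ai/v1/chat/completions"


def build_deepsearch_request(question: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming DeepSearch request."""
    body = json.dumps({
        "model": "jina-deepsearch-v1",
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(DEEPSEARCH_URL, data=body, headers=headers, method="POST")


def extract_answer(response_body: str) -> str:
    """Return the assistant text from a chat completion response body."""
    payload = json.loads(response_body)
    return payload["choices"][0]["message"]["content"]
```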

Helper Scripts

Script	Purpose
`scripts/jina-reader.sh`	Read any URL as markdown
`scripts/jina-search.sh`	Web search
`scripts/jina-deepsearch.sh`	Deep multi-step research
`scripts/jina-reader.py`	Python reader (no deps beyond stdlib)

Rate Limits

Free (no key): 20 RPM
With API key: Higher limits, token-based pricing
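To stay under the keyless limit of 20 requests per minute, a client can space calls at least three seconds apart. The class below is a simple client-side sketch, not part of the skill; the clock and sleep functions are injectable so the pacing logic can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Space calls evenly so they never exceed requests_per_minute."""

    def __init__(
        self,
        requests_per_minute: int = 20,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None:
            delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # The next slot opens one interval after this request goes out.
        self._next_allowed = now + delay + self.interval
        return delay
```

Calling `limiter.wait()` before each request is enough; the first call returns immediately and later calls pause only as long as needed.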

API Docs

Reader: https://jina.ai/reader
Search: https://s.jina.ai/docs
OpenAPI specs: https://r.jina.ai/openapi.json | https://s.jina.ai/openapi.json

When to Use

Need	Use
Fetch a URL as markdown	Reader — better than web_fetch for JS-heavy sites
Web search	Search — LLM-friendly results
Complex multi-source research	DeepSearch
Parse a PDF from URL	Reader — pass PDF URL directly
Screenshot a page	Reader with `X-Respond-With: screenshot`
Extract structured data	Reader with `jsonSchema` param
