DuckDuckGo Web Search: Private Search for AI Agents - Openclaw Skills

Author: Internet

2026-03-31

AI Tutorials

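What is DuckDuckGo Web Search?

The DuckDuckGo Web Search skill gives AI agents a secure, anonymous interface to information on the web. It combines the DuckDuckGo API with HTML scraping, so this integration in the Openclaw Skills ecosystem can retrieve information without an API key. It suits developers who prioritize privacy and need a reliable way to verify facts or discover new URLs.

The skill is designed to handle a range of search intents, from simple queries to complex research tasks. It combines results from instant answers, related topics, and the deeper web index, so your AI agent gets the most relevant data with zero search history and no user profiling. Using this skill as part of the Openclaw Skills library keeps your workflow private and efficient.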

Download: https://github.com/openclaw/skills/tree/main/skills/omprasad122007-rgb/ddg-search-privacy

Installation and Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install ddg-search-privacy

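2. Manual Installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt-Based Installation

Copy this prompt into OpenClaw to install the skill automatically.

Please help me install ddg-search-privacy using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).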

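DuckDuckGo Web Search Use Cases

  • Private web research and automated fact-checking for AI responses.
  • Finding technical documentation with site-specific search operators.
  • Monitoring current events and news updates with the news search mode.
  • Discovering specific file types, such as PDF reports or Markdown documents.
  • Gathering image assets and thumbnails for visual content workflows.

How DuckDuckGo Web Search Works

  1. The AI agent identifies a need for external data and triggers the search skill.
  2. The query is refined with DuckDuckGo search syntax, such as quotes for exact matches or site-specific filters.
  3. The skill sends a request to the DuckDuckGo JSON API (for instant answers) or to the HTML interface (for full result lists).
  4. The results are parsed and filtered to extract the essential metadata: title, snippet, and source URL.
  5. The collected information is returned in a structured format, so the agent can give grounded, accurate answers.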
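
Steps 2 and 3 above can be sketched as a small pure function that refines the query and picks an endpoint. This is a minimal sketch, not the skill's actual implementation: the name `plan_request` and its `want_instant_answer` and `site` parameters are hypothetical, while the two endpoints and their parameters are the ones used in the code later in this document.

```python
from typing import Dict, Optional, Tuple

INSTANT_ANSWER_URL = "https://api.duckduckgo.com/"
HTML_URL = "https://duckduckgo.com/html/"


def plan_request(query: str, want_instant_answer: bool,
                 site: Optional[str] = None) -> Tuple[str, Dict[str, object]]:
    """Return (endpoint, params) for a query. Hypothetical helper."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    if site:
        # Step 2: refine the query with a site-specific filter.
        query = f"site:{site} {query}"
    if want_instant_answer:
        # Step 3a: the JSON API serves instant answers.
        return INSTANT_ANSWER_URL, {"q": query, "format": "json", "no_html": 1}
    # Step 3b: the HTML interface serves full result lists.
    return HTML_URL, {"q": query}
```

Keeping the routing decision separate from the network call makes it easy to check without sending any requests.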

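DuckDuckGo Web Search Configuration Guide

To start using this search capability, make sure the required Python dependencies are installed in your environment. The skill relies on the requests library for API communication and on BeautifulSoup for result extraction.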

pip install requests beautifulsoup4

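Once installation is complete, you can import the search functions into your project. Because no API key is required, you can start querying right away using the standard Openclaw Skills integration pattern.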
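
Before importing the search functions, it can help to confirm that both dependencies are importable. The sketch below is one way to do that with the standard library; `missing_packages` is a hypothetical helper name, and the mapping reflects that BeautifulSoup is installed as `beautifulsoup4` but imported as `bs4`.

```python
import importlib.util
from typing import Dict, List, Optional

# Maps import name -> pip package name (the two differ for BeautifulSoup).
REQUIRED: Dict[str, str] = {"requests": "requests", "bs4": "beautifulsoup4"}


def missing_packages(required: Optional[Dict[str, str]] = None) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    required = REQUIRED if required is None else required
    return [pip_name for module, pip_name in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
```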

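DuckDuckGo Web Search Data Schema and Taxonomy

The skill returns structured data designed for LLMs and automated systems. The main data structure for web results is:

Attribute  Type    Description
title      string  Title of the search result or website
url        string  Direct hyperlink to the source
snippet    string  Concise summary or excerpt of the page content
type       string  Classification of the result (e.g., instant_answer, related)

For image searches, the schema is extended with thumbnail and source attributes to support media handling.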
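
The schema above could be written down as typed dictionaries, which lets a type checker or a small runtime check catch malformed results. This is a sketch under assumptions: the class names `WebResult` and `ImageResult` and the helper `is_web_result` are hypothetical, and every field is treated as optional because the source does not say which ones are required.

```python
from typing import Any, Dict, TypedDict


class WebResult(TypedDict, total=False):
    title: str
    url: str
    snippet: str
    type: str


class ImageResult(TypedDict, total=False):
    title: str
    url: str
    thumbnail: str
    source: str


WEB_FIELDS = frozenset(WebResult.__annotations__)


def is_web_result(item: Dict[str, Any]) -> bool:
    """True if item uses only schema fields and every value is a string."""
    return set(item) <= WEB_FIELDS and all(isinstance(v, str) for v in item.values())
```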

name: duckduckgo-search
version: 1.0.0
description: DuckDuckGo web search for private, tracker-free searching. Use when the user asks to search the web, find information online, or perform web-based research without tracking. Ideal for web search queries, finding online information, research without tracking, quick fact verification, and URL discovery.

Private web search using DuckDuckGo API for tracker-free information retrieval.

Core Features

  • Privacy-focused search (no tracking)
  • Instant answer support
  • Multiple search modes (web, images, videos, news)
  • JSON output for easy parsing
  • No API key required

Quick Start

import requests

def search_duckduckgo(query, max_results=10):
    """
    Perform DuckDuckGo search and return results.

    Args:
        query: Search query string
        max_results: Maximum number of results to return (default: 10)

    Returns:
        List of result dicts with type, title, content, and url or source
    """
    url = "https://api.duckduckgo.com/"
    params = {
        "q": query,
        "format": "json",
        "no_html": 1,
        "skip_disambig": 0
    }

    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error page
    data = response.json()

    # Extract results
    results = []

    # Abstract (instant answer)
    if data.get("Abstract"):
        results.append({
            "type": "instant_answer",
            "title": "Instant Answer",
            "content": data["Abstract"],
            "source": data.get("AbstractSource", "DuckDuckGo")
        })

    # Related topics
    if data.get("RelatedTopics"):
        for topic in data["RelatedTopics"][:max_results]:
            if isinstance(topic, dict) and topic.get("Text"):
                results.append({
                    "type": "related",
                    "title": topic.get("FirstURL", "").split("/")[-1].replace("-", " ").title(),
                    "content": topic["Text"],
                    "url": topic.get("FirstURL", "")
                })

    return results[:max_results]
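
The Instant Answer API can return `RelatedTopics` entries that are groups: such an entry carries a `Topics` list instead of its own `Text` and `FirstURL`, so the loop above skips it. A minimal sketch of flattening those groups follows. `flatten_related_topics` is a hypothetical helper, and the sample payload is hand-written for illustration, not live API output.

```python
from typing import Any, Dict, Iterator, List


def flatten_related_topics(topics: List[Any]) -> Iterator[Dict[str, Any]]:
    """Yield every topic that has Text, descending into nested groups."""
    for topic in topics:
        if not isinstance(topic, dict):
            continue
        if topic.get("Text"):
            yield topic
        elif isinstance(topic.get("Topics"), list):
            # Grouped entry: recurse into its nested topic list.
            yield from flatten_related_topics(topic["Topics"])


# Hand-written sample shaped like a grouped response (illustrative only).
sample = [
    {"Text": "Python (language)", "FirstURL": "https://duckduckgo.com/Python"},
    {"Name": "Other uses", "Topics": [
        {"Text": "Python (genus)", "FirstURL": "https://duckduckgo.com/Pythonidae"},
    ]},
]
flat = list(flatten_related_topics(sample))
```

Passing `flatten_related_topics(data["RelatedTopics"])` to the loop in `search_duckduckgo` would be one way to include grouped topics in the results.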

Advanced Usage (HTML Scraping)

from bs4 import BeautifulSoup
import requests

def search_with_results(query, max_results=10):
    """
    Perform DuckDuckGo search and scrape actual results.

    Args:
        query: Search query string
        max_results: Maximum number of results to return

    Returns:
        List of search results with title, url, snippet
    """
    url = "https://duckduckgo.com/html/"
    params = {"q": query}

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
    }

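    response = requests.post(url, data=params, headers=headers, timeout=10)
    response.raise_for_status()  # Fail early on HTTP errors
    soup = BeautifulSoup(response.text, "html.parser")

    results = []
    for result in soup.find_all("a", class_="result__a", href=True)[:max_results]:
        # find_parent returns None when the wrapper is missing, so guard before reading text
        body = result.find_parent("div", class_="result__body")
        results.append({
            "title": result.get_text(strip=True),
            "url": result["href"],
            "snippet": body.get_text(" ", strip=True) if body else ""
        })

    return results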
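
The `href` values scraped from the HTML page may be redirect links of the form `//duckduckgo.com/l/?uddg=<encoded target>` rather than the destination URL itself. That is an assumption about the page markup, which is undocumented and may change. Under that assumption, the sketch below recovers the destination with the standard library; `resolve_result_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def resolve_result_url(href: str) -> str:
    """Return the destination URL for a redirect link, or href unchanged."""
    if href.startswith("//"):
        href = "https:" + href  # Protocol-relative link: assume HTTPS
    parsed = urlparse(href)
    if parsed.netloc.endswith("duckduckgo.com") and parsed.path.startswith("/l/"):
        target = parse_qs(parsed.query).get("uddg")
        if target:
            return target[0]  # parse_qs has already percent-decoded the value
    return href
```

Links that do not match the assumed redirect shape pass through untouched, so the helper is safe to apply to every scraped result.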

Search Operators

DuckDuckGo supports standard search operators:

Operator   Example                     Description
""         "exact phrase"              Exact phrase match
-          python -django              Exclude terms
site:      site:wikipedia.org history  Search specific site
filetype:  filetype:pdf report         Specific file types
intitle:   intitle:openclaw            Words in title
inurl:     inurl:docs/                 Words in URL
OR         docker OR kubernetes        Either term
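
Operators from the table can be composed programmatically instead of by hand-built strings. The following is a minimal sketch; `build_query` and its keyword parameters are hypothetical names, and it covers only a subset of the operators listed above.

```python
from typing import Iterable, List, Optional


def build_query(terms: str, *, exact: Optional[str] = None,
                site: Optional[str] = None, filetype: Optional[str] = None,
                exclude: Iterable[str] = ()) -> str:
    """Compose a query string from plain terms and optional operators."""
    parts: List[str] = []
    if exact:
        parts.append(f'"{exact}"')           # Exact phrase match
    if terms:
        parts.append(terms)
    parts.extend(f"-{word}" for word in exclude)  # Excluded terms
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)
```

For example, `build_query("python", exclude=["django"])` produces `python -django`, matching the second row of the table.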

Search Modes

Default mode, searches across the web.

search_with_results("machine learning tutorial")

Image search:

def search_images(query, max_results=10):
    url = "https://duckduckgo.com/i.js"
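    params = {
        "q": query,
        "o": "json",
        "vqd": "",  # Placeholder token; this snippet never populates it
        "f": ",,,",
        "p": "1"
    }

    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # Fail early on HTTP errors
    data = response.json()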

    results = []
    for result in data.get("results", [])[:max_results]:
        results.append({
            "title": result.get("title", ""),
            "url": result.get("image", ""),
            "thumbnail": result.get("thumbnail", ""),
            "source": result.get("source", "")
        })

    return results

Add !news to the query:

search_duckduckgo("artificial intelligence !news")

Best Practices

Query Construction

Good queries:

  • "DuckDuckGo API documentation" 2024 (specific, recent)
  • site:github.com openclaw issues (targeted)
  • python machine learning tutorial filetype:pdf (resource-specific)

Avoid:

  • Vague single words ("search", "find")
  • Overly complex operators that might confuse results
  • Questions with multiple unrelated topics

Privacy Considerations

DuckDuckGo advantages:

  • No personal tracking
  • No search history stored
  • No user profiling
  • No forced personalized results

Performance Tips

  1. Use HTML scraping for actual results - The JSON API provides instant answers but limited result lists
  2. Add appropriate delays - Respect rate limits when making multiple queries
  3. Cache results - Store common searches to avoid repeated API calls
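
Tips 2 and 3 can be combined in a thin wrapper around any search function. This is a sketch, not part of the skill: `ThrottledCachedSearch` and `min_interval` are hypothetical names, the cache is a plain in-memory dict with no expiry, and the one-second default is an arbitrary placeholder, not a documented rate limit.

```python
import time
from typing import Callable, Dict, List, Optional


class ThrottledCachedSearch:
    """Wrap a search function with an in-memory cache and a minimum delay."""

    def __init__(self, search_fn: Callable[[str], List[dict]],
                 min_interval: float = 1.0):
        self._search_fn = search_fn
        self._min_interval = min_interval
        self._cache: Dict[str, List[dict]] = {}
        self._last_call: Optional[float] = None

    def __call__(self, query: str) -> List[dict]:
        # Normalize whitespace only; operators such as OR are case-sensitive.
        key = " ".join(query.split())
        if key in self._cache:
            return self._cache[key]
        if self._last_call is not None:
            wait = self._min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)  # Space out consecutive uncached requests
        results = self._search_fn(query)
        self._last_call = time.monotonic()
        self._cache[key] = results
        return results
```

Usage would look like `search = ThrottledCachedSearch(search_with_results)` followed by `search("python docs")`. Repeated queries are then served from memory without a network call.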

Error Handling

import time

def search_safely(query, retries=3):
    for attempt in range(retries):
        try:
            results = search_with_results(query)
            if results:
                return results
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...

    return []

Output Formatting

Markdown Format

def format_results_markdown(results, query):
    output = f"# Search Results for: {query}\n\n"

    for i, result in enumerate(results, 1):
        output += f"## {i}. {result.get('title', 'Untitled')}\n\n"
        output += f"**URL:** {result.get('url', 'N/A')}\n\n"
        output += f"{result.get('snippet', result.get('content', 'N/A'))}\n\n"
        output += "---\n\n"

    return output

JSON Format

import json
from datetime import datetime

def format_results_json(results, query):
    return json.dumps({
        "query": query,
        "count": len(results),
        "results": results,
        "timestamp": datetime.now().isoformat()
    }, indent=2)

Common Patterns

Find Documentation

search_duckduckgo(f'{library_name} documentation filetype:md')

Recent Information

search_duckduckgo(f'{topic} 2024 news')

Troubleshooting

search_duckduckgo(f'{error_message} {tool_name} stackoverflow')

Technical Comparison

search_duckduckgo('postgresql vs mysql performance 2024')

Integration Example

import requests

class DuckDuckGoSearcher:
    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        })

    def search(self, query, mode="web", max_results=10):
        """
        Unified search interface.

        Args:
            query: Search query
            mode: 'web', 'images', 'news'
            max_results: Maximum results

        Returns:
            Formatted results as list
        """
        if mode == "images":
            return self._search_images(query, max_results)
        elif mode == "news":
            return self._search_web(f"{query} !news", max_results)
        else:
            return self._search_web(query, max_results)

    def _search_web(self, query, max_results):
        # Implementation
        pass

    def _search_images(self, query, max_results):
        # Implementation
        pass
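
The stubs above leave the backends open. One way to fill them in, shown here as a sketch and not as the skill's implementation, is to inject backend callables so the mode dispatch can be exercised without network access. `ModeDispatcher`, `SearchFn`, and the `backends` mapping are hypothetical names.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str, int], List[dict]]


class ModeDispatcher:
    """Route a query to a backend by mode, mirroring the search() logic above."""

    def __init__(self, backends: Dict[str, SearchFn]):
        if "web" not in backends:
            raise ValueError("a 'web' backend is required")
        self._backends = backends

    def search(self, query: str, mode: str = "web",
               max_results: int = 10) -> List[dict]:
        if mode == "news":
            # Same convention as above: news is a web search with !news appended.
            return self._backends["web"](f"{query} !news", max_results)
        # Unknown modes fall back to web search, like the else branch above.
        backend = self._backends.get(mode, self._backends["web"])
        return backend(query, max_results)
```

In real use the mapping might be `{"web": search_with_results, "images": search_images}`, reusing the functions defined earlier in this document.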

Resources

Official Documentation

  • DuckDuckGo API Wiki: https://duckduckgo.com/api
  • Instant Answer API: https://duckduckgo.com/params
  • Search Syntax: https://help.duckduckgo.com/duckduckgo-help-pages/results/syntax/

References

  • HTML scraping patterns for result extraction
  • Rate limiting best practices
  • Result parsing and filtering