DeepRead OCR:AI驱动的文档提取 - Openclaw Skills

作者:互联网

2026-03-20

AI教程

什么是 DeepRead OCR?

DeepRead 是一款生产级的文档处理 API,旨在将复杂的 PDF 和图像转换为整洁的结构化 JSON 数据。通过利用多模型共识和多轮验证,它开箱即用的准确率超过 97%。作为用于数据提取的最强大的 Openclaw 技能之一,它消除了复杂的提示词工程需求,同时提供了一种可靠的机制来标记不确定的数据供人工审查。

该平台专为需要扩展文档工作流程的开发人员构建。它处理从文档旋转和方向校正到字段级置信度评分的所有事务。通过集成 DeepRead,团队可以将手动数据录入工作从 100% 减少到仅 5%,仅专注于 AI 标记的异常情况。

下载入口:https://github.com/openclaw/skills/tree/main/skills/deepread001/deepread-ocr

安装与下载

1. ClawHub CLI

从源直接安装技能的最快方式。

npx clawhub@latest install deepread-ocr

2. 手动安装

将技能文件夹复制到以下位置之一

全局模式 ~/.openclaw/skills/ 工作区 /skills/

优先级:工作区 > 本地 > 内置

3. 提示词安装

将此提示词复制到 OpenClaw 即可自动安装。

请帮我使用 Clawhub 安装 deepread-ocr。如果尚未安装 Clawhub,请先安装(npm i -g clawhub)。

DeepRead OCR 应用场景

  • 具有行项目提取功能的自动发票和收据处理。
  • 从合同中提取法律当事人、生效日期和终止条款。
  • 将纸质表格和遗留文档转换为结构化的数据库条目。
  • 用于搜索和 RAG 应用的高精度文档转 Markdown 转换。
  • 构建需要置信度分数和人工在环验证的质量关键型应用。
DeepRead OCR 工作原理
  1. 文档提交:用户通过 API 上传 PDF 或图像,并可选择为特定字段提取提供 JSON 模式。
  2. 预处理:系统自动校正文档旋转和方向,以确保最佳 OCR 性能。
  3. 多轮分析:文档经过多次 OCR 处理和不同 AI 模型之间的交叉验证,以达成文本共识。
  4. 结构化提取:数据被映射到请求的模式,AI 为每个提取的字段分配置信度标记(hil_flag)。
  5. 结果获取:处理(通常 2-5 分钟)完成后,结构化数据通过 Webhook 交付或可供轮询获取。

DeepRead OCR 配置指南

要开始使用这些 Openclaw 技能,请先在 DeepRead 控制面板注册以获取 API 密钥。获取密钥后,将其设置为环境变量:

export DEEPREAD_API_KEY="sk_live_your_key_here"

对于 Clawdbot 用户,可以通过将技能添加到配置文件来启用它:

{
  skills: {
    entries: {
      "deepread": {
        enabled: true
      }
    }
  }
}

DeepRead OCR 数据架构与分类体系

DeepRead 返回一个全面的 JSON 对象,该对象组织了提取的数据以及质量元数据。该模式确保开发人员可以轻松区分高置信度数据和需要人工干预的字段。

组件 描述
text 转换为整洁 Markdown 格式的完整文档内容。
data 提取字段的集合,每个字段包含 valuehil_flagfound_on_page 索引。
hil_flag 一个布尔值,指示提取是否不确定并需要人工在环审查。
pages 一个数组,为多页文档提供逐页 OCR 文本和特定质量标记。
metadata 汇总统计信息,包括字段总数和需要审查的百分比。
name: deepread
title: DeepRead OCR
description: AI-native OCR platform that turns documents into high-accuracy data in minutes. Using multi-model consensus, DeepRead achieves 97%+ accuracy and flags only uncertain fields for Human-in-the-Loop (HIL) review—reducing manual work from 100% to 5-10%. Zero prompt engineering required.
disable-model-invocation: true
metadata:
  {"openclaw":{"requires":{"env":["DEEPREAD_API_KEY"]},"primaryEnv":"DEEPREAD_API_KEY","homepage":"https://www.deepread.tech"}}

DeepRead - Production OCR API

DeepRead is an AI-native OCR platform that turns documents into high-accuracy data in minutes. Using multi-model consensus, DeepRead achieves 97%+ accuracy and flags only uncertain fields for Human-in-the-Loop (HIL) review—reducing manual work from 100% to 5-10%. Zero prompt engineering required.

What This Skill Does

DeepRead is a production-grade document processing API that gives you high-accuracy structured data output in minutes with human review flagging so manual review is limited to the flagged exceptions

Core Features:

  • Text Extraction: Convert PDFs and images to clean markdown
  • Structured Data: Extract JSON fields with confidence scores
  • HIL Interface: Built-in Human-in-the-Loop review — uncertain fields are flagged (hil_flag) so only exceptions need manual review
  • Multi-Pass Processing: Multiple validation passes for maximum accuracy
  • Multi-Model Consensus: Cross-validation between models for reliability
  • Free Tier: 2,000 pages/month (no credit card required)

Setup

1. Get Your API Key

Sign up and create an API key:

# Visit the dashboard
https://www.deepread.tech/dashboard

# Or use this direct link
https://www.deepread.tech/dashboard/?utm_source=clawdhub

Save your API key:

export DEEPREAD_API_KEY="sk_live_your_key_here"

2. Clawdbot Configuration (Optional)

Add to your clawdbot.config.json5:

{
  skills: {
    entries: {
      "deepread": {
        enabled: true
        // API key is read from DEEPREAD_API_KEY environment variable
        // Do NOT hardcode your API key here
      }
    }
  }
}

3. Process Your First Document

Option A: With Webhook (Recommended)

# Upload PDF with webhook notification
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@document.pdf" r
  -F "webhook_url=https://your-app.com/webhooks/deepread"

# Returns immediately
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

# Your webhook receives results when processing completes (2-5 minutes)

Option B: Poll for Results

# Upload PDF without webhook
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@document.pdf"

# Returns immediately
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

# Poll until completed
curl https://api.deepread.tech/v1/jobs/550e8400-e29b-41d4-a716-446655440000 r
  -H "X-API-Key: $DEEPREAD_API_KEY"

Usage Examples

Basic OCR (Text Only)

Extract text as clean markdown:

# With webhook (recommended)
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf" r
  -F "webhook_url=https://your-app.com/webhook"

# OR poll for completion
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf"

# Then poll
curl https://api.deepread.tech/v1/jobs/JOB_ID r
  -H "X-API-Key: $DEEPREAD_API_KEY"

Response when completed:

{
  "id": "550e8400-...",
  "status": "completed",
  "result": {
    "text": "# INVOICE

**Vendor:** Acme Corp
**Total:** $1,250.00..."
  }
}

Structured Data Extraction

Extract specific fields with confidence scoring:

curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf" r
  -F 'schema={
    "type": "object",
    "properties": {
      "vendor": {
        "type": "string",
        "description": "Vendor company name"
      },
      "total": {
        "type": "number",
        "description": "Total invoice amount"
      },
      "invoice_date": {
        "type": "string",
        "description": "Invoice date in MM/DD/YYYY format"
      }
    }
  }'

Response includes confidence flags:

{
  "status": "completed",
  "result": {
    "text": "# INVOICE

**Vendor:** Acme Corp...",
    "data": {
      "vendor": {
        "value": "Acme Corp",
        "hil_flag": false,
        "found_on_page": 1
      },
      "total": {
        "value": 1250.00,
        "hil_flag": false,
        "found_on_page": 1
      },
      "invoice_date": {
        "value": "2024-10-??",
        "hil_flag": true,
        "reason": "Date partially obscured",
        "found_on_page": 1
      }
    },
    "metadata": {
      "fields_requiring_review": 1,
      "total_fields": 3,
      "review_percentage": 33.3
    }
  }
}

Complex Schemas (Nested Data)

Extract arrays and nested objects:

curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf" r
  -F 'schema={
    "type": "object",
    "properties": {
      "vendor": {"type": "string"},
      "total": {"type": "number"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "number"},
            "price": {"type": "number"}
          }
        }
      }
    }
  }'

Page-by-Page Breakdown

Get per-page OCR results with quality flags:

curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@contract.pdf" r
  -F "include_pages=true"

Response:

{
  "result": {
    "text": "Combined text from all pages...",
    "pages": [
      {
        "page_number": 1,
        "text": "# Contract Agreement

...",
        "hil_flag": false
      },
      {
        "page_number": 2,
        "text": "Terms and C??diti??s...",
        "hil_flag": true,
        "reason": "Multiple unrecognized characters"
      }
    ],
    "metadata": {
      "pages_requiring_review": 1,
      "total_pages": 2
      }
  }
}

When to Use This Skill

? Use DeepRead For:

  • Invoice Processing: Extract vendor, totals, line items
  • Receipt OCR: Parse merchant, items, totals
  • Contract Analysis: Extract parties, dates, terms
  • Form Digitization: Convert paper forms to structured data
  • Document Workflows: Any process requiring OCR + data extraction
  • Quality-Critical Apps: When you need to know which extractions are uncertain

? Don't Use For:

  • Real-time Processing: Processing takes 2-5 minutes (async workflow)
  • Batch >2,000 pages/month: Upgrade to PRO or SCALE tier

How It Works

Multi-Pass Pipeline

PDF → Convert → Rotate Correction → OCR → Multi-Model Validation → Extract → Done

The pipeline automatically handles:

  • Document rotation and orientation correction
  • Multi-pass validation for accuracy
  • Cross-model consensus for reliability
  • Field-level confidence scoring

Human-in-the-Loop (HIL) Interface

DeepRead includes a built-in Human-in-the-Loop (HIL) review system. The AI compares extracted text to the original image and sets hil_flag on each field:

  • hil_flag: false = Clear, confident extraction → Auto-process
  • hil_flag: true = Uncertain extraction → Routed to human review

How HIL works:

  1. Fields extracted with high confidence are auto-approved
  2. Uncertain fields are flagged with hil_flag: true and a reason
  3. Only flagged fields need human review (typically 5-10% of total fields)
  4. Review flagged fields in DeepRead Preview (preview.deepread.tech) — a dedicated HIL review interface where reviewers can see the original document side-by-side with extracted data, correct flagged fields, and approve results
  5. Or integrate with your own review queue using the hil_flag data in the API response

AI flags extractions when:

  • Text is handwritten, blurry, or low quality
  • Multiple possible interpretations exist
  • Characters are partially visible or unclear
  • Field not found in document

This is multimodal AI determination, not rule-based.

Advanced Features

1. Blueprints (Optimized Schemas)

Create reusable, optimized schemas for specific document types:

# List your blueprints
curl https://api.deepread.tech/v1/blueprints r
  -H "X-API-Key: $DEEPREAD_API_KEY"

# Use blueprint instead of inline schema
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf" r
  -F "blueprint_id=660e8400-e29b-41d4-a716-446655440001"

Benefits:

  • 20-30% accuracy improvement over baseline schemas
  • Reusable across similar documents
  • Versioned with rollback support

How to create blueprints:

# Create a blueprint from training data
curl -X POST https://api.deepread.tech/v1/optimize r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -H "Content-Type: application/json" r
  -d '{
    "name": "utility_invoice",
    "description": "Optimized for utility invoices",
    "document_type": "invoice",
    "initial_schema": {
      "type": "object",
      "properties": {
        "vendor": {"type": "string", "description": "Vendor name"},
        "total": {"type": "number", "description": "Total amount"}
      }
    },
    "training_documents": ["doc1.pdf", "doc2.pdf", "doc3.pdf"],
    "ground_truth_data": [
      {"vendor": "Acme Power", "total": 125.50},
      {"vendor": "City Electric", "total": 89.25}
    ],
    "target_accuracy": 95.0,
    "max_iterations": 5
  }'

# Returns: {"job_id": "...", "blueprint_id": "...", "status": "pending"}

# Check optimization status
curl https://api.deepread.tech/v1/blueprints/jobs/JOB_ID r
  -H "X-API-Key: $DEEPREAD_API_KEY"

# Use blueprint (once completed)
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf" r
  -F "blueprint_id=BLUEPRINT_ID"

Get notified when processing completes instead of polling:

curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@invoice.pdf" r
  -F "webhook_url=https://your-app.com/webhooks/deepread"

Your webhook receives this payload when processing completes:

{
  "job_id": "550e8400-...",
  "status": "completed",
  "created_at": "2025-01-27T10:00:00Z",
  "completed_at": "2025-01-27T10:02:30Z",
  "result": {
    "text": "...",
    "data": {...}
  },
  "preview_url": "https://preview.deepread.tech/abc1234"
}

Benefits:

  • No polling required
  • Instant notification when done
  • Lower latency
  • Better for production workflows

3. Preview (HIL Review Interface)

DeepRead Preview (preview.deepread.tech) is the built-in Human-in-the-Loop review interface. Reviewers can view the original document alongside extracted data, correct flagged fields, and approve results. Preview URLs can also be shared without authentication:

# Request preview URL
curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@document.pdf" r
  -F "include_images=true"

# Get preview URL in response
{
  "result": {
    "text": "...",
    "data": {...}
  },
  "preview_url": "https://preview.deepread.tech/Xy9aB12"
}

Public Preview Endpoint:

# No authentication required
curl https://api.deepread.tech/v1/preview/Xy9aB12

Rate Limits & Pricing

Free Tier (No Credit Card)

  • 2,000 pages/month
  • 10 requests/minute
  • Full feature access (OCR + structured extraction + blueprints)
  • PRO: 50,000 pages/month, 100 requests/minute @ $99/mo
  • SCALE: Custom volume pricing (contact sales)

Upgrade: https://www.deepread.tech/dashboard/billing?utm_source=clawdhub

Rate Limit Headers

Every response includes quota information:

X-RateLimit-Limit: 2000
X-RateLimit-Remaining: 1847
X-RateLimit-Used: 153
X-RateLimit-Reset: 1730419200

Best Practices

1. Use Webhooks for Production

? Recommended: Webhook notifications

curl -X POST https://api.deepread.tech/v1/process r
  -H "X-API-Key: $DEEPREAD_API_KEY" r
  -F "file=@document.pdf" r
  -F "webhook_url=https://your-app.com/webhook"

Only use polling if:

  • Testing/development
  • Cannot expose a webhook endpoint
  • Need synchronous response

2. Schema Design

? Good: Descriptive field descriptions

{
  "vendor": {
    "type": "string",
    "description": "Vendor company name. Usually in header or top-left of invoice."
  }
}

? Bad: No description

{
  "vendor": {"type": "string"}
}

3. Polling Strategy (If Needed)

Only if you can't use webhooks, poll every 5-10 seconds:

import time
import requests

def wait_for_result(job_id, api_key):
    while True:
        response = requests.get(
            f"https://api.deepread.tech/v1/jobs/{job_id}",
            headers={"X-API-Key": api_key}
        )
        result = response.json()

        if result["status"] == "completed":
            return result["result"]
        elif result["status"] == "failed":
            raise Exception(f"Job failed: {result.get('error')}")

        time.sleep(5)

4. Handling Quality Flags

Separate confident fields from uncertain ones:

def process_extraction(data):
    confident = {}
    needs_review = []

    for field, field_data in data.items():
        if field_data["hil_flag"]:
            needs_review.append({
                "field": field,
                "value": field_data["value"],
                "reason": field_data.get("reason")
            })
        else:
            confident[field] = field_data["value"]

    # Auto-process confident fields
    save_to_database(confident)

    # Send uncertain fields to review queue
    if needs_review:
        send_to_review_queue(needs_review)

Troubleshooting

Error: quota_exceeded

{"detail": "Monthly page quota exceeded"}

Solution: Upgrade to PRO or wait until next billing cycle.

Error: invalid_schema

{"detail": "Schema must be valid JSON Schema"}

Solution: Ensure schema is valid JSON and includes type and properties.

Error: file_too_large

{"detail": "File size exceeds 50MB limit"}

Solution: Compress PDF or split into smaller files.

Job Status: failed

{"status": "failed", "error": "PDF could not be processed"}

Common causes:

  • Corrupted PDF file
  • Password-protected PDF
  • Unsupported PDF version
  • Image quality too low for OCR

Example Schema Templates

Invoice Schema

{
  "type": "object",
  "properties": {
    "invoice_number": {
      "type": "string",
      "description": "Unique invoice ID"
    },
    "invoice_date": {
      "type": "string",
      "description": "Invoice date in MM/DD/YYYY format"
    },
    "vendor": {
      "type": "string",
      "description": "Vendor company name"
    },
    "total": {
      "type": "number",
      "description": "Total amount due including tax"
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "price": {"type": "number"}
        }
      }
    }
  }
}

Receipt Schema

{
  "type": "object",
  "properties": {
    "merchant": {
      "type": "string",
      "description": "Store or merchant name"
    },
    "date": {
      "type": "string",
      "description": "Transaction date"
    },
    "total": {
      "type": "number",
      "description": "Total amount paid"
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"}
        }
      }
    }
  }
}

Contract Schema

{
  "type": "object",
  "properties": {
    "parties": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Names of all parties in the contract"
    },
    "effective_date": {
      "type": "string",
      "description": "Contract start date"
    },
    "term_length": {
      "type": "string",
      "description": "Duration of contract"
    },
    "termination_clause": {
      "type": "string",
      "description": "Conditions for termination"
    }
  }
}

Support & Resources

  • GitHub: https://github.com/deepread-tech
  • Issues: https://github.com/deepread-tech/deep-read-service/issues
  • Email: hello@deepread.tech

Important Notes

  • Processing Time: 2-5 minutes (async, not real-time)
  • Async Workflow: Use webhooks (recommended) or polling
  • Rate Limits: 10 req/min on free tier
  • File Size Limit: 50MB per file
  • Supported Formats: PDF, JPG, JPEG, PNG

Ready to start? Get your free API key at https://www.deepread.tech/dashboard/?utm_source=clawdhub