baoyu-image-gen: Multi-Provider AI Image Generation - Openclaw Skills

Author: Internet

2026-03-29

AI Tutorials

What is baoyu-image-gen?

baoyu-image-gen is a powerful tool designed for Openclaw Skills that lets AI agents create high-quality visual content using industry-leading APIs. It supports a wide range of providers, including OpenAI, Google (Gemini and Imagen), DashScope (Alibaba Cloud), and Replicate. Whether you need simple text-to-image generation or complex multimodal editing with reference images, the skill provides a unified interface that seamlessly handles the requirements of different models.

By integrating it into your workflow, you can automate the creation of marketing assets, UI placeholders, or illustration content directly from your AI agent environment. It offers a sophisticated configuration system via EXTEND.md, allowing fine-grained control over default models, quality presets, and output dimensions across different projects and environments within the Openclaw Skills ecosystem.

Download: https://github.com/openclaw/skills/tree/main/skills/peters820-art/baoyu-image-gen

Installation & Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install baoyu-image-gen

2. Manual Installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt Installation

Copy this prompt into OpenClaw to install automatically.

Please install baoyu-image-gen for me using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).

baoyu-image-gen Use Cases

  • Automatically create blog post illustrations and social media graphics from text prompts.
  • Generate UI/UX design assets and placeholders during development.
  • Iterate on visual concepts with multimodal style transfer using reference images.
  • Batch-produce large numbers of images for high-volume projects using parallel execution.
  • Switch between providers such as Google and OpenAI to find the best aesthetic for a given prompt.

How baoyu-image-gen Works
  1. The skill loads preferred defaults by checking for an EXTEND.md configuration file in the project or user directory.
  2. If no configuration is found, it triggers a blocking setup flow to define API keys, providers, and the default save location.
  3. The agent receives a command with a prompt and optional parameters such as aspect ratio, quality, or reference images.
  4. It selects the best provider based on the supplied parameters, the available API keys, or an explicit user flag.
  5. The script executes generation through the chosen API (OpenAI, Google, DashScope, or Replicate).
  6. For large batch tasks, the skill can spawn multiple subagents for parallel generation when explicitly requested.
  7. The final images are saved to the specified path, and a JSON summary of the operation is returned to the agent.
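The provider choice in step 4 can be sketched as a small shell function. This is a simplified illustration only; the actual logic lives in scripts/main.ts, and the ordering assumes the Google-first preference described later in this document:

```shell
# Sketch of provider auto-selection: explicit flag wins, otherwise pick
# the first provider whose API key is present (Google-first ordering).
pick_provider() {
  forced="${1:-}"   # value of --provider, may be empty
  if [ -n "$forced" ]; then
    echo "$forced"
    return 0
  fi
  if [ -n "${GOOGLE_API_KEY:-}" ];      then echo google;    return 0; fi
  if [ -n "${OPENAI_API_KEY:-}" ];      then echo openai;    return 0; fi
  if [ -n "${DASHSCOPE_API_KEY:-}" ];   then echo dashscope; return 0; fi
  if [ -n "${REPLICATE_API_TOKEN:-}" ]; then echo replicate; return 0; fi
  echo "error: no API key configured" >&2
  return 1
}
```

An explicit --provider flag always bypasses the key-based fallback, matching the selection rules listed under "Provider Selection" below.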

baoyu-image-gen Configuration Guide

To get started with this Openclaw Skills component, make sure you have the API key(s) required by your preferred provider.

# The skill will guide you through setup automatically on the first run
npx -y bun scripts/main.ts --prompt "A serene mountain landscape" --image landscape.png

# Or set environment variables manually:
export OPENAI_API_KEY='your-key'
export GOOGLE_API_KEY='your-key'
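As a quick sanity check before running the skill, you can list which provider credentials are currently set. This is a convenience sketch, not part of the skill itself:

```shell
# Report which provider credentials are present in the environment.
check_keys() {
  for v in OPENAI_API_KEY GOOGLE_API_KEY DASHSCOPE_API_KEY REPLICATE_API_TOKEN; do
    if [ -n "${!v:-}" ]; then   # bash indirect expansion: value of the variable named in $v
      echo "$v: set"
    else
      echo "$v: missing"
    fi
  done
}
check_keys
```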

baoyu-image-gen Data Architecture & Taxonomy

The skill organizes its operation through CLI flags and configuration files to ensure consistent output within Openclaw Skills.

Parameter Type Description
--prompt string Text description of the image to generate.
--image path Desired output destination for the file.
--ar ratio Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1.
--quality preset 'normal' (1K) or '2k' (2K resolution).
--ref file Reference image for multimodal tasks (Google/OpenAI).
--provider string Force google, openai, dashscope, or replicate.

name: baoyu-image-gen
description: AI image generation with OpenAI, Google, DashScope and Replicate APIs. Supports text-to-image, reference images, aspect ratios. Sequential by default; parallel generation available on request. Use when user asks to generate, create, or draw images.

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Google, DashScope (Alibaba Tongyi Wanxiang) and Replicate providers.

Script Directory

Agent Execution:

  1. SKILL_DIR = this SKILL.md file's directory
  2. Script path = ${SKILL_DIR}/scripts/main.ts

Step 0: Load Preferences (BLOCKING)

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.

Check EXTEND.md existence (priority: project → user):

test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"

Result Action
Found Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2)
Not found Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue

CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

Path Location
.baoyu-skills/baoyu-image-gen/EXTEND.md Project directory
$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md User home
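The two-location lookup (project first, then user home) can be sketched as a small helper. This is an illustrative sketch of the documented order, not the skill's actual code:

```shell
# Resolve EXTEND.md: project directory wins over the user-home copy.
find_extend_md() {
  if [ -f ".baoyu-skills/baoyu-image-gen/EXTEND.md" ]; then
    echo ".baoyu-skills/baoyu-image-gen/EXTEND.md"
  elif [ -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" ]; then
    echo "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md"
  else
    # Not found: first-time setup must run before any generation.
    return 1
  fi
}
```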

EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models

Schema: references/config/preferences-schema.md

Usage

# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google multimodal or OpenAI edits)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (Alibaba Tongyi Wanxiang)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# Replicate (google/nano-banana-pro)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Options

Option Description
--prompt , -p Prompt text
--promptfiles Read prompt from files (concatenated)
--image Output image path (required)
--provider google|openai|dashscope|replicate Force provider (default: google)
--model , -m Model ID (Google: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; OpenAI: gpt-image-1.5)
--ar Aspect ratio (e.g., 16:9, 1:1, 4:3)
--size Size (e.g., 1024x1024)
--quality normal|2k Quality preset (default: 2k)
--imageSize 1K|2K|4K Image size for Google (default: from quality)
--ref Reference images. Supported by Google multimodal (gemini-3-pro-image-preview, gemini-3-flash-preview, gemini-3.1-flash-image-preview) and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI
--n Number of images
--json JSON output

Environment Variables

Variable Description
OPENAI_API_KEY OpenAI API key
GOOGLE_API_KEY Google API key
DASHSCOPE_API_KEY DashScope API key (Alibaba Cloud)
REPLICATE_API_TOKEN Replicate API token
OPENAI_IMAGE_MODEL OpenAI model override
GOOGLE_IMAGE_MODEL Google model override
DASHSCOPE_IMAGE_MODEL DashScope model override (default: z-image-turbo)
REPLICATE_IMAGE_MODEL Replicate model override (default: google/nano-banana-pro)
OPENAI_BASE_URL Custom OpenAI endpoint
GOOGLE_BASE_URL Custom Google endpoint
DASHSCOPE_BASE_URL Custom DashScope endpoint
REPLICATE_BASE_URL Custom Replicate endpoint

Load Priority: CLI args > EXTEND.md > env vars > ./.baoyu-skills/.env > ~/.baoyu-skills/.env
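The priority chain amounts to "first non-empty value wins." A minimal sketch (the function name and parameters are illustrative, not part of the skill's API):

```shell
# Resolve one setting across the priority chain:
# CLI arg > EXTEND.md value > environment variable > built-in default.
resolve_setting() {
  cli="$1"; extend="$2"; env_val="$3"; default="$4"
  for candidate in "$cli" "$extend" "$env_val" "$default"; do
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
}
```

For example, a --quality flag on the command line overrides a default_quality entry in EXTEND.md, which in turn overrides any environment variable.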

Replicate Model Configuration

When using --provider replicate, the model can be configured in the following ways (highest priority first):

  1. CLI flag: --model
  2. EXTEND.md: default_model.replicate
  3. Env var: REPLICATE_IMAGE_MODEL
  4. Built-in default: google/nano-banana-pro

Supported model formats:

  • owner/name (recommended for official models), e.g. google/nano-banana-pro
  • owner/name:version (community models pinned by version), e.g. stability-ai/sdxl:<version-id>

Examples:

# Use Replicate default model
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

Provider Selection

  1. --ref provided + no --provider → auto-select Google first, then OpenAI, then Replicate
  2. --provider specified → use it (if --ref, must be google, openai, or replicate)
  3. Only one API key available → use that provider
  4. Multiple available → default to Google

Quality Presets

Preset Google imageSize OpenAI Size Use Case
normal 1K 1024px Quick previews
2k (default) 2K 2048px Covers, illustrations, infographics

Google imageSize: Can be overridden with --imageSize 1K|2K|4K
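The preset table above can be expressed as a simple lookup. A sketch only; the real mapping is internal to scripts/main.ts:

```shell
# Map a quality preset to the per-provider size parameter
# (values taken from the Quality Presets table).
quality_to_size() {
  preset="$1"; provider="$2"
  case "$preset:$provider" in
    normal:google) echo "1K" ;;
    normal:openai) echo "1024" ;;
    2k:google)     echo "2K" ;;
    2k:openai)     echo "2048" ;;
    *) echo "unknown preset/provider: $preset/$provider" >&2; return 1 ;;
  esac
}
```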

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

  • Google multimodal: uses imageConfig.aspectRatio
  • Google Imagen: uses aspectRatio parameter
  • OpenAI: maps to closest supported size
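"Maps to closest supported size" for OpenAI could look like the sketch below. The concrete size set (square, landscape, portrait) is an assumption based on typical GPT Image sizes, not taken from this skill's source:

```shell
# Illustrative snap of an aspect ratio onto an assumed OpenAI size set.
ar_to_openai_size() {
  case "$1" in
    1:1)             echo "1024x1024" ;;  # square
    16:9|4:3|2.35:1) echo "1536x1024" ;;  # landscape ratios
    9:16|3:4)        echo "1024x1536" ;;  # portrait ratios
    *) echo "unsupported aspect ratio: $1" >&2; return 1 ;;
  esac
}
```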

Generation Mode

Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.

Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.

Mode When to Use
Sequential (default) Normal usage, single images, small batches
Parallel User explicitly requests, large batches (10+)

Parallel Settings (when requested):

Setting Value
Recommended concurrency 4 subagents
Max concurrency 8 subagents
Use case Large batch generation when user requests parallel

Agent Implementation (parallel mode only):

# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
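The skill itself parallelizes via Task-tool subagents, but the pattern is the familiar "throttled background jobs" idiom. A conceptual shell analogue (the echo stands in for the real npx invocation; names are illustrative):

```shell
# Run up to 4 concurrent "generations", waiting for all to finish.
parallel_generate() {
  outdir="$1"; shift
  max_jobs=4
  for p in "$@"; do
    # Throttle: wait while max_jobs background jobs are still running.
    while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do sleep 0.1; done
    # Stand-in for: npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "$p" ...
    ( echo "generated $p" > "$outdir/$p.txt" ) &
  done
  wait   # collect all results before returning
}
```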

Error Handling

  • Missing API key → error with setup instructions
  • Generation failure → auto-retry once
  • Invalid aspect ratio → warning, proceed with default
  • Reference images with unsupported provider/model → error with fix hint (switch to Google multimodal: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; or OpenAI GPT Image edits)
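The "auto-retry once" rule can be sketched as a generic wrapper. A simplified sketch of the behavior, not the skill's actual implementation:

```shell
# Run a command; on failure, retry exactly once.
retry_once() {
  if "$@"; then
    return 0
  fi
  echo "generation failed, retrying once..." >&2
  "$@"
}
```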

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.