Gemini Image Gen: AI Image Generation and Editing - Openclaw Skills

Author: Internet

2026-03-20

AI Tutorials

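What is Gemini Image Gen?

Gemini Image Gen is a full-featured, lightweight utility for bringing professional-grade visual creation to your AI workflows. Using the Google Gemini API, it lets you generate images from text prompts, edit existing images, and produce batch galleries, with no dependencies beyond the Python standard library.

As a featured tool in the Openclaw Skills ecosystem, it provides a single interface to both Gemini native generation and the Imagen 3 engine. It is built for developers who need a reliable, scriptable way to integrate AI-generated art into their projects without the overhead of a heavy framework.

Download: https://github.com/openclaw/skills/tree/main/skills/iisweetheartii/gemini-image-gen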

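Installation and Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install gemini-image-gen

2. Manual installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt-based installation

Copy this prompt into OpenClaw to install the skill automatically.

Please help me install gemini-image-gen using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).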

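Gemini Image Gen Use Cases

Automatically create visual content for social media platforms and AI agent feeds.
Rapidly prototype design assets using predefined art style presets.
Batch-generate themed image libraries for creative inspiration or datasets.
Perform AI-driven image-to-image editing, such as background replacement or subject modification.
Give AI agents more personality with custom-generated avatars and visual identities.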

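How Gemini Image Gen Works

The user starts a request from the CLI, either supplying a prompt or choosing random generation mode.
The skill authenticates with the Google Gemini API using the credentials stored in the environment.
Based on the user's flags, the tool selects the engine: the Gemini native model for editing, or Imagen 3 for high-fidelity generation.
If a style preset is selected, the tool automatically adds descriptive modifiers to the prompt to achieve the desired look.
The tool processes requests in batches and downloads the generated images into an organized local directory.
A static HTML gallery is generated on the fly, so all generated assets can be viewed in a browser immediately.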
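
The engine selection and style handling in the steps above can be sketched roughly as follows. This is an illustration only, not the skill's actual code: the names `STYLE_PRESETS`, `build_prompt`, and `choose_engine` are hypothetical, the two preset strings are abbreviated, and joining the preset to the prompt with a comma is an assumption.

```python
# Hypothetical sketch; not taken from scripts/gen.py.
STYLE_PRESETS = {
    "photo": "Ultra-detailed photorealistic photography, sharp focus",
    "watercolor": "Delicate watercolor painting on textured paper, soft edges",
}


def build_prompt(prompt, style=None):
    """Return the prompt with the style preset's modifiers attached, if a style is given."""
    if style is None:
        return prompt
    if style not in STYLE_PRESETS:
        raise ValueError(f"unknown style: {style}")
    # Assumption: the preset text goes before the prompt, separated by a comma.
    return f"{STYLE_PRESETS[style]}, {prompt}"


def choose_engine(engine="gemini", edit_path=None):
    """Validate the engine name; editing is only supported on the Gemini native engine."""
    if engine not in ("gemini", "imagen"):
        raise ValueError(f"unknown engine: {engine}")
    if edit_path is not None and engine != "gemini":
        raise ValueError("editing requires the gemini engine")
    return engine
```

Rejecting an edit request on the `imagen` engine up front, rather than sending it to the API, keeps the failure local and easy to diagnose.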

Gemini Image Gen Configuration Guide

To start using this skill, make sure you have a valid API key from Google AI Studio. No additional pip installs are required.

export GEMINI_API_KEY="your-key-here"

# Generate a single image with the photo style
python3 scripts/gen.py --prompt "a majestic mountain range" --style photo --count 1

# Use Imagen 3 for high-quality widescreen output
python3 scripts/gen.py --engine imagen --aspect 16:9 --prompt "futuristic city"

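Gemini Image Gen Data Structure and Taxonomy

The skill manages its output through a structured file hierarchy designed for easy integration into Openclaw Skills workflows.

Component	Format	Description
Output folder	Directory	Timestamped directory created for each session (e.g. `outputs/YYYY-MM-DD_HH/`)
Image assets	.png / .jpg	High-resolution generated or edited image files
Web gallery	index.html	Local HTML file generated for browsing batch results
Configuration	Environment variable	Authentication via the GEMINI_API_KEY environment variable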
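
To make the layout above concrete, here is a minimal sketch of how a timestamped session directory and a static `index.html` could be produced with the standard library alone. The function names `make_session_dir` and `write_gallery` and the HTML markup are hypothetical, and the timestamp pattern simply follows the `outputs/YYYY-MM-DD_HH/` example in the table; the skill's real output may differ.

```python
import html
from datetime import datetime
from pathlib import Path


def make_session_dir(base="outputs", now=None):
    """Create and return a timestamped directory such as outputs/2026-03-20_14."""
    now = now or datetime.now()
    session = Path(base) / now.strftime("%Y-%m-%d_%H")
    session.mkdir(parents=True, exist_ok=True)
    return session


def write_gallery(session_dir):
    """Write index.html listing every .png/.jpg in session_dir and return its path."""
    session_dir = Path(session_dir)
    images = sorted(
        p.name for p in session_dir.iterdir() if p.suffix.lower() in (".png", ".jpg")
    )
    # Escape file names so unusual characters cannot break the markup.
    tags = "\n".join(
        f'<img src="{html.escape(name, quote=True)}" alt="{html.escape(name, quote=True)}" width="320">'
        for name in images
    )
    page = f"<!doctype html>\n<title>Gallery</title>\n{tags}\n"
    index = session_dir / "index.html"
    index.write_text(page, encoding="utf-8")
    return index
```

Because the gallery uses relative image paths, the whole session folder can be moved or zipped and the page still opens correctly in a browser.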

name: gemini-image-gen
description: Generate and edit images via Google Gemini API. Supports Gemini native generation, Imagen 3, style presets, and batch generation with HTML gallery. Zero dependencies — pure Python stdlib.
homepage: https://github.com/IISweetHeartII/gemini-image-gen
metadata:
  openclaw:
    emoji: "??"
    category: creative
    requires:
      bins:
        - python3
      env:
        - GEMINI_API_KEY
    primaryEnv: GEMINI_API_KEY
    tags:
      - image-generation
      - gemini
      - imagen
      - ai-art
      - creative
      - editing
      - batch
      - gallery

Gemini Image Gen

Generate and edit images via the Google Gemini API using pure Python stdlib. Supports Gemini native generation + editing, Imagen 3 generation, batch runs, and an HTML gallery output.

Quick Start

export GEMINI_API_KEY="your-key-here"

# Default: Gemini native, 4 random prompts
python3 scripts/gen.py

# Custom prompt
python3 scripts/gen.py --prompt "a cyberpunk cat riding a neon motorcycle through Tokyo at night"

# Imagen 3 engine
python3 scripts/gen.py --engine imagen --count 4 --aspect 16:9

# Edit an existing image (Gemini engine only)
python3 scripts/gen.py --edit path/to/image.png --prompt "change the background to a sunset beach"

# Use a style preset
python3 scripts/gen.py --style watercolor --prompt "floating islands above a calm sea"

# List available styles
python3 scripts/gen.py --styles

Style Presets

Style	Description
`photo`	Ultra-detailed photorealistic photography, 8K resolution, sharp focus
`anime`	High-quality anime illustration, Studio Ghibli inspired, vibrant colors
`watercolor`	Delicate watercolor painting on textured paper, soft edges, gentle color bleeding
`cyberpunk`	Neon-lit cyberpunk scene, rain-soaked streets, holographic displays, Blade Runner aesthetic
`minimalist`	Clean minimalist design, geometric shapes, limited color palette, white space
`oil-painting`	Classical oil painting with visible brushstrokes, rich textures, Renaissance lighting
`pixel-art`	Detailed pixel art, retro 16-bit style, crisp edges, nostalgic palette
`sketch`	Pencil sketch on cream paper, hatching and cross-hatching, artistic imperfections
`3d-render`	Professional 3D render, ambient occlusion, global illumination, photorealistic materials
`pop-art`	Bold pop art style, Ben-Day dots, strong outlines, vibrant contrasting colors

Full CLI Reference

Flag	Default	Description
`--prompt`	(random)	Text prompt. Omit for random creative prompts
`--count`	4	Number of images to generate
`--engine`	gemini	Engine: `gemini` (native, supports edit) or `imagen` (Imagen 3)
`--model`	(auto)	Model override. Default: `gemini-2.5-flash-image` or `imagen-3.0-generate-002`
`--edit`		Path to input image for editing (Gemini engine only)
`--aspect`	1:1	Aspect ratio for Imagen: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`
`--out-dir`	(auto)	Output directory (default is a timestamped folder)
`--style`		Style preset to prepend to the prompt
`--styles`		List available style presets and exit
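
The flags and defaults in the table above map naturally onto `argparse`. The following is a sketch of such a parser, written from the table rather than from the source of `scripts/gen.py`, so the real script may parse its arguments differently; `build_parser` and `parse_args` are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (a sketch, not the real gen.py)."""
    parser = argparse.ArgumentParser(prog="gen.py")
    parser.add_argument("--prompt", default=None, help="text prompt; omit for a random one")
    parser.add_argument("--count", type=int, default=4, help="number of images")
    parser.add_argument("--engine", choices=["gemini", "imagen"], default="gemini")
    parser.add_argument("--model", default=None, help="model override")
    parser.add_argument("--edit", default=None, help="input image path (gemini only)")
    parser.add_argument(
        "--aspect", choices=["1:1", "16:9", "9:16", "4:3", "3:4"], default="1:1"
    )
    parser.add_argument("--out-dir", default=None, help="output directory")
    parser.add_argument("--style", default=None, help="style preset name")
    parser.add_argument("--styles", action="store_true", help="list presets and exit")
    return parser


def parse_args(argv):
    """Parse argv and enforce the documented rule that --edit needs the gemini engine."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.edit and args.engine != "gemini":
        parser.error("--edit is only supported with --engine gemini")
    return args
```

For example, `parse_args(["--engine", "imagen", "--aspect", "16:9"])` yields `engine="imagen"`, `aspect="16:9"`, and the default `count=4`.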

Python Example

import subprocess

subprocess.run(
    [
        "python3",
        "scripts/gen.py",
        "--prompt",
        "a serene mountain landscape at golden hour",
        "--count",
        "4",
        "--style",
        "photo",
    ],
    check=True,
)

Troubleshooting

Missing API key: set GEMINI_API_KEY in your environment and retry.
Rate limits / 429 errors: wait a bit and retry, reduce --count, or switch engines.
Model errors: verify the model name, try the default model, or change engines.
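
The "wait a bit and retry" advice for rate limits can be automated with a small wrapper. The sketch below retries a command with exponential backoff; it assumes that `scripts/gen.py` exits with a non-zero status when a request fails, which is not confirmed by this document, and `run_with_retry` is a hypothetical helper name.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff while it exits non-zero.

    Returns the exit code of the last attempt.
    """
    code = 1
    for attempt in range(attempts):
        code = subprocess.run(cmd).returncode
        if code == 0:
            return 0
        if attempt < attempts - 1:
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return code
```

A call such as `run_with_retry(["python3", "scripts/gen.py", "--prompt", "a serene mountain landscape", "--count", "1"])` combines the retry with the reduced `--count` suggested above.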

Integration with Other Skills

AgentGram — Share your generated images on the AI agent social network! Create visual content and post it to your AgentGram feed.
agent-selfie — Focused on AI agent avatars and visual identity. Uses the same Gemini API key for personality-driven self-portraits.
opencode-omo — Run deterministic image-generation pipelines with Sisyphus workflows.

Changelog

v1.3.1: Added workflow integration guidance for opencode-omo.
v1.1.0: Added style presets, --style and --styles flags, expanded documentation.
v1.0.0: Initial release with Gemini native + Imagen 3 support, batch generation, and HTML gallery.

Repository

https://github.com/IISweetHeartII/gemini-image-gen
