Banana Cog: AI Multi-Image Generation and Orchestration - Openclaw Skills



Author: Internet

2026-04-17


What is Banana Cog (Nano Banana x CellCog)?

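Banana Cog integrates Nano Banana with the CellCog reasoning layer to turn a simple prompt into a complete visual project. Unlike standard image generators that produce a single output, this addition to Openclaw Skills lets an agent plan, verify, and generate a coherent sequence of 10 to 30 images. It uses Gemini-powered reasoning to handle character design, scene progression, and composition review, so that developers and creators who need more than single-image generation get production-grade results.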

Download: https://github.com/openclaw/skills/tree/main/skills/nitishgargiitd/banana-cog

Installation and Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install banana-cog

2. Manual installation

Copy the skill folder to one of the following locations:

Global: ~/.openclaw/skills/
Workspace: /skills/

Priority: workspace > local > built-in
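
The priority rule above can be read as a simple ordered lookup. The following is a minimal sketch under stated assumptions, not OpenClaw's actual loader: the function name `resolve_skill` and the dictionary layout are hypothetical, and only the tier order (workspace, local, built-in) comes from the text.

```python
# Hypothetical sketch of the documented priority: workspace > local > built-in.
# `resolve_skill` and the data layout are illustrative, not OpenClaw's loader.
PRIORITY = ("workspace", "local", "built-in")


def resolve_skill(name, installed):
    """Return the highest-priority tier that provides `name`, or None.

    `installed` maps a tier name to the set of skill names found in that tier.
    """
    for tier in PRIORITY:
        if name in installed.get(tier, set()):
            return tier
    return None


installed = {
    "built-in": {"banana-cog"},
    "workspace": {"banana-cog", "cellcog"},
}
print(resolve_skill("banana-cog", installed))  # workspace
```

Under this reading, a copy of the skill in the workspace shadows any copy of the same name in a lower-priority tier.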

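3. Prompt-based installation

Copy this prompt into OpenClaw to install the skill automatically.

Please install banana-cog using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).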

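Banana Cog (Nano Banana x CellCog) Use Cases

Create consistent character arcs for storyboards and narrative sequences.
Generate professional product mockups and lifestyle hero images for digital marketing.
Develop brand mascots in multiple poses and environments while keeping their identity consistent.
Perform complex style transfers and background modifications for advanced image editing.
Orchestrate high-volume image generation under strict composition and lighting guidelines.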

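How Banana Cog (Nano Banana x CellCog) Works

The reasoning layer analyzes the user prompt to plan the structure and hierarchy of the visual project.
Detailed character designs and scene parameters are fixed before any pixels are generated.
Images are generated with Nano Banana or Gemini models according to the orchestration plan.
The system runs consistency verification so that character identity stays stable across the whole sequence.
An automatic composition review checks each image against specific quality and style standards.
The final coherent image set is delivered through the Openclaw Skills integration.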
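
The six steps above can be sketched as an ordered pipeline. This is an illustration of the described ordering only: the stage names paraphrase the list, while `VisualProject`, `run_pipeline`, and the handler mechanism are hypothetical and say nothing about how CellCog is implemented internally.

```python
# Illustrative only: stage names paraphrase the steps listed above; the
# classes and functions here are hypothetical, not CellCog internals.
from dataclasses import dataclass, field

STAGES = (
    "plan_project",
    "design_characters_and_scenes",
    "generate_images",
    "verify_consistency",
    "review_composition",
    "deliver",
)


@dataclass
class VisualProject:
    prompt: str
    completed: list = field(default_factory=list)


def run_pipeline(prompt, handlers=None):
    """Run every stage in order, calling an optional handler per stage."""
    handlers = handlers or {}
    project = VisualProject(prompt=prompt)
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is not None:
            handler(project)  # a real stage would plan, generate, or verify here
        project.completed.append(stage)
    return project


project = run_pipeline("A brand mascot in five poses")
print(project.completed[0], "->", project.completed[-1])  # plan_project -> deliver
```

The point of the sketch is the ordering: planning and design finish before generation starts, and verification and review run before delivery.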

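Banana Cog (Nano Banana x CellCog) Configuration Guide

To use this skill, you must first install the core CellCog dependency through the CLI. Run the following command to integrate it into your Openclaw Skills environment: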

clawhub install cellcog

Once it is installed, you can trigger image tasks with the following Python pattern:

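# `client` is assumed to be a CellCog SDK client, set up as described in the cellcog skill.
result = client.create_chat(
    prompt="[your image request]",
    notify_session_key="agent:main:main",
    task_label="image-task",
    chat_mode="agent"
)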
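
When several image tasks are issued, it can help to assemble the keyword arguments in one place before calling `create_chat`. The helper below is a sketch: `build_image_request` is a hypothetical name, not part of the CellCog SDK, and it only reuses the four arguments and the two chat modes that appear in this document.

```python
# Sketch: collect the keyword arguments shown above in one place.
# `build_image_request` is a hypothetical helper, not part of the CellCog SDK.
def build_image_request(prompt, chat_mode="agent", task_label="image-task",
                        notify_session_key="agent:main:main"):
    """Return the keyword arguments for a create_chat call as a dict."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if chat_mode not in ("agent", "agent team"):
        raise ValueError(f"unsupported chat_mode: {chat_mode!r}")
    return {
        "prompt": prompt,
        "notify_session_key": notify_session_key,
        "task_label": task_label,
        "chat_mode": chat_mode,
    }


request = build_image_request("Create a cozy autumn café interior with morning light")
# With a configured client this would be passed as: client.create_chat(**request)
print(request["chat_mode"])  # agent
```

Validating the prompt and the mode up front keeps obviously malformed requests from reaching the service.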

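Banana Cog (Nano Banana x CellCog) Data Schema and Taxonomy

Banana Cog manages image generation metadata and output parameters in a structured format. The main specifications supported by the Openclaw Skills ecosystem are:

Dimension	Supported options
Aspect ratio	1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 21:9
Resolution	1K (~1024px), 2K (~2048px), 4K (~4096px)
Visual style	Photorealistic, watercolor, anime, digital art, vector, oil painting
Execution mode	agent (standard), agent team (large batches / brand guidelines)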
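
The options in the table can be checked before a request is written. The following sketch hard-codes the aspect ratios and resolution tiers listed above; `validate_spec` is a hypothetical helper, and nothing here implies that the skill itself exposes such a validation function.

```python
# Sketch: validate requested options against the table above.
# `validate_spec` is a hypothetical helper; the option sets are copied from the table.
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "21:9"}
RESOLUTIONS = {"1K", "2K", "4K"}


def validate_spec(aspect_ratio, resolution):
    """Return a list of problems; an empty list means both options are listed."""
    errors = []
    if aspect_ratio not in ASPECT_RATIOS:
        errors.append(f"unsupported aspect ratio: {aspect_ratio}")
    if resolution not in RESOLUTIONS:
        errors.append(f"unsupported resolution: {resolution}")
    return errors


print(validate_spec("16:9", "2K"))  # []
print(validate_spec("5:4", "8K"))
```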

name: banana-cog
description: "Banana Cog × CellCog. Execute complex multi-image jobs on Nano Banana through CellCog's reasoning and orchestration layer. 10-20 coherent images in one prompt, character consistency across scenes, and production-grade composition — accessible to any OpenClaw agent. Nano Banana AI, Nano Banana Pro, Gemini image generation."
metadata:
  openclaw:
    emoji: "??"
author: CellCog
dependencies: [cellcog]

Banana Cog — Nano Banana × CellCog

Nano Banana × CellCog. Complex multi-image jobs, executed perfectly, from a single prompt.

Nano Banana is an incredible image model. CellCog makes it do things you can't do by calling it directly — orchestrating 10, 20, even 30 coherent images in one request with consistent characters, planned compositions, and intelligent scene progression. Not single images — complete visual projects.

What CellCog adds on top of Nano Banana:

Reasoning → Scene Planning → Character Design → Image Generation
    → Consistency Verification → Composition Review → Delivery

CellCog's reasoning layer plans scenes before a single pixel is generated — selecting optimal parameters, maintaining character identity across sequences, and orchestrating complex multi-image workflows. This is the difference between "generate an image" and "execute a visual project."

Prerequisites

This skill requires the cellcog skill for SDK setup and API calls.

clawhub install cellcog

Read the cellcog skill first for SDK setup. This skill shows you what's possible.

Quick pattern (v1.0+):

# `client` is assumed to be a CellCog SDK client, set up as described in the cellcog skill.
result = client.create_chat(
    prompt="[your image request]",
    notify_session_key="agent:main:main",
    task_label="image-task",
    chat_mode="agent"
)

What You Can Create

Photorealistic Image Generation

Create stunning images from text descriptions:

Portraits: "Create a professional headshot with warm studio lighting"
Product Shots: "Generate a hero image for a premium smartwatch on a dark surface"
Scenes: "Create a cozy autumn café interior with morning light"
Food Photography: "Generate an overhead shot of a colorful Buddha bowl"

Character Consistency

Nano Banana excels at maintaining character identity across multiple images — and CellCog's orchestration takes this further by planning entire character arcs:

Character Series: "Create a tech entrepreneur character, then show them in 4 different scenes"
Brand Mascots: "Design a mascot and generate it in multiple poses and contexts"
Story Sequences: "Create a character and illustrate them across 5 story beats"
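
A character series request like the ones above can be assembled from a character description and a list of scenes. This is a sketch of one possible prompt layout, not a format the skill requires: `build_series_prompt` is a hypothetical helper, and the phrase "the same character" follows the tip given later in this document.

```python
# Sketch: build one multi-scene prompt from a character and a list of scenes.
# `build_series_prompt` is hypothetical; the skill does not mandate this layout.
def build_series_prompt(character, scenes):
    """Describe the character once, then list each scene on its own line."""
    if not scenes:
        raise ValueError("at least one scene is required")
    lines = [
        f"Create a character: {character}.",
        f"Then show the same character in {len(scenes)} different scenes:",
    ]
    lines += [f"{i}. {scene}" for i, scene in enumerate(scenes, start=1)]
    return "\n".join(lines)


prompt = build_series_prompt(
    "a tech entrepreneur in her 30s, short dark hair, green jacket",
    ["pitching to investors", "coding late at night", "speaking at a conference"],
)
print(prompt)
```

Describing the character once and referring back to it keeps the identity details in a single place, which is easier to keep consistent than repeating them per scene.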

Multi-Image Composition

Blend elements from multiple reference images:

Style Fusion: "Combine the color palette of image A with the composition of image B"
Character Placement: "Place this person into a new environment while preserving their likeness"
Product Mockups: "Put this product into a lifestyle setting"

Image Editing

Transform and enhance existing images:

Style Transfer: "Transform this photo into a Studio Ghibli illustration"
Background Swap: "Place this product on a clean marble surface"
Enhancement: "Add dramatic lighting and cinematic color grading"
Modification: "Change the season from summer to winter in this landscape"

Image Specifications

Specification	Options
Aspect Ratios	1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 21:9
Sizes	1K (~1024px), 2K (~2048px), 4K (~4096px)
Styles	Photorealistic, illustration, watercolor, oil painting, anime, digital art, vector
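
To get a feel for what these options mean in pixels, the size tier can be combined with the aspect ratio. The sketch below assumes the quoted pixel figure is the length of the longer edge, which the source does not state; `approximate_dimensions` is a hypothetical helper and the real output sizes may differ.

```python
# Sketch under an assumption: the "~1024px" style figures are treated as the
# longer edge. The source does not say this; actual output sizes may differ.
SIZE_LONG_EDGE = {"1K": 1024, "2K": 2048, "4K": 4096}


def approximate_dimensions(aspect_ratio, size):
    """Return an approximate (width, height) for a ratio like '16:9'."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    long_edge = SIZE_LONG_EDGE[size]
    if w >= h:
        return long_edge, round(long_edge * h / w)
    return round(long_edge * w / h), long_edge


print(approximate_dimensions("16:9", "2K"))  # (2048, 1152)
print(approximate_dimensions("9:16", "1K"))  # (576, 1024)
```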

Chat Mode

Scenario	Recommended Mode
Single images, quick edits	`"agent"`
Character-consistent series, complex compositions	`"agent"`
Large sets with brand guidelines	`"agent team"`

Use "agent" for most image work.
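
The table reduces to a one-line rule. The sketch below encodes it with a hypothetical helper, `recommend_chat_mode`, reading "large sets with brand guidelines" as requiring both conditions; that reading is an assumption.

```python
# Sketch of the table above. `recommend_chat_mode` is a hypothetical helper;
# requiring BOTH conditions for "agent team" is one reading of the table.
def recommend_chat_mode(large_set=False, has_brand_guidelines=False):
    """Return the chat_mode string recommended for the given scenario."""
    if large_set and has_brand_guidelines:
        return "agent team"
    return "agent"


print(recommend_chat_mode())  # agent
print(recommend_chat_mode(large_set=True, has_brand_guidelines=True))  # agent team
```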

Tips for Better Images

Be descriptive: "Woman in office" → "Confident woman in her 40s, silver blazer, modern glass-walled office, warm afternoon light"
Specify style: "photorealistic", "digital illustration", "watercolor", "anime"
Describe lighting: "Soft natural light", "dramatic side lighting", "golden hour glow"
For character consistency: Describe the character in detail first, then reference "the same character" in subsequent prompts.
Include composition: "Rule of thirds", "close-up portrait", "wide establishing shot"
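
The tips above amount to filling in a few slots: subject, style, lighting, and composition. As a sketch, a small helper can join whichever slots are provided; `build_prompt` is a hypothetical name, and the comma-separated layout is just one reasonable choice.

```python
# Sketch: combine the prompt elements from the tips into one description.
# `build_prompt` is a hypothetical helper; the comma-joined layout is a choice.
def build_prompt(subject, style=None, lighting=None, composition=None):
    """Join the subject with any optional style, lighting, and composition cues."""
    parts = [subject.strip()]
    for extra in (style, lighting, composition):
        if extra:
            parts.append(extra.strip())
    return ", ".join(parts)


prompt = build_prompt(
    "Confident woman in her 40s, silver blazer, modern glass-walled office",
    style="photorealistic",
    lighting="warm afternoon light",
    composition="close-up portrait",
)
print(prompt)
```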
