mirroir: Control and Automate an iPhone from macOS - Openclaw Skills

Author: Internet

2026-03-30

What is mirroir?

mirroir is an advanced MCP server designed for seamless interaction with a real iPhone through the native macOS iPhone Mirroring app. By integrating these capabilities into Openclaw Skills, developers and power users can perform high-fidelity automation tasks such as screen taps, swipes, text entry, and app launches, with no source-code access or jailbroken device required. It acts as a bridge between the desktop environment and iOS, providing a powerful toolset for mobile-centric workflows.

The skill excels at connecting vision to action. It uses advanced OCR to "read" the screen and Karabiner-Elements to simulate hardware-level input. This makes it ideal for users who need to run tasks on their phone (checking a Waze ETA, sending an iMessage, or testing an Expo Go app) while staying focused on their macOS workstation.

Download: https://github.com/openclaw/skills/tree/main/skills/jfarcand/mirroir

Installation & Download

1. ClawHub CLI

The fastest way to install the skill directly from the source.

npx clawhub@latest install mirroir

2. Manual Installation

Copy the skill folder to one of the following locations:

  • Global mode: ~/.openclaw/skills/
  • Workspace: /skills/

Priority: workspace > local > built-in

3. Prompt Installation

Copy this prompt into OpenClaw to install automatically.

Please help me install mirroir using Clawhub. If Clawhub is not installed yet, install it first (npm i -g clawhub).

mirroir Use Cases

  • Automated mobile app testing for developers working with TestFlight, Expo Go, or retail App Store apps.
  • Visual bug reports and documentation created through automated screen recordings and screenshots.
  • Managing iPhone-only notifications and messaging apps (e.g., WhatsApp, Messages) directly from the terminal or an AI assistant.
  • Executing complex, multi-step iOS workflows such as account setup, navigation settings, or data entry in mobile-first apps.
  • Testing network-dependent features by programmatically toggling Wi-Fi, cellular, and Airplane Mode.

How mirroir Works

  1. The workflow begins with a status check to verify that an iPhone Mirroring session is active and the device is responsive.
  2. OCR generates a screen description that identifies text elements and the exact coordinates of buttons, usually including a grid-overlaid screenshot for visual reference.
  3. Based on the identified elements, the skill executes input commands such as tapping specific x/y coordinates, swiping between points, or typing text through a virtual keyboard driver.
  4. For complex navigation, the skill can launch apps via Spotlight or scroll through a list until a specific semantic label becomes visible.
  5. Each sequence can be recorded, or verified with a final screenshot, to confirm the automation reached its goal.

mirroir Setup Guide

To integrate this skill into your Openclaw Skills library, use the one-line installer, which configures both the helper daemon and the required Karabiner driver:

/bin/bash -c "$(curl -fsSL https://mirroir.dev/get-mirroir.sh)"

Alternatively, install manually via Homebrew or npx:

# Install via Homebrew
brew tap jfarcand/tap && brew install iphone-mirroir-mcp

# Install via npx
npx -y iphone-mirroir-mcp install

After installation, be sure to grant the "Screen Recording" and "Accessibility" permissions in System Settings and enable the Karabiner-Elements extension.

mirroir Data Schema & Taxonomy

The skill manages device-interaction data through a structured taxonomy of visual and behavioral assets:

Data Type        | Description                                         | Format
Visual data      | Coordinate-mapped OCR text and UI element positions | JSON
Visual assets    | High-resolution captures of the mirroring window    | PNG
Media logs       | Video recordings of automation session workflows    | MOV
Automation logic | Scenario definitions based on semantic intent       | YAML
Device state     | Connection status, orientation, and window geometry | Metadata

name: mirroir
description: Control a real iPhone through macOS iPhone Mirroring — screenshot, tap, swipe, type, launch apps, record video, OCR, and run multi-step scenarios. Works with any app on screen, no source code or jailbreak required.
homepage: https://mirroir.dev
metadata:
  {
    "openclaw":
      {
        "emoji": "??",
        "os": ["darwin"],
        "requires": { "bins": ["iphone-mirroir-mcp"] },
        "install":
          [
            {
              "id": "brew",
              "kind": "brew",
              "formula": "jfarcand/tap/iphone-mirroir-mcp",
              "bins": ["iphone-mirroir-mcp"],
              "label": "Install mirroir via Homebrew",
            },
            {
              "id": "node",
              "kind": "node",
              "package": "iphone-mirroir-mcp",
              "bins": ["iphone-mirroir-mcp"],
              "label": "Install mirroir (npx)",
            },
          ],
      },
  }

Mirroir — iPhone Control via iPhone Mirroring

Use mirroir to control a real iPhone through macOS iPhone Mirroring. Screenshot, tap, swipe, type, launch apps, record video, OCR the screen, and run multi-step automation scenarios — all from the terminal. Works with any app on screen, no source code or jailbreak required.

When to Use

✅ USE this skill when:

  • User wants to interact with their iPhone (tap, swipe, type, navigate)
  • Sending an iMessage, WhatsApp, or any messaging app message on the iPhone
  • Adding calendar events, reminders, or notes on the iPhone
  • Testing a mobile app (Expo Go, TestFlight, App Store apps)
  • Taking a screenshot of the iPhone screen
  • Recording a video of an iPhone interaction
  • Reading what's on the iPhone screen (OCR)
  • Automating a multi-step iPhone workflow (login flows, app navigation)
  • Checking iPhone settings or toggling network modes
  • Launching an app on the iPhone
  • User says "on my phone", "on my iPhone", "on iOS"

When NOT to Use

❌ DON'T use this skill when:

  • User wants to send iMessage from macOS Messages.app → use imsg skill
  • User wants to manage Apple Reminders on macOS → use apple-reminders skill
  • User wants to manage Apple Notes on macOS → use apple-notes skill
  • User wants to automate macOS UI → use peekaboo skill
  • User wants to control a camera → use camsnap skill
  • The task can be done entirely on macOS without the iPhone
  • iPhone Mirroring is not connected (check with mirroir status first)

Requirements

  • macOS 15+ (Sequoia or later)
  • iPhone connected via iPhone Mirroring
  • Karabiner-Elements (installed automatically by the mirroir installer)
  • Screen Recording and Accessibility permissions granted

Setup

After installing, run the setup to configure the helper daemon and Karabiner:

# One-line install (recommended)
/bin/bash -c "$(curl -fsSL https://mirroir.dev/get-mirroir.sh)"

# Or via Homebrew
brew tap jfarcand/tap && brew install iphone-mirroir-mcp

# Or via npx
npx -y iphone-mirroir-mcp install

Approve the Karabiner DriverKit extension if prompted: System Settings > General > Login Items & Extensions — enable all toggles under Karabiner-Elements.

MCP Server Configuration

Mirroir is an MCP server. Configure it in your OpenClaw MCP settings:

{
  "mirroir": {
    "command": "npx",
    "args": ["-y", "iphone-mirroir-mcp"]
  }
}

Or if installed via Homebrew, use the binary path directly:

{
  "mirroir": {
    "command": "iphone-mirroir-mcp"
  }
}

Core Workflow

The typical workflow for any iPhone task:

  1. Check status: mirroir status — verify iPhone Mirroring is connected
  2. See the screen: mirroir describe_screen — OCR the screen to find tap targets
  3. Act: tap, swipe, type, launch apps based on what's visible
  4. Verify: take a screenshot or describe the screen again to confirm
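
The four-step loop can be sketched in Python. This is an illustrative harness, not part of mirroir: `call_tool` is a hypothetical stand-in for however your MCP client invokes the server's tools, stubbed here with canned responses so the control flow is visible.

```python
# Illustrative sketch of the status -> describe -> act -> verify loop.
# `call_tool` is a hypothetical stand-in for an MCP client call; the
# canned responses below are invented for demonstration only.

CANNED = {
    "status": {"connected": True},
    "describe_screen": {"elements": [{"text": "New Message", "x": 320, "y": 96}]},
    "tap": {"ok": True},
    "screenshot": {"path": "/tmp/after.png"},
}

def call_tool(name, **args):
    """Pretend MCP call: return a canned response for the named tool."""
    return CANNED[name]

def run_workflow(target_label):
    log = []
    # 1. Check status: bail out if iPhone Mirroring is not connected.
    if not call_tool("status")["connected"]:
        return log
    log.append("status: connected")
    # 2. See the screen: OCR to find the tap target.
    elements = call_tool("describe_screen")["elements"]
    match = next(e for e in elements if e["text"] == target_label)
    # 3. Act: tap the coordinates OCR reported.
    call_tool("tap", x=match["x"], y=match["y"])
    log.append(f"tapped {target_label} at ({match['x']}, {match['y']})")
    # 4. Verify: take a screenshot to confirm the result.
    log.append("verified via " + call_tool("screenshot")["path"])
    return log

print(run_workflow("New Message"))
```

A real session replaces the stub with live tool calls and repeats steps 2-4 until the goal is confirmed on screen.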

Available Tools (26 total)

Screen & Vision

  • screenshot — Capture the iPhone screen as PNG
  • describe_screen — OCR the screen, returns text elements with exact tap coordinates plus a grid-overlaid screenshot
  • get_orientation — Report portrait/landscape and window dimensions
  • status — Connection state, window geometry, device readiness
  • check_health — Full diagnostic: mirroring, helper, Karabiner, screen capture

Input

  • tap x y — Tap at coordinates
  • double_tap x y — Two rapid taps (zoom, text selection)
  • long_press x y — Hold tap for context menus (default 500ms)
  • swipe from_x from_y to_x to_y — Swipe between two points
  • drag from_x from_y to_x to_y — Slow drag for icons, sliders
  • type_text "text" — Type text via Karabiner virtual keyboard
  • press_key key [modifiers] — Send special keys (return, escape, tab, arrows) with optional modifiers (command, shift, option, control)
  • shake — Trigger shake gesture (Ctrl+Cmd+Z) for undo/dev menus
  • launch_app "AppName" — Open app via Spotlight search
  • open_url "https://..." — Open URL in Safari
  • press_home — Go to home screen
  • press_app_switcher — Open app switcher
  • spotlight — Open Spotlight search
  • scroll_to "label" — Scroll until a text element becomes visible via OCR
  • reset_app "AppName" — Force-quit app via App Switcher

Recording & Measurement

  • start_recording — Start video recording of the mirrored screen
  • stop_recording — Stop recording and return the .mov file path
  • measure action until [max_seconds] — Time a screen transition

Network & Scenarios

  • set_network mode — Toggle airplane/Wi-Fi/cellular via Settings
  • list_scenarios — List available YAML automation scenarios
  • get_scenario "name" — Read a scenario file

Coordinates

Coordinates are in points relative to the mirroring window's top-left corner. Always use describe_screen first to get exact tap coordinates via OCR. The grid overlay helps target unlabeled icons (back arrows, gears, stars).
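
A small helper makes the "describe before you tap" habit concrete. The element shape below (`text`, `x`, `y` keys) is an assumption for illustration; inspect the actual JSON your `describe_screen` call returns before relying on it.

```python
# Pick a tap target out of OCR results by label.
# The element shape ({"text", "x", "y"}) is assumed for illustration;
# check the real describe_screen output for the actual schema.

def find_tap_target(elements, label):
    """Return (x, y) of the first OCR element whose text contains `label`,
    or None if nothing on screen matches (caller should scroll or retry)."""
    for el in elements:
        if label.lower() in el["text"].lower():
            return el["x"], el["y"]
    return None

# Sample OCR output in the assumed shape (invented values):
elements = [
    {"text": "Sign In", "x": 196, "y": 612},
    {"text": "Forgot password?", "x": 196, "y": 660},
]

print(find_tap_target(elements, "sign in"))  # coordinates to pass to tap
```

If the helper returns None, fall back to scroll_to or the grid-overlay screenshot rather than guessing coordinates.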

Examples

Send an iMessage on iPhone

1. launch_app "Messages"
2. describe_screen → find "New Message" button coordinates
3. tap [x] [y] on "New Message"
4. type_text "Alice"
5. describe_screen → find Alice in suggestions
6. tap [x] [y] on Alice
7. tap [x] [y] on the message field
8. type_text "Running 10 min late"
9. press_key return
10. screenshot → confirm sent

Test a login flow

1. launch_app "MyApp"
2. describe_screen → find Email field
3. tap [x] [y] on Email
4. type_text "${TEST_EMAIL}"
5. tap [x] [y] on Password
6. type_text "${TEST_PASSWORD}"
7. tap [x] [y] on "Sign In"
8. describe_screen → verify "Welcome" appears

Running late — check Waze ETA and notify the team on Slack

1. launch_app "Waze"
2. describe_screen → read ETA to current destination (e.g. "23 min")
3. press_home
4. launch_app "Slack"
5. describe_screen → find target channel
6. tap [x] [y] on "#standup"
7. tap [x] [y] on message field
8. type_text "Heads up — Waze says 23 min out, be there by 9:25"
9. press_key return
10. screenshot → confirm sent

Record a bug reproduction

1. start_recording
2. launch_app "Settings"
3. scroll_to "General"
4. tap [x] [y] on "General"
5. scroll_to "About"
6. tap [x] [y] on "About"
7. stop_recording → returns path to .mov file

Scenarios (YAML Automation)

Mirroir supports YAML scenario files for multi-step automation flows. Scenarios describe intents, not coordinates — the AI reads the steps and executes them using the MCP tools above, adapting to what's actually on screen.

name: Expo Go Login Flow
app: Expo Go
description: Test the login screen of an Expo Go app with valid credentials

steps:
  - launch: "Expo Go"
  - wait_for: "${APP_SCREEN:-LoginDemo}"
  - tap: "${APP_SCREEN:-LoginDemo}"
  - wait_for: "Email"
  - tap: "Email"
  - type: "${TEST_EMAIL}"
  - tap: "Password"
  - type: "${TEST_PASSWORD}"
  - tap: "Sign In"
  - assert_visible: "Welcome"
  - screenshot: "login_success"

The step labels (launch, wait_for, tap, type, assert_visible, screenshot) are semantic intents — the AI interprets each one and calls the appropriate MCP tools (launch_app, describe_screen, tap, type_text, screenshot, etc.) to carry them out.
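
One way to picture that interpretation step is a thin dispatcher mapping each semantic intent to the MCP tool listed above. The mapping below is a sketch of that correspondence, not mirroir's actual interpreter, which also re-reads the screen between steps and adapts to what it finds.

```python
# Sketch: map YAML scenario intents to the MCP tools that carry them out.
# Illustrative only; the real interpreter adapts to live screen state.

INTENT_TO_TOOL = {
    "launch": "launch_app",
    "wait_for": "describe_screen",   # poll OCR until the label appears
    "tap": "tap",                    # after describe_screen resolves coordinates
    "type": "type_text",
    "assert_visible": "describe_screen",
    "screenshot": "screenshot",
}

def plan(steps):
    """Turn a parsed scenario step list into (tool, argument) pairs."""
    calls = []
    for step in steps:
        (intent, arg), = step.items()  # each step is a one-key mapping
        calls.append((INTENT_TO_TOOL[intent], arg))
    return calls

steps = [{"launch": "Expo Go"}, {"tap": "Email"}, {"type": "user@example.com"}]
print(plan(steps))
```
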

Use list_scenarios to discover available scenarios and get_scenario to load them.

Tips

  • Always call describe_screen before tapping — never guess coordinates.
  • Use scroll_to "label" to find off-screen elements instead of manual swiping.
  • After typing, iOS autocorrect may alter text — type carefully or disable autocorrect on the iPhone.
  • Use reset_app before launch_app to ensure a fresh app state.
  • For keyboard shortcuts in apps, use press_key with modifiers (e.g., press_key n [command] for new message in Mail).
  • describe_screen with skip_ocr: true returns only the grid screenshot, letting your vision model identify icons and images OCR can't read.

Troubleshooting

  • "iPhone Mirroring not found" → Open iPhone Mirroring.app manually, ensure your iPhone is paired
  • Taps not registering → Check Karabiner DriverKit extension is approved in System Settings
  • Screenshot permission denied → Grant Screen Recording permission to your terminal
  • Helper not running → Run npx iphone-mirroir-mcp setup to reinstall the helper daemon