roveflow init installs a /roveflow slash command into your Claude Code project. Invoke it in any Claude Code session whose working directory is the Flutter project root.

Prerequisites

  • The app is running in debug mode on a simulator.
  • .mcp.json points at a working flutter_inspector_mcp binary.
  • docs/roveflow/scenarios.md has a valid cold-setup and at least the scenarios you want to run.

Run configuration

Before the orchestrator relaunches the app, it resolves three values:
| Value | How it’s resolved |
|-------|-------------------|
| flutter_command | fvm flutter if .fvmrc / .fvm/ / fvm_config.json is present; otherwise flutter. |
| flavor | The single subdirectory of ios/config/ if exactly one exists; otherwise omitted. |
| vm_service_port | Parsed from --dart-vm-port=N in .mcp.json; defaults to 8181. |
To override any of these, drop a docs/roveflow/config.yaml (all fields optional):
flutter_command: fvm flutter
flavor: production
vm_service_port: 8181
Use it when auto-detect picks the wrong flavor, when your repo has a non-standard Flutter wrapper, or when you need a non-default port (keep .mcp.json --dart-vm-port in sync).
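The resolution rules above can be sketched in a few lines of Python. This is an illustration of the documented rules, not roveflow's actual implementation: the function name resolve_run_config and the overrides dictionary (standing in for the parsed contents of docs/roveflow/config.yaml) are hypothetical, and the port is found by searching the raw text of .mcp.json rather than assuming a particular JSON layout.

```python
import re
from pathlib import Path


def resolve_run_config(project_root, overrides=None):
    """Resolve flutter_command, flavor, and vm_service_port for a project.

    `overrides` is a hypothetical stand-in for the parsed contents of
    docs/roveflow/config.yaml; every key in it is optional.
    """
    root = Path(project_root)
    overrides = overrides or {}

    # fvm is used when any of its marker files or directories is present.
    fvm_markers = (".fvmrc", ".fvm", "fvm_config.json")
    uses_fvm = any((root / name).exists() for name in fvm_markers)
    flutter_command = "fvm flutter" if uses_fvm else "flutter"

    # The flavor is inferred only when ios/config/ has exactly one subdirectory.
    config_dir = root / "ios" / "config"
    subdirs = []
    if config_dir.is_dir():
        subdirs = [p.name for p in config_dir.iterdir() if p.is_dir()]
    flavor = subdirs[0] if len(subdirs) == 1 else None

    # The port comes from a --dart-vm-port=N argument anywhere in .mcp.json.
    port = 8181
    mcp_file = root / ".mcp.json"
    if mcp_file.is_file():
        match = re.search(r"--dart-vm-port=(\d+)", mcp_file.read_text())
        if match:
            port = int(match.group(1))

    resolved = {
        "flutter_command": flutter_command,
        "flavor": flavor,
        "vm_service_port": port,
    }
    # Explicit overrides win over auto-detection.
    resolved.update({k: v for k, v in overrides.items() if k in resolved})
    return resolved
```

Calling resolve_run_config(".", {"flavor": "production"}) would keep the auto-detected command and port but force the flavor, which mirrors the "all fields optional" behaviour of the config file.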

Flags

Two independent axes, an override, and a recording switch.

--mode=smoke|rove|all

Default smoke. Picks which kind of scenario to run. rove walks the app and emits a screen graph; all runs both.

--criticality=critical|all

Default critical. Only applies to smoke scenarios. Set to all for nightly runs that include criticality: extended scenarios.

--only=<ids>

Comma-separated scenario ids. Bypasses --mode and --criticality. cold-setup is prepended automatically when needed.

--no-record

Skip the full-run .mov capture. Useful for CI or headless sims.

Invocation matrix

| Command | Runs |
|---------|------|
| /roveflow | cold-setup + smoke critical |
| /roveflow --criticality=all | cold-setup + all smoke |
| /roveflow --mode=rove | cold-setup + all rove |
| /roveflow --mode=all | cold-setup + all smoke + all rove |
| /roveflow --only=rove-main | cold-setup + rove-main |
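One way to read the matrix is as a small selection function. The sketch below is illustrative only: the scenario fields kind and criticality, the function name select_scenarios, and the choice to keep declared order under --only are assumptions, not roveflow's schema. It also follows the matrix in treating --mode=all as running every smoke scenario, and it always puts cold-setup first.

```python
COLD_SETUP = "cold-setup"


def select_scenarios(scenarios, mode="smoke", criticality="critical", only=None):
    """Return the scenario ids to run, in declared order, cold-setup first."""

    def keep(scenario):
        if only:
            # --only bypasses --mode and --criticality.
            return scenario["id"] in only
        if scenario["kind"] == "rove":
            return mode in ("rove", "all")
        if mode == "rove":
            return False
        # Smoke scenario: criticality only filters here, and --mode=all
        # is read (per the matrix) as "all smoke + all rove".
        return (
            mode == "all"
            or criticality == "all"
            or scenario.get("criticality") == "critical"
        )

    picked = [s["id"] for s in scenarios if s["id"] != COLD_SETUP and keep(s)]
    # cold-setup is prepended whether or not it was listed explicitly.
    return [COLD_SETUP] + picked


# Hypothetical inventory used only to exercise the function.
inventory = [
    {"id": "cold-setup", "kind": "setup"},
    {"id": "open-detail", "kind": "smoke", "criticality": "critical"},
    {"id": "edit-profile", "kind": "smoke", "criticality": "extended"},
    {"id": "rove-main", "kind": "rove"},
]
```

With this inventory, select_scenarios(inventory, only=["rove-main"]) yields cold-setup followed by rove-main, matching the last row of the matrix.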

Recording the simulator

Every run also captures a full-length .mov of the simulator via xcrun simctl io booted recordVideo. It lands at docs/roveflow/runs/<run-id>/recording.mov alongside report.md. Pass --no-record to skip it:
/roveflow --no-record
Good reasons to skip: no simulator GUI is visible, CI runs on a headless sim, or you just want tighter iteration loops.
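For readers who want to reproduce the capture outside roveflow, the recording can be wrapped roughly as follows. This is a sketch, not the orchestrator's code: the helper names record_command and recording are hypothetical. It relies on the fact that simctl's recordVideo keeps recording until it receives an interrupt (the equivalent of Ctrl-C) and then finalizes the file.

```python
import signal
import subprocess
from contextlib import contextmanager
from pathlib import Path


def record_command(run_dir, no_record=False):
    """Return the argv for the capture, or None when recording is skipped."""
    if no_record:
        return None
    target = Path(run_dir) / "recording.mov"
    return ["xcrun", "simctl", "io", "booted", "recordVideo", str(target)]


@contextmanager
def recording(run_dir, no_record=False):
    """Record the booted simulator for the duration of the with-block."""
    argv = record_command(run_dir, no_record)
    if argv is None:
        # Mirrors --no-record: run the body without starting a capture.
        yield None
        return
    proc = subprocess.Popen(argv)
    try:
        yield proc
    finally:
        # recordVideo writes out the file once it is interrupted.
        proc.send_signal(signal.SIGINT)
        proc.wait()
```

A run would then be wrapped as `with recording("docs/roveflow/runs/<run-id>"):`, with the scenarios executed inside the block.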

Invocation

In Claude Code:
/roveflow
Default: smoke scenarios tagged critical. Pre-release gate.
/roveflow --criticality=all
Every smoke scenario regardless of criticality.
/roveflow --mode=rove
Cold-setup plus every rove scenario. Produces docs/roveflow/maps/<app>/.
/roveflow --mode=all
Everything.
/roveflow --only=cold-setup,open-detail
Explicit subset. Useful while iterating on a single scenario.

What happens under the hood

  1. The orchestrator parses scenarios and filters them by mode and criticality (or by --only if present).
  2. It resets the simulator (shut down, erase, boot, relaunch the app in debug mode). That’s the only step that touches your system directly.
  3. It dispatches a roveflow-runner sub-agent (Sonnet) for cold-setup.
  4. If cold-setup passes, it dispatches one agent per remaining scenario, in declared order, collecting a structured result for each.
  5. Between scenarios it runs reset_to_home; if that fails it has one fallback attempt before aborting with skipped: setup_lost.
  6. It compacts working memory after every 5 scenarios or when the context estimate crosses 60%.
  7. It compiles the results into a markdown report.
See the MCP tools reference for the tools the agent calls, and the Scenario schema for what each run writes to disk.
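The control flow of steps 3 to 5 can be sketched as a plain loop. This is a simplified model under stated assumptions, not the orchestrator itself: run_scenario, reset_to_home, and fallback_reset are hypothetical callables supplied by the caller, results are reduced to a single status string, cold-setup is assumed to leave the app at home so no reset precedes the first scenario, and the simulator reset, memory compaction, and report steps are left out.

```python
def run_all(scenario_ids, run_scenario, reset_to_home, fallback_reset):
    """Run cold-setup, then each remaining scenario; return id -> status."""
    results = {}
    first, rest = scenario_ids[0], scenario_ids[1:]

    results[first] = run_scenario(first)
    if results[first] != "pass":
        # A failed cold-setup aborts the run; later scenarios inherit the skip.
        for sid in rest:
            results[sid] = "skipped: setup_failed"
        return results

    for index, sid in enumerate(rest):
        # Between scenarios: try reset_to_home, then the single fallback.
        if index > 0 and not (reset_to_home() or fallback_reset()):
            for remaining in rest[index:]:
                results[remaining] = "skipped: setup_lost"
            break
        results[sid] = run_scenario(sid)
    return results
```

Because `or` short-circuits, the fallback is attempted only when reset_to_home reports failure, which matches the "one fallback attempt" described in step 5.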

The report

Manual mode

A markdown table is posted to chat:
| scenario               | result | waypoints hit | screenshots | notes  |
|------------------------|--------|---------------|-------------|--------|
| cold-setup             | pass   | 1/1           | 2           |        |
| open-detail            | fail   | 1/2           | 3           | ...    |
Followed by a numbered list of failures and a prompt asking which to file.
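If you post-process results yourself, a table of this shape is straightforward to render. The sketch below assumes each result is a dictionary keyed by the column names shown above; that shape and the function name render_report are assumptions for illustration, not the structured result roveflow collects.

```python
COLUMNS = ["scenario", "result", "waypoints hit", "screenshots", "notes"]


def render_report(rows):
    """Render result rows as a markdown table with padded columns."""
    cells = [[str(row.get(col, "")) for col in COLUMNS] for row in rows]
    # Each column is as wide as its header or its longest cell.
    widths = [
        max([len(col)] + [len(r[i]) for r in cells])
        for i, col in enumerate(COLUMNS)
    ]

    def line(values):
        padded = (v.ljust(w) for v, w in zip(values, widths))
        return "| " + " | ".join(padded) + " |"

    separator = "|" + "|".join("-" * (w + 2) for w in widths) + "|"
    return "\n".join([line(COLUMNS), separator] + [line(r) for r in cells])


# Hypothetical rows mirroring the example table.
example_rows = [
    {"scenario": "cold-setup", "result": "pass", "waypoints hit": "1/1", "screenshots": 2},
    {"scenario": "open-detail", "result": "fail", "waypoints hit": "1/2", "screenshots": 3, "notes": "..."},
]
```

Missing keys, such as notes on a passing scenario, render as empty cells.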

Scheduled mode

The same table is written to docs/roveflow/runs/YYYY-MM-DD-HHMM/report.md, with no chat interaction.
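For scripts that need to locate a scheduled report, the path pattern can be built from a timestamp. This is a sketch: report_path is a hypothetical helper, and whether the directory name uses the run's start time and local time zone is an assumption.

```python
from datetime import datetime
from pathlib import Path


def report_path(project_root, started_at):
    """Return docs/roveflow/runs/YYYY-MM-DD-HHMM/report.md for a timestamp."""
    run_id = started_at.strftime("%Y-%m-%d-%H%M")
    return Path(project_root) / "docs" / "roveflow" / "runs" / run_id / "report.md"
```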

Reading results

  • pass — the scenario’s pass condition was observable on screen.
  • fail — the scenario’s fail condition fired, or the flow was unreachable (crash, blank screen, unrecoverable error toast).
  • skipped: no_data — environmental gap (e.g., no available slots today).
  • skipped: setup_failed — cold-setup didn’t reach home; every later scenario inherits this.
  • skipped: setup_lost — mid-run reset failed; remaining scenarios inherit it.
A scenario that falls back to tap_at surfaces "fragile": true in its notes. If a frequently run scenario is flagged as fragile, consider adding a ValueKey (see Using ValueKeys).
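When triaging a long run, it can help to tally the statuses above and list the fragile scenarios. The sketch below assumes each result is a dictionary with scenario, result, and an optional notes dictionary carrying the fragile flag; that shape and the name summarize are assumptions for illustration only.

```python
from collections import Counter


def summarize(results):
    """Count results by status and list scenarios flagged as fragile."""
    counts = Counter(r["result"] for r in results)
    fragile = [
        r["scenario"]
        for r in results
        if r.get("notes", {}).get("fragile") is True
    ]
    return dict(counts), fragile


# Hypothetical results used only to exercise the function.
sample_results = [
    {"scenario": "cold-setup", "result": "pass"},
    {"scenario": "open-detail", "result": "fail", "notes": {"fragile": True}},
    {"scenario": "book-slot", "result": "skipped: no_data"},
]
```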

Aborts

The orchestrator aborts when:
  • The app fails to boot or the VM service never responds.
  • cold-setup fails.
  • Mid-run reset-to-home fails and the one fallback attempt also fails.
On abort, remaining scenarios are marked skipped: setup_failed or skipped: setup_lost in the report.

Next

Authoring scenarios

Add more flows to the inventory.

Recording demos

Save polished playbacks of your runs.