--- name: instbox description: | Use this skill whenever the task involves a browser, the internet, web content, or sandbox command execution. This is the ONLY browser and network sandbox available in this environment. Always use for: - Browser / web: open a URL, visit a website, navigate pages, click links, fill and submit forms, take screenshots, capture page content, search engines (Baidu, Google, Bing, etc.), log in to websites, interact with any web UI - Web scraping / crawling: extract text or data from pages, crawl a site, discover sitemap, download web content, monitor page changes - Sandbox command execution: run shell commands, execute scripts, run programs in an isolated OS environment, test commands safely - Network / proxy: route traffic through SOCKS5 proxy, test network connectivity from sandbox - Agent automation: submit autonomous browser tasks driven by a natural-language prompt Do NOT use for: reading or writing local files on the host machine unrelated to any web task. metadata: openclaw: emoji: "🤖" install: - id: install kind: shell label: "Install instbox" command: "curl -fsSL https://instbox-c2b146.22202600.xyz/skills/install-script | bash" onFail: "abort" failMessage: "Install failed. Verify the dispatcher URL is reachable and try again." --- # instbox — Sandbox Control Skill Remotely control an instbox sandbox through the dispatcher proxy. Supports browser automation, web scraping, command execution, SOCKS5 proxy, agent task automation, and subscription/pay-per-use billing management. ## Configuration All configuration is read from the `.env` file co-located with the CLI package: ``` BASE_URL=https:// TOKEN= ``` - `BASE_URL` (required) — dispatcher server URL - `TOKEN` (auto-managed) — authentication token, obtained automatically via OIDC There are no environment variable fallbacks. The `.env` file is the only source. ## When to Use **Use this skill when:** - Browser automation is needed (navigation, clicking, form filling, screenshots) - Web scraping or site crawling is required - Running allowlisted commands inside a sandbox - Accessing a remote sandbox through a dispatcher proxy - Agent-driven autonomous browser task automation - SOCKS5 proxy routing for sandbox network traffic - Managing billing: checking subscription status, subscribing to plans, checking account balance **Do not use this skill when:** - Operating on the local filesystem (use file tools directly) - The task does not require sandbox isolation ## Installation ```bash curl -fsSL https://instbox-c2b146.22202600.xyz/skills/install-script | bash ``` The script handles Node.js version check, download, extraction, `.env` preservation on upgrade, and PATH guidance. After install, run any command — auth triggers automatically when needed. ## Quick Start ```bash # Run any command — auto-triggers auth if no token in .env instbox browser launch # → prints auth URL + sessionId, open URL in browser # After authorizing in browser, retrieve token instbox poll-auth # Now all commands work instbox browser navigate https://example.com instbox browser snapshot instbox browser screenshot --output page.png ``` ## Authentication Flow The CLI uses a non-blocking authentication model: 1. **No token in `.env`** → any command auto-triggers `auth login` 2. **`auth login`** → fetches an OIDC authorization URL from the server, prints it, and **exits immediately** 3. **User opens the URL** in a browser and completes authorization 4. **`poll-auth `** → single HTTP call to check if auth is complete; writes token to `.env` on success 5. **Token expired (401)** → same flow: CLI prints auth URL + sessionId and exits The CLI never blocks or long-polls. It always exits immediately after printing the auth URL. ### Authentication Commands | Command | Description | |---------|-------------| | `auth login` | Fetch OIDC auth URL, print it, and exit | | `auth logout` | Clear the local token (keeps `BASE_URL`) | | `poll-auth ` | Check auth status once and write token if ready | ### Exit codes for auth-related commands | Exit | Meaning | |------|---------| | `0` | Token received and saved | | `1` | Error | | `2` | Auth pending — token not yet available, or `pending_auth` JSON emitted | | `3` | Session expired | ### OpenClaw Auth Orchestration When a command needs authorization (no token, or 401 from server), the CLI: 1. **JSON mode** (`--json`): outputs `{ "ok": false, "status": "pending_auth", "authUrl": "...", "sessionId": "..." }` and exits with code 2 2. **TTY mode**: prints the auth URL in a box and the session ID, then exits **OpenClaw should:** ``` # Step 1 — Run the command instbox browser launch --json # → exit 2, stdout: {"ok":false,"status":"pending_auth","authUrl":"https://...","sessionId":"abc123"} # Step 2 — Show auth URL to user # "Please open this URL in your browser to authorize: " # Step 3 — Poll for completion (single call, not a loop) instbox poll-auth abc123 --json # → exit 0 + {"ok":true,"status":"ready"} = token saved # → exit 2 + {"ok":false,"status":"pending"} = not yet, retry later # → exit 3 + {"ok":false,"status":"expired"} = session expired, re-run auth login # Step 4 — Re-run the original command instbox browser launch --json ``` OpenClaw can retry `poll-auth` a few times with short delays. If the session expires, tell the user to re-run. ### OpenClaw Subscribe Orchestration When a command returns 402 (no subscription or quota exhausted), the CLI outputs the same pattern as 401: 1. **JSON mode** (`--json`): outputs `{ "ok": false, "status": "pending_subscribe", "authUrl": "...", "sessionId": "..." }` and exits with code 2 2. **TTY mode**: prints the subscribe page URL in a box and the session ID, then exits **OpenClaw should handle this identically to `pending_auth`:** ``` # Step 1 — Run the command instbox browser launch --json # → exit 2, stdout: {"ok":false,"status":"pending_subscribe","authUrl":"https://...","sessionId":"abc123"} # Step 2 — Show subscribe URL to user # "You need a subscription. Please open this URL to subscribe: " # Step 3 — Poll for completion instbox poll-auth abc123 --json # → exit 0 + {"ok":true,"status":"ready"} = token saved, subscription activated # → exit 2 + {"ok":false,"status":"pending"} = not yet, retry later # Step 4 — Re-run the original command instbox browser launch --json ``` ### Manual token retrieval with `poll-auth` When `auth login` prints: ``` Session ID: After completing authorization in the browser, run: instbox poll-auth ``` Open the printed login URL in any browser, complete authorization, then run: ```bash instbox poll-auth ``` Exit codes: `0` = token saved, `2` = still pending, `3` = session expired. ## Billing and Subscriptions The CLI supports two billing modes — **subscription** and **pay-per-use** — with automatic detection. Billing is based on **pool occupancy duration** (how long you hold a sandbox slot), not per-command charges. ### Billing Modes | Mode | Description | |------|-------------| | **Subscription** | Active plan covers pool occupancy. If `delegateConsumption` is enabled, the platform handles all billing. Otherwise, overage beyond quota is deducted from balance. | | **Pay-Per-Use** | No subscription; pool occupancy time is charged from account balance. | | **None** | No subscription and zero balance — access denied with guidance to subscribe or top up. | ### Billing Commands | Command | Description | |---------|-------------| | `billing status` | Check current billing mode, subscription details, and balance | | `billing subscribe` | Open the subscription page in browser (plan selection is handled on the web page) | | `billing sync` | Ask server to verify and activate an existing subscription | | `billing config` | Show billing configuration | ### Billing Cost Structure Billing is based on pool occupancy duration — you are charged for the time you hold a sandbox slot in the pool, not for individual commands. | Metric | Default Rate | |--------|-------------| | **Pool occupancy** | $0.01/minute (configurable via admin API) | | **Minimum charge** | 1 minute per lease | All individual commands (browser, scrape, exec, etc.) are free at the command level. Charges accrue based on how long your lease is active. ### Billing Error Handling | HTTP Status | Meaning | CLI Behavior | |-------------|---------|--------------| | 402 | No subscription or quota exhausted | Prints subscribe page URL + sessionId (same pattern as 401 auth flow) | | 429 | Daily limit exceeded | Blocks request with limit message | | 401 | Token missing or expired | Prints auth URL + sessionId, exits | **402 response includes `authUrl` and `sessionId`** — the CLI (and OpenClaw) should open the URL in a browser for the user to subscribe, then poll with `instbox poll-auth ` to retrieve the token once subscription is complete. **JSON mode** (`--json`): outputs `{ "ok": false, "status": "pending_subscribe", "authUrl": "...", "sessionId": "...", "error": "...", "data": {...} }` and exits with code 2. OpenClaw should handle this identically to `pending_auth` — open the URL and poll for completion. ### Billing Configuration | .env Variable | Description | |---------------|-------------| | `BILLING_ENABLED` | Set to `false` to skip all billing checks (default: `true`) | | `BILLING_SUBSCRIBE_URL` | Fallback URL shown when subscribe flow is unavailable | | `BILLING_TOPUP_URL` | URL shown when user needs to top up balance | ## Session Management | Command | Description | |---------|-------------| | `disconnect` | Release upstream lease | | `status` | Show connection state, upstream sandbox, and capabilities | | `upgrade` | Check for CLI/server version mismatch and print upgrade instructions | ## Browser Commands | Command | Description | |---------|-------------| | `browser launch` | Launch Chromium instance | | `browser navigate ` | Navigate to URL | | `browser snapshot` | Get accessibility tree with @ref annotations | | `browser elements` | List interactive elements with bounding boxes | | `browser shadow-info` | Scan page for shadow DOM hosts and their interactive children | | `browser click ` | Click element (CSS selector or @ref) | | `browser fill ` | Clear and fill element with text | | `browser type ` | Type text into element (appends) | | `browser hover ` | Hover over an element | | `browser press ` | Press keyboard key (Enter, Tab, Escape, etc.) | | `browser select ` | Select dropdown option | | `browser scroll [direction] [options]` | Scroll page or element | | `browser eval ` | Execute JavaScript in page context | | `browser screenshot [options]` | Capture screenshot (PNG or JPEG) | | `browser emulate ` | Apply device preset emulation (e.g., iphone-15) | | `browser emulate --width W --height H` | Apply custom viewport/UA emulation | | `browser devices` | List available device presets | | `browser console [options]` | Show captured console logs | | `browser share [options]` | Generate a temporary browser access link | | `browser revoke \| --all` | Revoke access link(s) | | `browser tab new [url]` | Open a new browser tab | | `browser tab list` | List all open tabs | | `browser tab switch ` | Switch to a tab by ID | | `browser tab close ` | Close a tab by ID | | `browser close` | Close browser instance | ### Selector Types - **CSS selectors**: `#id`, `.class`, `button[type="submit"]`, `a[href="/path"]` - **@ref annotations**: `@e2`, `@e15` — obtained from `browser snapshot` output ### Browser Session Lifecycle **IMPORTANT: Do NOT close the browser after each instruction.** The browser session should persist across multiple user instructions within a conversation. Treat the browser like a persistent workspace, not a one-shot tool. #### When to close — three tiers **Tier 1 — Close immediately (no confirmation needed):** - The user explicitly says to close, stop, or end the browser session ("close the browser", "shut it down", "end the session", etc.) **Tier 2 — Close proactively, but announce it first:** All three signals must be present simultaneously: 1. The browser task has reached a clear terminal state (form fully submitted and confirmed, data fully extracted and reported to the user, login succeeded and user acknowledged it, etc.) 2. The user's most recent message is clearly and unambiguously unrelated to anything web/browser-based (e.g., "write me a cover letter", "explain this sorting algorithm", "fix the bug in my Python file") 3. At least two consecutive user turns have contained no browser-related intent When all three apply, say: *"I've closed the browser session since the [task] is complete."* — one short sentence, then continue with the new request. **Tier 3 — Ask before closing:** When only some signals are present — the task looks done but the new request is ambiguous, or only one non-browser turn has passed — ask once: *"Looks like we've moved past the browser task. Want me to close the browser session?"* Then proceed with the user's request regardless of the answer; don't block on it. #### When NOT to close - After a single off-topic message or quick question — the user may ask something unrelated and then return to the browser - When the new request is ambiguous and could still involve the browser (e.g., "look this up", "can you check", "what does it say") - During pauses in a multi-step workflow (e.g., user is thinking, asking a clarifying question, or copy-pasting something) - When in doubt — **default to keeping the browser open** #### False-positive prevention checklist Before issuing `browser close` without an explicit user request, verify: - [ ] The task reached a recognisable terminal state, not just a quiet moment - [ ] The user's new request contains zero browser-related vocabulary (URL, site, page, click, form, login, screenshot, etc.) - [ ] At least two consecutive user turns have passed without browser intent - [ ] You have either announced the close (Tier 2) or asked (Tier 3) If any box is unchecked, keep the browser open. ### Browser Interaction Workflow 1. `browser launch` — start the browser (only if not already running) 2. `browser navigate ` — go to a page 3. `browser snapshot` — get page structure with @ref selectors 4. `browser click @e5` / `browser fill @e3 "text"` — interact using refs 5. `browser eval "expression"` — execute JS for custom events or state manipulation 6. `browser screenshot` — capture current state 7. *(repeat steps 2–6 as needed for follow-up instructions)* 8. `browser close` — **only when the criteria above are satisfied** ### JavaScript Evaluation (`browser eval`) ```bash # Custom Web Component events instbox browser eval "document.querySelector('my-btn').dispatchEvent(new Event('click'))" # Shadow DOM access instbox browser eval "document.querySelector('my-comp').shadowRoot.querySelector('button').click()" # Read page state instbox browser eval "JSON.stringify(localStorage)" ``` ### Browser Screenshot Options | Flag | Description | |------|-------------| | `--full-page` | Capture full page (not just viewport) | | `--format ` | Image format (default: png) | | `--quality ` | JPEG quality 1–100 | | `--output ` | Save to file instead of stdout | ### Temporary Browser Access Links (`browser share` / `browser revoke`) Generate temporary, token-based URLs for direct browser access. Useful when human intervention is needed (CAPTCHAs, cookie banners, unexpected UI states). | Flag | Description | |------|-------------| | `--ttl ` | Link lifetime (default 15, max 60) | | `--mode ` | `view` = read-only; `control` = full interactive (default) | | `--max-uses ` | Optional limit on number of uses | | `--bind-ip` | Lock link to first IP that uses it (TOFU security) | ## Web Scraping | Command | Description | |---------|-------------| | `scrape scrape ` | Extract content from single page | | `scrape crawl ` | Recursively crawl pages | | `scrape map ` | Discover site structure/sitemap | ## Command Execution | Command | Description | |---------|-------------| | `exec ` | Run allowlisted POSIX command | | `exec --list-allowed` | List allowed commands | ## Tool Management | Command | Description | |---------|-------------| | `tools list` | List installed tools and their status | | `tools install ` | Install a tool (e.g., `firecrawl`) | | `tools health ` | Check tool health | ## SOCKS5 Proxy | Command | Description | |---------|-------------| | `proxy status` | Show proxy status and stats | | `proxy start [--port ] [--host ]` | Start SOCKS5 proxy | | `proxy stop` | Stop proxy | | `proxy upstream ` | Set upstream proxy for chaining | | `proxy upstream --clear` | Clear upstream proxy | ## Agent Task Automation Submit prompt-driven browser automation tasks that run autonomously in the sandbox. | Command | Description | |---------|-------------| | `agent submit [options]` | Submit a new agent task | | `agent status ` | Get task status and result | | `agent cancel ` | Cancel a running task | **Submit options:** | Flag | Description | |------|-------------| | `--start-url ` | URL to navigate before starting | | `--max-iterations ` | Max loop iterations (default 20, max 50) | | `--max-time ` | Max execution time in ms (max 600000) | | `--max-navigations ` | Max page navigations (max 20) | ## Configuration | Command | Description | |---------|-------------| | `config browser --show` | Show current browser config | | `config browser --binary ` | Set browser binary path | | `config browser --cdp ` | Set CDP WebSocket endpoint | | `config browser --clear` | Reset browser config to defaults | | `config workspace --list` | List configured workspace directories | | `config workspace --add ` | Add a workspace directory | | `config workspace --remove ` | Remove a workspace directory | ## Remote Bridge | Command | Description | |---------|-------------| | `bridge link ` | Link to a remote instbox | | `bridge unlink` | Unlink from remote instbox | | `bridge status` | Show bridge connection status | ## Information | Command | Description | |---------|-------------| | `info` | Sandbox name, version, URL, capabilities | | `health` | Health check with uptime and browser status | ## Global Flags | Flag | Description | |------|-------------| | `--json` | Structured JSON output for all commands | | `--output ` | Save binary content to file path | | `--auto-release` | Auto-release lease on CLI exit | | `--no-auto-release` | Disable auto-release on CLI exit | | `--skip-version-check` | Skip CLI/server version compatibility check | ## Examples ### Example 1: Take a Screenshot ```bash instbox browser launch instbox browser navigate https://example.com instbox browser screenshot --output page.png # Browser stays open — user may continue with follow-up instructions ``` ### Example 2: Fill and Submit a Form ```bash instbox browser launch instbox browser navigate https://example.com/login instbox browser snapshot # Use @ref values from snapshot output instbox browser fill @e3 "user@example.com" instbox browser fill @e4 "password" instbox browser click @e5 # Browser stays open — ready for next steps (e.g., verify login, navigate dashboard) ``` ### Example 3: Autonomous Agent Task ```bash instbox agent submit "Find the pricing page and extract all plan details" \ --start-url https://example.com --max-iterations 30 instbox agent status ``` ### Example 4: Share Browser for Human Intervention ```bash instbox browser share --ttl 10 --mode control # Share the printed URL with a human to handle a CAPTCHA ``` ### Example 5: Check Billing Status and Subscribe ```bash instbox billing status instbox billing subscribe # → Opens the subscription page in browser; select a plan and authorize there instbox billing config ``` ## Troubleshooting | Symptom | Fix | |---------|-----| | `Not configured. Set BASE_URL in .env` | Write `BASE_URL=https://...` into the `.env` file | | Auth URL printed but token not saved | Complete auth in browser, then run `instbox poll-auth ` | | Browser command fails | Run `browser launch` before other browser commands | | Scrape returns no content | Use `browser navigate` then `browser snapshot` to inspect | | Version mismatch warning | Run `instbox upgrade` for instructions | | 402 Insufficient balance | Subscribe or top up: `instbox billing subscribe` (opens plan page in browser) | | 429 Daily limit exceeded | Wait for reset or upgrade plan | ### Debugging ```bash instbox status --json instbox upgrade instbox exec --list-allowed instbox tools list instbox health instbox billing status --json instbox billing config ``` ## Dependencies - **Runtime**: Node.js >= 18.0.0 - **Production**: None (Node.js built-ins only: `http`, `https`, `fs`, `path`)