Overview
Use BrowserEnv with the Prime CLI to evaluate browser agents on structured tasks. Each evaluation run spins up Browserbase sessions, feeds observations to your model, and collects reward signals — giving you reproducible benchmarks for browser-capable models.

Prerequisites
Browserbase Account
API key and project ID from your Browserbase dashboard
Prime CLI
Install via `uv add primeverifiers`.
Install with browser extras: `uv add 'verifiers[browser]'`

Install and Configure
Set Browserbase Credentials
Export your Browserbase credentials so BrowserEnv can create sessions:
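The exports might look like this (placeholder values; copy the real key and project ID from your Browserbase dashboard):

```shell
# Placeholder values -- substitute your own credentials
export BROWSERBASE_API_KEY="bb_live_xxxxxxxx"
export BROWSERBASE_PROJECT_ID="xxxxxxxx-xxxx-xxxx"
# DOM mode also needs a key for Stagehand's underlying model
export MODEL_API_KEY="sk-xxxxxxxx"
```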
Install the Prime CLI
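A minimal sketch, assuming the Prime CLI is distributed as the `prime` package for `uv tool install` (adjust to your installer of choice); `prime login` is the authentication step referenced under Environment Variables below:

```shell
# Install the Prime CLI as a standalone tool (package name is an assumption)
uv tool install prime

# Authenticate; this sets up PRIME_API_KEY for sandbox mode
prime login
```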
Install verifiers with Browser Support
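As in the prerequisites, add verifiers with the browser extras to your project (the extra is quoted so the shell does not expand the brackets):

```shell
uv add 'verifiers[browser]'
```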
Choose a BrowserEnv Mode
BrowserEnv supports two observation/action modes. The mode is selected when you run an evaluation, either through the environment's default or via the `-a` flag.
DOM Mode (Recommended)
The agent receives structured DOM content and issues natural-language instructions via Stagehand tools (`navigate`, `observe`, `act`, `extract`). This is the default and works well for most browser tasks.
CUA Mode
The agent receives screenshots and uses coordinate-based tool calls (`click`, `type_text`, `scroll`, `screenshot`). Use this for vision models trained on screenshot-grounded interaction.
CUA mode deploys a sandbox server by default to handle the connection to Browserbase's custom CDP driver, Understudy, which overcomes performance limitations of Playwright. You can also run against a local server with `-a '{"use_sandbox": false}'`. See Operational Notes below.

Run an Evaluation
Install a Hub Environment
Install a published Browserbase environment from the Prime hub.

Run with Default Settings
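A sketch of the install-then-run flow, assuming a hypothetical hub environment id `browserbase/example-env` and the `prime env install` subcommand; the run uses verifiers' `vf-eval` entry point, whose flags match the table below:

```shell
# Install a published environment from the Prime hub (hypothetical id)
prime env install browserbase/example-env

# Run with default settings, specifying only the model
vf-eval browserbase-example-env -m openai/gpt-4.1
```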

Override Evaluation Parameters
Control the number of examples, rollouts, and environment-specific args:

| Flag | Short | Description |
|---|---|---|
| `--model` | `-m` | Model to evaluate (e.g. `openai/gpt-4.1`, `anthropic/claude-opus-4.5`) |
| `--api-key-var` | `-k` | Environment variable name for the model API key |
| `--num-examples` | `-n` | Number of task examples to evaluate |
| `--rollouts-per-example` | `-r` | Rollouts per example |
| `--env-args` | `-a` | JSON args passed to the environment's `load_environment()` |
| `--max-concurrent` | `-c` | Max concurrent requests |
| `--save-results` | `-s` | Save results to disk |
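Putting the flags together (the environment name is hypothetical):

```shell
vf-eval browserbase-example-env \
  -m anthropic/claude-opus-4.5 \
  -k ANTHROPIC_API_KEY \
  -n 10 \
  -r 3 \
  -c 4 \
  -s
```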
Pass Environment Args
Use `-a` to pass JSON arguments to the environment. These are forwarded to the `load_environment()` function:
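For example (hypothetical environment name; `use_sandbox` is documented above, while any other keys depend on the environment's `load_environment()` signature):

```shell
vf-eval browserbase-example-env \
  -m openai/gpt-4.1 \
  -a '{"use_sandbox": false}'
```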
Run a Published Benchmark
Browserbase publishes browser benchmarks on the Prime hub.

Run from a Local Environment
If your environment lives in a local directory, run the evaluation against that path.

Operational Notes
CUA Mode: Sandbox vs Local Server
By default, CUA mode deploys a sandbox server using a pre-built Docker image (`deepdream19/cua-server:latest`) that exposes Browserbase's CDP framework, Understudy. This is the recommended setup. For local development, you can run the CUA server yourself and disable the sandbox.
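A local-development sketch using the image named above; the port mapping and environment name are assumptions, and the `use_sandbox` flag is the one documented earlier:

```shell
# Run the CUA server locally (port is an assumption; check the image docs)
docker run -p 8000:8000 deepdream19/cua-server:latest

# Point the evaluation at the local server instead of a sandbox
vf-eval browserbase-example-env \
  -m openai/gpt-4.1 \
  -a '{"use_sandbox": false}'
```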
Browserbase Proxies and Stealth
Enable Proxies and Stealth Mode via environment args. These are passed through to Browserbase session creation.
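For example (the environment name and the exact arg keys are assumptions; check the environment's `load_environment()` signature for the supported names):

```shell
vf-eval browserbase-example-env \
  -m openai/gpt-4.1 \
  -a '{"proxies": true, "stealth": true}'
```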
Environment Variables
DOM mode requires:
- `BROWSERBASE_API_KEY` — Browserbase API key
- `BROWSERBASE_PROJECT_ID` — Browserbase project ID
- `MODEL_API_KEY` — API key for Stagehand's underlying model

CUA mode requires:
- `BROWSERBASE_API_KEY` — Browserbase API key
- `BROWSERBASE_PROJECT_ID` — Browserbase project ID
- `PRIME_API_KEY` — Required when using sandbox mode (the default). Set via `prime login` or as an env var.
- `OPENAI_API_KEY` — Forwarded into the sandbox container if set
Related Resources
Prime Intellect Evaluating Docs
Full documentation on Prime’s evaluation workflow
Prime verifiers Environments
Source code and docs for verifiers environments
Browserbase Getting Started
Core Browserbase documentation
RL Training Guide
Wire BrowserEnv into Prime RL training workflows