The execution loop
When you create a run, Browserbase starts an autonomous agent that works in a loop until the task is done:- Observe. The agent reads the current page, its memory of prior steps, and the available tools.
- Reason. The agent’s model picks the next action: navigate somewhere, click something, extract data, search the web, run a shell command, or declare success.
- Act. The chosen tool executes and its result flows back into the agent’s context.
- Repeat. The loop continues until the agent decides the task is complete or a terminal state is reached.
Built-in tools
The agent has access to four tool groups out of the box. You can’t disable or add to them in the current version.Browser control
The agent uses Stagehand, the SDK for browser agents, to interact with web pages. It navigates to URLs, clicks elements, types into forms, extracts structured data, and waits for page state to settle. Because the agent reasons about each page rather than following a fixed script, it adapts to layout changes and per-site differences that break selectors. This is the primary tool for most runs.Web search
Before opening a browser, the agent can search the web to discover URLs or pull quick context, powered by the Search and Fetch APIs. This lets it find the right starting point rather than guessing.File system
The agent works with real files, not just web pages. It reads and writes files in a sandboxed workspace, processes downloaded content like PDFs, and produces output such as spreadsheets. When the agent triggers a file download, you retrieve the result through the Downloads API.Shell
The agent can run commands in the sandboxed Runtime that backs each run. When data transformation, scripting, or CLI tools are faster than driving a browser, the agent uses the shell instead.Reusable agents and runs
An agent is a reusable configuration; a run is a single execution of a task on a browser session of its own. Reuse is what lets you scale across many targets. Instead of writing and maintaining one script per portal, you create an agent once and run it across hundreds of sites.Create an agent in the dashboard
The fastest way to build an agent is the Agents page in the dashboard. Describe the goal in natural language, set a system prompt and a structured output schema, then trigger runs and inspect each step live. The dashboard also surfaces success rate, average duration, and an Optimize tool that tunes the agent from its past runs.Create an agent over the API
You can also define an agent programmatically with Create an agent:systemPromptgives the agent consistent instructions on every run.resultSchemais a JSON Schema that shapes the agent’s output into typed, repeatable JSON. You can set it on the agent or override it per run.variablespass sensitive or per-run values, such as account numbers, dates of birth, or confirmation codes, as%variable%placeholders the agent fills in without seeing them inline.
agentId. A POST /v1/agents/runs call with no agentId creates a new agent and its first run in a single call, returning both an agentId and a runId.
Run lifecycle
Runs are asynchronous. A run moves through these states after creation:PENDING and RUNNING are active states: the run is either queued or the agent is working. The remaining states are terminal. Once a run reaches one, it won’t change again.
Create a run, poll Get a run until it reaches a terminal state, then read the result.
Configuration and controls
Each run carries production-grade controls. Set supported session controls per run throughbrowserSettings:
- Context lets a run use a Browserbase context and optionally persist it after browsing.
- Proxies route traffic through Browserbase proxies.
- Verified enables Browserbase Verified for the session.
The browser session
The agent may skip the browser entirely if it can complete the task through web search or fetch alone.
sessionId in the run response, using the same Session APIs available to any session.
Because each run is a real browser session, you get full observability:
- Live View lets you watch the agent browse in real time.
- Session Replay lets you review what the agent did in the Session Inspector.
- Logs capture console output and network activity from the session.
Results and messages
When a run reachesCOMPLETED, Get a run returns:
result: the agent’s structured output. Present when you supplied aresultSchema, and conforming to that schema.sessionId: the browser session ID for replay and debugging.
role and a parts array with typed content blocks (text, tool calls, reasoning, and files). Poll this endpoint while the run is active to stream progress, or call it once after completion for the full transcript.
When to use Agents
| Use Agents when | Consider another tool when |
|---|---|
| You want a browser agent to complete a web task from a natural language goal. | You need deterministic, code-driven browser control: use Sessions. |
| You want to scale across many sites without writing one script per target. | You want to deploy custom browser logic as serverless functions: use Functions. |
| You don’t want to maintain Playwright, Stagehand, model, or runtime orchestration. | You only need lightweight read-only context: use Fetch or Search. |
Next steps
Agents quickstart
Create and poll your first agent run
Create an agent
Define a reusable agent with a system prompt and result schema
Session inspector
Debug and replay agent sessions
Agent Identity
Get agents past anti-bot systems, CAPTCHAs, and auth walls