Skip to main content
Agents are the highest abstraction layer of the Browserbase platform. You describe a goal in natural language, and Browserbase runs a browser agent that browses, searches, works with files, and returns a result. No Playwright scripts, no framework code, and no infrastructure to deploy.

The execution loop

When you create a run, Browserbase starts an autonomous agent that works in a loop until the task is done:
  1. Observe. The agent reads the current page, its memory of prior steps, and the available tools.
  2. Reason. The agent’s model picks the next action: navigate somewhere, click something, extract data, search the web, run a shell command, or declare success.
  3. Act. The chosen tool executes and its result flows back into the agent’s context.
  4. Repeat. The loop continues until the agent decides the task is complete or a terminal state is reached.
You provide the task; Browserbase handles the model, the browser, the tools, and the runtime.

Built-in tools

The agent has access to four tool groups out of the box. You can’t disable or add to them in the current version.

Browser control

The agent uses Stagehand, the SDK for browser agents, to interact with web pages. It navigates to URLs, clicks elements, types into forms, extracts structured data, and waits for page state to settle. Because the agent reasons about each page rather than following a fixed script, it adapts to layout changes and per-site differences that break selectors. This is the primary tool for most runs. Before opening a browser, the agent can search the web to discover URLs or pull quick context, powered by the Search and Fetch APIs. This lets it find the right starting point rather than guessing.

File system

The agent works with real files, not just web pages. It reads and writes files in a sandboxed workspace, processes downloaded content like PDFs, and produces output such as spreadsheets. When the agent triggers a file download, you retrieve the result through the Downloads API.

Shell

The agent can run commands in the sandboxed Runtime that backs each run. When data transformation, scripting, or CLI tools are faster than driving a browser, the agent uses the shell instead.

Reusable agents and runs

An agent is a reusable configuration; a run is a single execution of a task on a browser session of its own. Reuse is what lets you scale across many targets. Instead of writing and maintaining one script per portal, you create an agent once and run it across hundreds of sites.

Create an agent in the dashboard

The fastest way to build an agent is the Agents page in the dashboard. Describe the goal in natural language, set a system prompt and a structured output schema, then trigger runs and inspect each step live. The dashboard also surfaces success rate, average duration, and an Optimize tool that tunes the agent from its past runs.

Create an agent over the API

You can also define an agent programmatically with Create an agent:
  • systemPrompt gives the agent consistent instructions on every run.
  • resultSchema is a JSON Schema that shapes the agent’s output into typed, repeatable JSON. You can set it on the agent or override it per run.
  • variables pass sensitive or per-run values, such as account numbers, dates of birth, or confirmation codes, as %variable% placeholders the agent fills in without seeing them inline.
Either way, trigger a run with Run an agent by passing the agentId. A POST /v1/agents/runs call with no agentId creates a new agent and its first run in a single call, returning both an agentId and a runId.

Run lifecycle

Runs are asynchronous. A run moves through these states after creation:
PENDING → RUNNING → COMPLETED
                  → FAILED
                  → TIMED_OUT
                  → STOPPED
PENDING and RUNNING are active states: the run is either queued or the agent is working. The remaining states are terminal. Once a run reaches one, it won’t change again. Create a run, poll Get a run until it reaches a terminal state, then read the result.

Configuration and controls

Each run carries production-grade controls. Set supported session controls per run through browserSettings:
  • Context lets a run use a Browserbase context and optionally persist it after browsing.
  • Proxies route traffic through Browserbase proxies.
  • Verified enables Browserbase Verified for the session.

The browser session

The agent may skip the browser entirely if it can complete the task through web search or fetch alone.
Every run gets a dedicated Browserbase browser session. Browserbase creates the session when the run starts and closes it when the run ends. You access it through the sessionId in the run response, using the same Session APIs available to any session. Because each run is a real browser session, you get full observability:
  • Live View lets you watch the agent browse in real time.
  • Session Replay lets you review what the agent did in the Session Inspector.
  • Logs capture console output and network activity from the session.
A Browserbase Function provides the serverless compute that backs each run.

Results and messages

When a run reaches COMPLETED, Get a run returns:
  • result: the agent’s structured output. Present when you supplied a resultSchema, and conforming to that schema.
  • sessionId: the browser session ID for replay and debugging.
To follow what the agent did step by step, use List run messages. It returns messages in chronological order, conforming to the AI SDK UIMessage format. Each message has a role and a parts array with typed content blocks (text, tool calls, reasoning, and files). Poll this endpoint while the run is active to stream progress, or call it once after completion for the full transcript.

When to use Agents

Use Agents whenConsider another tool when
You want a browser agent to complete a web task from a natural language goal.You need deterministic, code-driven browser control: use Sessions.
You want to scale across many sites without writing one script per target.You want to deploy custom browser logic as serverless functions: use Functions.
You don’t want to maintain Playwright, Stagehand, model, or runtime orchestration.You only need lightweight read-only context: use Fetch or Search.

Next steps

Agents quickstart

Create and poll your first agent run

Create an agent

Define a reusable agent with a system prompt and result schema

Session inspector

Debug and replay agent sessions

Agent Identity

Get agents past anti-bot systems, CAPTCHAs, and auth walls