Overview

This guide walks you through integrating OpenAI’s Computer Use Agent (CUA) with Browserbase for cloud-based browser automation.

CUA is an AI model that can see the screen, understand context, and take actions within a browser, which lets it automate and interact with web applications. Pairing CUA with Browserbase’s remote browser infrastructure lets you run that automation in the cloud instead of on a local machine.
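The see, decide, act cycle described above can be pictured as a simple loop. The sketch below is only an illustration of that idea: `FakeBrowser`, `scripted_model`, and `run_loop` are hypothetical stand-ins invented for this example, not part of the OpenAI or Browserbase APIs, and a real agent would exchange actual screenshots with the model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a remote browser session."""
    url: str = "about:blank"
    log: list = field(default_factory=list)

    def screenshot(self):
        # A real browser would return image bytes; this stub returns a description.
        return f"screenshot of {self.url}"

    def perform(self, action):
        # Record the action and apply the only one this stub understands.
        self.log.append(action)
        if action["type"] == "goto":
            self.url = action["url"]


def scripted_model(task, screenshot):
    """Hypothetical stand-in for the model: maps what it 'sees' to the next action."""
    if "about:blank" in screenshot:
        return {"type": "goto", "url": "https://news.ycombinator.com"}
    return {"type": "done", "answer": f"finished '{task}' at {screenshot}"}


def run_loop(task, browser, model, max_steps=10):
    """Repeat: capture the screen, ask the model for an action, execute it."""
    for _ in range(max_steps):
        action = model(task, browser.screenshot())
        if action["type"] == "done":
            return action["answer"]
        browser.perform(action)
    raise RuntimeError("step limit reached before the task finished")
```

Calling `run_loop("open Hacker News", FakeBrowser(), scripted_model)` runs two iterations: one that navigates, and one in which the stub model reports that it is done.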

Try out the Computer Use Agent now: cua.browserbase.com

Prerequisites

  • OpenAI API key with Computer Use Agent access
  • Browserbase account and API key
  • Python 3.8+

Basic Integration

This basic setup will get you up and running with a CUA agent using Browserbase as the underlying browser automation platform.

1. Clone the repository

git clone https://github.com/openai/openai-cua-sample-app.git
cd openai-cua-sample-app
2. Install the required packages

pip install -r "requirements.txt"
3. Set the environment variables

BROWSERBASE_PROJECT_ID=YOUR_PROJECT_ID
BROWSERBASE_API_KEY=YOUR_API_KEY
OPENAI_API_KEY=YOUR_OPENAI_API_KEY
OPENAI_ORG=YOUR_OPENAI_ORG
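Before running the agent, it can help to confirm that these variables are actually visible to Python. The following is a minimal sketch, not part of the sample app: `missing_env_vars` is a hypothetical helper, and treating all four variables as mandatory is an assumption based on the list above.

```python
import os

# Variable names taken from the step above. Treating every one of them as
# mandatory is an assumption made for this sketch.
REQUIRED_VARS = (
    "BROWSERBASE_PROJECT_ID",
    "BROWSERBASE_API_KEY",
    "OPENAI_API_KEY",
    "OPENAI_ORG",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means every variable is set.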
4. Run the agent

Change the --input prompt in the command below to change what the agent does:

python cli.py --computer browserbase --input "go to hackernews, tell me the top news"

Customizing the CUA Agent

You can customize the agent's behavior with the following CLI flags:

  • --input: The initial input to the agent. Optional; the CLI prompts you for input if it is not provided.
  • --debug: Enable debug mode.
  • --show: Show images (screenshots) during execution.
  • --start-url: Start the browsing session with a specific URL (only for browser environments). By default, the CLI will start the browsing session with https://bing.com.
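If you launch the agent from another script, the flags above can be assembled programmatically. This is a rough sketch rather than part of the sample app: `build_cli_command` and `run_agent` are hypothetical helpers, and the code assumes that --debug and --show are boolean switches that take no value and that the script runs from the cloned repository directory.

```python
import subprocess
import sys


def build_cli_command(prompt=None, start_url=None, debug=False, show=False):
    """Assemble the argv list for cli.py using only the flags listed above."""
    cmd = [sys.executable, "cli.py", "--computer", "browserbase"]
    if prompt is not None:
        cmd += ["--input", prompt]
    if start_url is not None:
        cmd += ["--start-url", start_url]
    # Assumption: --debug and --show are switches that take no value.
    if debug:
        cmd.append("--debug")
    if show:
        cmd.append("--show")
    return cmd


def run_agent(**options):
    """Run cli.py in the current directory and return its exit code."""
    return subprocess.run(build_cli_command(**options)).returncode
```

For example, `run_agent(prompt="go to hackernews, tell me the top news", debug=True)` would be roughly equivalent to the command in step 4 with --debug appended.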
