OpenAI Computer Use Agent
Integrate OpenAI CUA with Browserbase for scalable browser automation
Overview
This guide walks you through integrating OpenAI’s Computer Use Agent (CUA) with Browserbase for seamless cloud-based browser automation.
CUA is an AI model that can see the screen, understand context, and take actions within a browser, which lets it automate and interact with web applications. Pairing CUA with Browserbase’s remote browser infrastructure lets you run this AI-powered automation in the cloud and scale it across sessions.
Try out the Computer Use Agent now: cua.browserbase.com
Prerequisites
- OpenAI API key with Computer Use Agent access
- Browserbase account and API key
- Python 3.8+
Basic Integration
This basic setup will get you up and running with a CUA agent using Browserbase as the underlying browser automation platform.
Clone the repository
Install the required packages
Set the environment variables
Run the agent
To change the agent's behavior, change the prompt you enter in the CLI.
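Before running the agent, it can help to confirm that the environment variables are actually set. The following is a minimal sketch, not part of the repository. The variable names `OPENAI_API_KEY`, `BROWSERBASE_API_KEY`, and `BROWSERBASE_PROJECT_ID` are assumptions here, and `missing_env_vars` is a hypothetical helper. Check the repository's own instructions for the exact names it expects.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Assumed variable names; the repository may expect different ones.
REQUIRED_VARS = ("OPENAI_API_KEY", "BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment. An empty list means every assumed variable is present.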
Customizing the CUA Agent
The CUA agent can be customized by updating the flags in the CLI:
- --input: The initial input to the agent (optional; the CLI will prompt you for input if not provided).
- --debug: Enable debug mode.
- --show: Show images (screenshots) during the execution.
- --start-url: Start the browsing session with a specific URL (only for browser environments). By default, the CLI will start the browsing session with https://bing.com.
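If you launch the agent from another Python script, the flags above can be assembled programmatically. The sketch below makes several assumptions. The entry point name `cli.py` is hypothetical, `build_agent_command` is an illustrative helper rather than part of the repository, and `--debug` and `--show` are treated as switches that take no value.

```python
import sys
from typing import List, Optional


def build_agent_command(
    entry_point: str = "cli.py",  # hypothetical script name; use the repository's actual entry point
    input_text: Optional[str] = None,
    debug: bool = False,
    show: bool = False,
    start_url: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list from the documented CLI flags."""
    command = [sys.executable, entry_point]
    if input_text is not None:
        command += ["--input", input_text]
    if debug:
        command.append("--debug")  # assumed to be a switch with no value
    if show:
        command.append("--show")  # assumed to be a switch with no value
    if start_url is not None:
        command += ["--start-url", start_url]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository directory. Building the command as a list rather than a single string avoids shell quoting problems when the input text contains spaces.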