Unlike screenshots and PDFs, which are saved locally, files downloaded during browser automation are stored in Browserbase’s cloud storage. These files must be retrieved using our API.

A typical use case for headless browsers is downloading files from web pages. Our browsers are configured to sync any file you download to our storage infrastructure. To avoid naming conflicts when you download multiple files, we append a Unix timestamp (in milliseconds) to the file name, just before the extension (e.g., sample.pdf will become sample-1719265797164.pdf).
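
If you need the original file name back after retrieving a file, the suffix can be removed with a small helper. The sketch below is illustrative only: `stripDownloadTimestamp` is a hypothetical name, not part of the SDK, and it assumes the suffix is always a hyphen followed by a 13-digit millisecond timestamp placed before the final extension, as in the example above.

```typescript
// Hypothetical helper (not part of the Browserbase SDK).
// Assumption: the synced name is "<base>-<13-digit ms timestamp><.ext>".
function stripDownloadTimestamp(fileName: string): string {
  // Remove "-<13 digits>" only when it sits right before the last
  // extension, or at the very end of a name that has no extension.
  return fileName.replace(/-\d{13}(?=\.[^.]+$|$)/, "");
}

console.log(stripDownloadTimestamp("sample-1719265797164.pdf")); // sample.pdf
console.log(stripDownloadTimestamp("report-2024.pdf")); // report-2024.pdf (unchanged)
```

Names that do not match the assumed pattern are returned unchanged, so the helper is safe to run over files that were never renamed.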

Triggering Downloads

First, trigger a download in your browser automation:

  1. Create a browser session and get the session ID

  2. Connect to the session using your preferred framework

  3. Configure your library’s downloads location

  4. Perform the download action in your automation script

import { chromium } from "playwright-core";
import { Browserbase } from "@browserbasehq/sdk";

(async () => {
  const bb = new Browserbase({ apiKey: process.env.BROWSERBASE_API_KEY! });
  const session = await bb.sessions.create({
    projectId: process.env.BROWSERBASE_PROJECT_ID!,
  });

  const browser = await chromium.connectOverCDP(session.connectUrl);
  const defaultContext = browser.contexts()[0];
  const page = defaultContext.pages()[0];

  // Required so that Playwright does not override the download location
  const client = await defaultContext.newCDPSession(page);
  await client.send("Browser.setDownloadBehavior", {
    behavior: "allow",
    downloadPath: "downloads",
    eventsEnabled: true,
  });

  await page.goto("https://browser-tests-alpha.vercel.app/api/download-test");

  const [download] = await Promise.all([
    page.waitForEvent("download"),
    page.locator("#download").click(),
  ]);

  const downloadError = await download.failure();
  if (downloadError !== null) {
    console.log("Error happened on download:", downloadError);
    throw new Error(downloadError);
  }

  // Store the session ID to retrieve downloads later
  console.log("Download completed. Session ID:", session.id);

  await page.close();
  await browser.close();
})().catch((error) => console.error(error.message));

Retrieving Downloaded Files

After triggering downloads in your browser session, you can retrieve them using the Session Downloads API. The files are returned as a ZIP archive.

Files are synced in real time, so large downloads may not be immediately available through the /downloads endpoint. The code below includes retry logic to handle this case.

import { writeFileSync } from "node:fs";
import { Browserbase } from "@browserbasehq/sdk";

async function saveDownloadsOnDisk(sessionId: string, retryForMs: number) {
  const bb = new Browserbase({ apiKey: process.env.BROWSERBASE_API_KEY! });

  return new Promise<void>((resolve, reject) => {
    // Poll every 2 seconds until the archive is available
    const poller = setInterval(fetchDownloads, 2000);

    // Stop polling and fail once the retry window (in milliseconds) has elapsed,
    // so the returned promise never hangs
    const timeout = setTimeout(() => {
      clearInterval(poller);
      reject(new Error(`No downloads available after ${retryForMs} ms`));
    }, retryForMs);

    async function fetchDownloads() {
      try {
        const response = await bb.sessions.downloads.list(sessionId);
        const downloadBuffer = await response.arrayBuffer();

        // A non-empty response means the archive is ready to be saved
        if (downloadBuffer.byteLength > 0) {
          writeFileSync("downloads.zip", Buffer.from(downloadBuffer));
          clearInterval(poller);
          clearTimeout(timeout);
          resolve();
        }
      } catch (e) {
        clearInterval(poller);
        clearTimeout(timeout);
        reject(e);
      }
    }
  });
}

(async () => {
  // Use the session ID from your browser automation to retrieve downloads
  const sessionId = "your-session-id";
  await saveDownloadsOnDisk(sessionId, 20000); // wait up to 20s
  console.log("Downloaded files are in downloads.zip");
})().catch(error => {
  console.error('Download failed:', error);
});
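
Because the response is a ZIP archive, one optional sanity check before writing it to disk is to inspect the leading signature bytes. The sketch below is not part of the SDK: `classifyZip` and `ZipKind` are hypothetical names, and the check relies only on the standard ZIP signatures ("PK\x03\x04" for an archive whose first record is a file entry, "PK\x05\x06" for an archive with no entries). It makes no assumption about what the endpoint returns before files have finished syncing.

```typescript
// Standard ZIP record signatures; each starts with the ASCII bytes "PK".
const ZIP_LOCAL_FILE_HEADER = [0x50, 0x4b, 0x03, 0x04]; // first record is a file entry
const ZIP_END_OF_CENTRAL_DIR = [0x50, 0x4b, 0x05, 0x06]; // archive with no entries

type ZipKind = "has-entries" | "empty" | "not-zip";

// True when `bytes` begins with every byte of `signature`, in order.
function startsWith(bytes: Uint8Array, signature: number[]): boolean {
  if (bytes.length < signature.length) return false;
  return signature.every((value, index) => bytes[index] === value);
}

// Hypothetical helper: classify a buffer by its first four bytes only.
// It does not validate the rest of the archive.
function classifyZip(data: ArrayBuffer | Uint8Array): ZipKind {
  const bytes = data instanceof Uint8Array ? data : new Uint8Array(data);
  if (startsWith(bytes, ZIP_LOCAL_FILE_HEADER)) return "has-entries";
  if (startsWith(bytes, ZIP_END_OF_CENTRAL_DIR)) return "empty";
  return "not-zip";
}
```

One way to use it would be to call `classifyZip(downloadBuffer)` inside the polling function and save the file only when the result is "has-entries". Whether that is stricter than necessary depends on the endpoint's behavior, so treat it as an optional safeguard.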

Session Downloads API

Learn more about the available params and response fields