BrowseGPT is a tool that lets you search the web using a chat interface. It’s built on top of the Vercel AI SDK and Browserbase.

What this tutorial covers

  • Access websites and extract their content using Browserbase
  • Use the Vercel AI SDK to create a chat interface
  • Stream the results from the LLM

Usage

To build BrowseGPT, you need a Browserbase account and the Vercel AI SDK. The following packages are recommended:
npm install ai @ai-sdk/openai @ai-sdk/anthropic @ai-sdk/react zod playwright @mozilla/readability jsdom

Getting started

For this tutorial, you’ll need:
  1. A Browserbase API key
  2. An LLM API key from a supported provider, such as OpenAI or Anthropic
Browserbase sessions often run longer than 15 seconds. By signing up for the Pro Plan on Vercel, you can increase the Vercel function duration limit.
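Once you have the keys, store them in a .env.local file at the project root so Next.js loads them automatically. BROWSERBASE_API_KEY matches what the route handler reads below; OPENAI_API_KEY and ANTHROPIC_API_KEY are the default environment variables the respective AI SDK providers look for:

```bash
# .env.local — loaded automatically by Next.js
BROWSERBASE_API_KEY=your_browserbase_api_key
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
```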

Imports and dependencies

Next.js uses Route Handlers to handle API requests. These support HTTP methods such as GET, POST, PUT, and DELETE. To create a new route handler, create a new file in the app/api directory; in this example, the chat route lives at app/api/chat/route.ts. From there, import the necessary dependencies.
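As a quick refresher before diving into the chat route, a minimal Route Handler (unrelated to BrowseGPT, with a hypothetical path) looks like this:

```typescript
// app/api/hello/route.ts — a minimal illustrative Route Handler
export async function GET() {
  // Response.json is available in route handlers (and Node 18+)
  return Response.json({ message: "hello" });
}
```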
route.ts
import { openai } from "@ai-sdk/openai";
import { streamText, convertToModelMessages, tool, generateText } from "ai";
import { z } from "zod";
import { chromium } from "playwright";
import { anthropic } from "@ai-sdk/anthropic";
import { Readability } from "@mozilla/readability";
import { JSDOM } from "jsdom";
This section imports necessary libraries and modules for the application. It includes the Vercel AI SDK, Zod for schema validation, Playwright for web automation, and libraries for content extraction and processing.

Helper functions

These are utility functions used throughout the application. getDebugUrl fetches debug information for a Browserbase session, while createSession initializes a new Browserbase session for web interactions.
// Get the debug URL for a Browserbase session
async function getDebugUrl(id: string) {
  const response = await fetch(
    `https://api.browserbase.com/v1/sessions/${id}/debug`,
    {
      method: "GET",
      headers: {
        "x-bb-api-key": process.env.BROWSERBASE_API_KEY,
        "Content-Type": "application/json",
      },
    },
  );
  const data = await response.json();
  return data;
}

// Create a new Browserbase session
async function createSession() {
  const response = await fetch(`https://api.browserbase.com/v1/sessions`, {
    method: "POST",
    headers: {
      "x-bb-api-key": process.env.BROWSERBASE_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      keepAlive: true,
    }),
  });
  const data = await response.json();
  return { id: data.id, debugUrl: data.debugUrl };
}

Main API route handler

This section sets up the main API route handler. It exports a maximum duration for the route and defines the POST method that handles incoming requests. The Vercel AI SDK’s streamText function processes messages and streams responses. The maximum duration is set to 300 seconds (5 minutes), since Browserbase sessions often run longer than 15 seconds (Vercel’s default timeout).
route.ts
// Set the maximum duration to 300 seconds (5 minutes)
export const maxDuration = 300;

// POST method to handle incoming requests
export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4.1"),
    messages: await convertToModelMessages(messages),
    tools: {
      // ... (tool definitions)
    },
  });

  return result.toUIMessageStreamResponse();
}

Tools

Next, create the tools needed for this route handler. These tools are used depending on the user’s request. For example, if the user wants to search the web, the googleSearch tool handles it. If they want to get the content of a page, the getPageContent tool is used. Keep in mind that you have the option to choose any LLM model that is compatible with the Vercel AI SDK. In testing, gpt-4.1 worked best for tool calling, and claude-sonnet-4-6 worked best for generating responses.

Create Browserbase session tool

This tool creates a new Browserbase session. It’s used when a fresh browsing context is needed for web interactions. The tool returns the session ID and debug URL, which are used in subsequent operations.
createSession: tool({
  description: 'Create a new Browserbase session',
  inputSchema: z.object({}),
  execute: async () => {
    const session = await createSession();
    const debugUrl = await getDebugUrl(session.id);
    return {
      sessionId: session.id,
      debugUrl: debugUrl.debuggerFullscreenUrl,
      toolName: 'Creating a new session',
    };
  },
}),
The createSession() and getDebugUrl() functions from earlier create a new Browserbase session and get the debug URL. This lets you embed the debug URL in the response so the frontend can display the Browserbase session.

Google Search tool

This tool performs a web search using Browserbase. It takes a search query as input and returns the search results.
googleSearch: tool({
  description: 'Search Google for a query',
  inputSchema: z.object({
    query: z.string().describe('The search query'),
    sessionId: z.string().describe('The Browserbase session ID to use'),
  }),
  execute: async ({ query, sessionId }) => {
    // ... (debug URL and browser connection setup)

    const defaultContext = browser.contexts()[0];
    const page = defaultContext.pages()[0];

    // Navigating directly to the search URL runs the query; just wait for results
    await page.goto(`https://www.google.com/search?q=${encodeURIComponent(query)}`);
    await page.waitForLoadState('load', { timeout: 10000 });

    await page.waitForSelector('.g');

    const results = await page.evaluate(() => {
      const items = document.querySelectorAll('.g');
      return Array.from(items).map(item => {
        const title = item.querySelector('h3')?.textContent || '';
        const description = item.querySelector('.VwiC3b')?.textContent || '';
        return { title, description };
      });
    });

    const text = results.map(item => `${item.title}\n${item.description}`).join('\n\n');

    const response = await generateText({
      model: anthropic('claude-sonnet-4-6'),
      prompt: `Evaluate the following web page content: ${text}`,
    });

    return {
      toolName: 'Searching Google',
      content: response.text,
      dataCollected: true,
    };
  },
}),
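The URL-building and result-formatting steps in this tool are pure functions, so they are easy to test in isolation. A sketch (the helper names are ours, not part of the tutorial code):

```typescript
type SearchResult = { title: string; description: string };

// Build the Google search URL the tool navigates to
export function buildSearchUrl(query: string): string {
  return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
}

// Join scraped results into the plain-text blob passed to the LLM
export function formatResults(results: SearchResult[]): string {
  return results.map((r) => `${r.title}\n${r.description}`).join("\n\n");
}
```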

Ask for confirmation tool

This tool asks the user for confirmation before performing a specific action. It takes a confirmation message as input. Because it defines no execute function, the tool call is forwarded to the client, where the UI can render the prompt and report the user’s answer back as the tool result.
askForConfirmation: tool({
  description: 'Ask the user for confirmation.',
  inputSchema: z.object({
    message: z.string().describe('The message to ask for confirmation.'),
  }),
}),

Get page content tool

The last tool is getPageContent. This tool retrieves the content of a web page using Playwright. It then uses jsdom to parse the HTML content into a DOM structure and Readability to extract the main content of the page. Finally, it uses the Anthropic Claude model to generate a summary of the page’s content.
getPageContent: tool({
  description: 'Get the content of a page using Playwright',
  inputSchema: z.object({
    url: z.string().describe('The URL of the page to fetch content from'),
    sessionId: z.string().describe('The Browserbase session ID to use'),
  }),
  execute: async ({ url, sessionId }) => {
    // Get debug URL and connect to Browserbase session
    const debugUrl = await getDebugUrl(sessionId);
    const browser = await chromium.connectOverCDP(debugUrl.debuggerFullscreenUrl);

    // Get the default context and page
    const defaultContext = browser.contexts()[0];
    const page = defaultContext.pages()[0];

    // Navigate to the specified URL
    await page.goto(url, { waitUntil: 'networkidle' });

    // Get the page content
    const content = await page.content();

    // Use Readability to extract the main content
    const dom = new JSDOM(content);
    const reader = new Readability(dom.window.document);
    const article = reader.parse();

    let extractedContent = '';
    if (article) {
      // If Readability successfully parsed the content, use it
      extractedContent = article.textContent;
    } else {
      // Fallback: extract all text from the body
      extractedContent = await page.evaluate(() => document.body.innerText);
    }

    // Generate a summary using the Anthropic Claude model
    const response = await generateText({
      model: anthropic('claude-sonnet-4-6'),
      prompt: `Summarize the following web page content: ${extractedContent}`,
    });

    // Return the structured response
    return {
      toolName: 'Getting page content',
      content: response.text,
      dataCollected: true,
    };
  },
}),
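The article-or-fallback decision inside getPageContent can be expressed as a small pure function, and long pages may exceed the model’s context window, so capping the extracted text before summarization is worth considering. A sketch with hypothetical helper names (the character limit is an arbitrary choice, not from the tutorial):

```typescript
// Prefer Readability's extraction; fall back to raw body text when parsing fails
export function pickContent(
  article: { textContent: string } | null,
  fallbackText: string,
): string {
  return article ? article.textContent : fallbackText;
}

// Cap very long pages before sending them to the LLM (limit chosen arbitrarily)
export function truncateForPrompt(text: string, maxChars = 20000): string {
  return text.length > maxChars ? text.slice(0, maxChars) : text;
}
```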

Frontend

Now that the tools and route handler are set up, you can create the frontend. Use the useChat hook to create a chat interface. Here’s a simple example of how to use BrowseGPT in a Next.js frontend application:
'use client';

import { useChat } from '@ai-sdk/react';
import { DefaultChatTransport } from 'ai';
import { useState, useEffect } from 'react';

export default function Chat() {
  const [input, setInput] = useState('');
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({ api: '/api/chat' }),
  });

  const isLoading = status === 'streaming' || status === 'submitted';

  const [showAlert, setShowAlert] = useState(false);
  const [statusMessage, setStatusMessage] = useState('');
  const [sessionId, setSessionId] = useState<string | null>(null);

  useEffect(() => {
    if (isLoading) {
      setShowAlert(true);
      setStatusMessage('The AI is currently processing your request. Please wait.');
      setSessionId(null);
    } else {
      setShowAlert(false);
    }
  }, [isLoading, messages]);

  useEffect(() => {
    const lastMessage = messages[messages.length - 1];
    if (lastMessage?.parts) {
      for (const part of lastMessage.parts) {
        // AI SDK v5 types tool parts as `tool-<name>` and exposes results on `output`
        const toolPart = part as { type: string; state?: string; output?: { sessionId?: string } };
        if (toolPart.type === 'tool-createSession' && toolPart.state === 'output-available' && toolPart.output?.sessionId) {
          setSessionId(toolPart.output.sessionId);
          break;
        }
      }
    }
  }, [messages]);

  const handleSubmit = (e: React.FormEvent<HTMLFormElement>) => {
    e.preventDefault();
    if (!input.trim()) return;
    void sendMessage({ text: input });
    setInput('');
  };

  return (
    <div className="flex flex-col min-h-screen">
      <div className="flex-grow flex flex-col w-full max-w-xl mx-auto py-4 px-4">
        {messages.map((m) => (
          <div key={m.id} className="whitespace-pre-wrap">
            <strong>{m.role === 'user' ? 'User: ' : 'AI: '}</strong>
            {m.parts?.map((part, i) =>
              part.type === 'text' ? <p key={i}>{part.text}</p> : null
            )}
          </div>
        ))}

        {showAlert && (
          <div className="my-4">
            <p>{statusMessage}</p>
          </div>
        )}
      </div>

      <div className="w-full max-w-xl mx-auto px-4 py-4">
        <form onSubmit={handleSubmit} className="flex">
          <input
            className="flex-grow p-2 border border-gray-300"
            value={input}
            placeholder="Ask anything..."
            onChange={(e) => setInput(e.target.value)}
          />
          <button type="submit" disabled={!input.trim()}>
            Send
          </button>
        </form>
      </div>
    </div>
  );
}
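The session-ID scan in the second useEffect can also be factored into a pure helper for easier testing. A sketch using a simplified part shape (the field names here are illustrative; the real UIMessage part types come from the AI SDK):

```typescript
type ToolPart = { type: string; output?: { sessionId?: string } };

// Return the first sessionId found among a message's tool parts, if any
export function findSessionId(parts: ToolPart[]): string | null {
  for (const part of parts) {
    if (part.type.startsWith("tool-") && part.output?.sessionId) {
      return part.output.sessionId;
    }
  }
  return null;
}
```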

Conclusion

You’ve now seen how to use the Vercel AI SDK to create a chat interface that searches the web using Browserbase. You can view a demo of this tutorial here. The code for this tutorial is open-sourced here.
