What You'll Build
An AI-powered data extraction agent that pulls product data from e-commerce sites and stores it in MongoDB with automatic schema validation and data analysis.
Before you start
Ensure you have these requirements ready:
Node.js 16+
Required runtime environment
MongoDB
Local install or MongoDB Atlas
API Access
Browserbase API key
Step 1: Project setup
Clone and install
What gets installed?
Core packages:
- @browserbasehq/stagehand: AI-powered data extraction
- mongodb: MongoDB driver for data storage
- zod: Schema validation for type safety
- chalk & boxen: Terminal styling and output formatting
- playwright: Browser automation engine
Step 2: Start MongoDB
If you are using a local MongoDB install, make sure it is running before you continue.
MongoDB Atlas users can skip this step, as the database is already hosted in the cloud.
Step 3: Configuration
Environment variables
Create your .env file with the required configuration.
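One way to catch a missing or misspelled variable early is to validate the environment at startup. The sketch below is an illustration, not the project's own loader. The variable names BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID, MONGO_URI and DB_NAME, the default database name, and the loadConfig helper are all assumptions; match them to whatever your .env file actually defines.

```typescript
// Hypothetical config shape; field and variable names are assumptions.
interface AppConfig {
  browserbaseApiKey: string;
  browserbaseProjectId: string;
  mongoUri: string;
  dbName: string;
}

type Env = Record<string, string | undefined>;

// Reads settings from an env-like object and fails fast on missing values.
function loadConfig(env: Env): AppConfig {
  const required = ["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    browserbaseApiKey: env["BROWSERBASE_API_KEY"] as string,
    browserbaseProjectId: env["BROWSERBASE_PROJECT_ID"] as string,
    // Fall back to a default local MongoDB address when no URI is set.
    mongoUri: env["MONGO_URI"] ?? "mongodb://localhost:27017",
    dbName: env["DB_NAME"] ?? "scraper_db",
  };
}
```

In a Node.js project this would typically be called as loadConfig(process.env) once the .env file has been loaded.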
Step 4: Configure Stagehand
The integration uses Browserbase cloud browsers, configured in stagehand.config.ts.
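As a rough idea of what such a configuration might contain, the sketch below builds a plain options object. The StagehandConfigSketch type is a local stand-in, and its field names (env, apiKey, projectId, verbose) reflect what Stagehand's constructor is believed to accept; treat them as assumptions and check them against the Stagehand version you have installed.

```typescript
// Local stand-in type, not imported from Stagehand. Field names are
// assumptions and should be verified against the installed package.
interface StagehandConfigSketch {
  env: "BROWSERBASE" | "LOCAL";
  apiKey: string | undefined;
  projectId: string | undefined;
  verbose: 0 | 1 | 2;
}

// Builds a config that targets Browserbase cloud browsers, reading
// credentials from an env-like object (variable names are assumptions).
function buildStagehandConfig(
  env: Record<string, string | undefined>
): StagehandConfigSketch {
  return {
    env: "BROWSERBASE",
    apiKey: env["BROWSERBASE_API_KEY"],
    projectId: env["BROWSERBASE_PROJECT_ID"],
    verbose: 1,
  };
}
```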
Step 5: Run your first extraction
What happens when you run the agent:
- Connects to MongoDB and creates the necessary collections
- Navigates to the Amazon laptop category
- Extracts product listings using AI-powered extraction
- Extracts detailed information for the first 3 products
- Stores all data in MongoDB with schema validation
- Runs analysis queries and displays results
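To make the final analysis step more concrete, here is a sketch of one query it could run: average price per brand. The Product fields, the pipeline, and the avgPriceByBrand helper are assumptions for illustration, not the agent's actual queries. The pipeline is written as plain data in MongoDB aggregation syntax, and the function computes the same result in memory so the logic can be checked without a database.

```typescript
// Hypothetical product shape; the real collection may differ.
interface Product {
  name: string;
  brand: string;
  price: number;
}

// Aggregation pipeline as plain data: skip unpriced items, group by brand,
// then sort brands by average price, highest first.
const avgPriceByBrandPipeline = [
  { $match: { price: { $gt: 0 } } },
  { $group: { _id: "$brand", avgPrice: { $avg: "$price" }, count: { $sum: 1 } } },
  { $sort: { avgPrice: -1 } },
];

interface BrandStats {
  _id: string;
  avgPrice: number;
  count: number;
}

// In-memory equivalent of the pipeline above.
function avgPriceByBrand(products: Product[]): BrandStats[] {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const p of products) {
    if (!(p.price > 0)) continue; // mirrors the $match stage
    const t = totals.get(p.brand) ?? { sum: 0, count: 0 };
    t.sum += p.price;
    t.count += 1;
    totals.set(p.brand, t);
  }
  return Array.from(totals.entries())
    .map(([brand, t]) => ({ _id: brand, avgPrice: t.sum / t.count, count: t.count }))
    .sort((a, b) => b.avgPrice - a.avgPrice);
}
```

With the MongoDB driver, a pipeline like this would be passed to the collection's aggregate method.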
Execute the extraction
Customization options
Extend data schema
Add custom fields to capture more product information.
Custom extraction instructions
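The two customizations can be paired: a new schema field, plus an instruction that asks the extraction to fill it. The sketch below uses plain TypeScript checks to stay dependency-free; in the project itself the same idea would more likely be expressed with zod, for example by extending an object schema. The base fields, the custom fields warranty and inStock, and the helper names are hypothetical.

```typescript
// Base record; these fields are assumptions about the stored product shape.
interface BaseProduct {
  name: string;
  price: number;
  url: string;
}

// Hypothetical custom fields added on top of the base record.
interface ExtendedProduct extends BaseProduct {
  warranty?: string;
  inStock: boolean;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Validates untrusted extraction output before it is stored.
function parseExtendedProduct(raw: unknown): ExtendedProduct {
  if (!isRecord(raw)) throw new Error("product must be an object");
  const { name, price, url, warranty, inStock } = raw;
  if (typeof name !== "string" || name.length === 0) {
    throw new Error("name must be a non-empty string");
  }
  if (typeof price !== "number" || !Number.isFinite(price) || price < 0) {
    throw new Error("price must be a non-negative number");
  }
  if (typeof url !== "string") throw new Error("url must be a string");
  if (warranty !== undefined && typeof warranty !== "string") {
    throw new Error("warranty must be a string when present");
  }
  if (typeof inStock !== "boolean") throw new Error("inStock must be a boolean");
  return warranty === undefined
    ? { name, price, url, inStock }
    : { name, price, url, inStock, warranty };
}

// Builds an extraction instruction that names the extra fields to capture.
function buildInstruction(extraFields: string[]): string {
  const base = "Extract the product name, price, and URL";
  return extraFields.length === 0
    ? `${base}.`
    : `${base}, plus: ${extraFields.join(", ")}.`;
}
```

Keeping the field list and the instruction text side by side makes it harder for the schema and the prompt to drift apart.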
Modify the AI extraction to capture specific data.
What’s next?
Now that you have a working MongoDB + Stagehand integration:
Scale your extraction
Learn how to scale your data extraction across multiple sites and handle larger datasets.
Deploy to production
Deploy your extraction pipeline to production with Browserbase.
Need help? Join the Stagehand Slack community for support and to share your projects!