LangChain
Configure Browserbase for LangChain
1. Get your API Key
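Once you have a key, one common way to keep it out of source code is to read it from an environment variable. The sketch below is illustrative only: the variable name `BROWSERBASE_API_KEY` and the helper `get_browserbase_api_key` are assumptions made for this example, not names defined by this guide.

```python
import os


def get_browserbase_api_key(env_var: str = "BROWSERBASE_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Browserbase API key"
        )
    return key
```

The returned value can then be passed wherever this guide uses a hard-coded key, for example as the `api_token` argument of the loader.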
2. Install the Browserbase SDK
pip install browserbase
3. Load documents or images
Load documents
from langchain_community.document_loaders import BrowserbaseLoader

BROWSERBASE_API_TOKEN = "<Your Browserbase API Key goes here>"

loader = BrowserbaseLoader(
    api_token=BROWSERBASE_API_TOKEN,
    urls=[
        # load multiple pages
        "https://www.espn.com",
        "https://lilianweng.github.io/posts/2023-06-23-agent/",
    ],
    text_content=True,
)

documents = loader.load()
With the default text_content=False, each page is returned as a LangChain Document containing its HTML.
Setting text_content=True returns LangChain Document objects containing text only.
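To check what the loader returned, it can help to print a one-line summary per document. The following is a minimal sketch, assuming each item exposes the standard LangChain Document attributes `page_content` and `metadata`. The helper name `summarize_documents` is hypothetical, and the `"url"` metadata key is an assumption, so the code falls back to a placeholder when it is absent.

```python
from typing import Any, Iterable, List


def summarize_documents(documents: Iterable[Any], preview_chars: int = 80) -> List[str]:
    """Build one summary line per document.

    Relies only on the `page_content` and `metadata` attributes of
    LangChain Document objects. The "url" metadata key is an assumption.
    """
    lines = []
    for doc in documents:
        source = doc.metadata.get("url", "<unknown source>")
        # Collapse runs of whitespace so the preview fits on one line.
        preview = " ".join(doc.page_content.split())[:preview_chars]
        lines.append(f"{source} ({len(doc.page_content)} chars): {preview}")
    return lines
```

Calling `summarize_documents(documents)` on the loader output and printing each line makes it easy to compare the size of HTML results against text-only results.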
Load images
from browserbase import Browserbase
from browserbase.helpers.gpt4 import GPT4VImage, GPT4VImageDetail
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-4-vision-preview", max_tokens=256)

browser = Browserbase()
screenshot = browser.screenshot("https://browserbase.com")

result = chat.invoke(
    [
        HumanMessage(
            content=[
                {"type": "text", "text": "What color is the logo?"},
                GPT4VImage(screenshot, GPT4VImageDetail.auto),
            ]
        )
    ]
)

print(result.content)
By default, the screenshot() method takes a screenshot of the visible viewport.
To take a full-page screenshot, pass the full_page=True option.
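As a rough illustration of the full_page=True option, the sketch below captures a full page and writes the result to disk. The helper `save_full_page_screenshot` is hypothetical. It takes the screenshot function as an argument so it can be tried without network access, and it assumes that function returns raw image bytes, which this guide does not state explicitly.

```python
from pathlib import Path
from typing import Callable


def save_full_page_screenshot(
    take_screenshot: Callable[..., bytes], url: str, out_path: str
) -> int:
    """Capture `url` with full_page=True and write the bytes to `out_path`.

    `take_screenshot` is expected to behave like `Browserbase().screenshot`;
    that it returns raw image bytes is an assumption of this sketch.
    Returns the number of bytes written.
    """
    data = take_screenshot(url, full_page=True)
    return Path(out_path).write_bytes(data)
```

With the SDK installed, this might be called as `save_full_page_screenshot(Browserbase().screenshot, "https://browserbase.com", "page.png")`. The `.png` extension is a guess at the image format.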
The reference for the browserbase package is available on GitHub.