
How to connect Browse AI to any LLM (OpenAI, Gemini, Mistral, Llama)

Written by Melissa Shires

Browse AI works with any LLM that has an API. Whether you prefer OpenAI's GPT models, Google's Gemini, Mistral, Llama (via together.ai or Groq), or any other provider, the integration pattern is the same: Browse AI extracts the data, your pipeline sends it to the LLM for processing, and the result goes wherever you need it.

📖 For in-depth guides with full code examples covering enrichment, analysis, and automation patterns, see our dedicated Claude integration series. This article shows how to adapt those patterns to other LLM providers.

The universal pattern

Regardless of which LLM you use, the integration follows the same three steps:

  1. Browse AI extracts data from the web (via a task or monitor).

  2. Your pipeline sends the data to an LLM with a prompt describing what to do with it.

  3. The LLM's response is routed to your destination (spreadsheet, CRM, Slack, email, database).
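The three steps above can be sketched as one function that takes each stage as a callable. This is a minimal illustration, not part of Browse AI's API: `run_pipeline`, `extract`, `process`, and `deliver` are hypothetical names, and the stand-in lambdas only simulate each stage.

```python
from typing import Any, Callable, Dict


def run_pipeline(
    extract: Callable[[], Dict[str, Any]],
    process: Callable[[Dict[str, Any]], Dict[str, Any]],
    deliver: Callable[[Dict[str, Any]], None],
) -> Dict[str, Any]:
    """Run the three stages in order and return the LLM result."""
    data = extract()        # Step 1: data captured by Browse AI
    result = process(data)  # Step 2: LLM call with a prompt
    deliver(result)         # Step 3: route the result to a destination
    return result


# Stand-ins for illustration only; replace them with real implementations.
delivered = []
output = run_pipeline(
    extract=lambda: {"title": "Example product", "price": "$19"},
    process=lambda data: {"summary": f"{data['title']} costs {data['price']}"},
    deliver=delivered.append,
)
```

Keeping the stages separate like this makes it straightforward to swap the LLM provider or the destination without touching the other two steps.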

You can trigger this pipeline three ways:

| Trigger method | How it works | Best for |
|---|---|---|
| Zapier / Make | No-code automation platform connects Browse AI to your LLM | Non-technical users, quick setups |
| Webhooks | Browse AI sends data to your server in real time when a task completes | Real-time processing, custom logic |
| API polling | Your script fetches completed tasks from Browse AI on a schedule | Batch processing, scheduled reports |

Zapier / Make setup by provider

Most major LLM providers have native Zapier and Make integrations; where one is missing, the platform's HTTP module can call the provider's API directly. The setup is nearly identical across providers:

  1. Create a workflow with Browse AI as the trigger (New Successful Task Run).

  2. Add your LLM provider as an action step.

  3. Map Browse AI data fields into your prompt.

  4. Add a destination step for the output.

| Provider | Zapier app name | Make module |
|---|---|---|
| OpenAI | ChatGPT / OpenAI | OpenAI (ChatGPT, DALL-E, Whisper) |
| Google Gemini | Google Gemini | Google AI (Gemini) |
| Anthropic Claude | Claude (Anthropic) | Anthropic (Claude) |
| Mistral | Mistral AI | HTTP module (custom API call) |

Webhook + LLM API code examples

Below is the same webhook endpoint pattern adapted for each provider. Each example receives Browse AI data and sends it to the LLM for processing.

Base webhook structure (shared across all providers)

from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route("/webhook/browse-ai", methods=["POST"])
def handle_webhook():
    # silent=True returns None instead of raising on a missing or invalid JSON body
    event = request.get_json(silent=True) or {}

    # Only act on successful task runs; acknowledge everything else
    if event.get("event") != "taskFinishedSuccessfully":
        return jsonify({"status": "ignored"}), 200

    task = event.get("task", {})
    captured_data = task.get("capturedTexts", {})

    # Process with your chosen LLM (see provider examples below)
    result = process_with_llm(captured_data)

    # Do something with the result (save_result is a function you supply)
    save_result(task.get("id"), result)

    return jsonify({"status": "processed"}), 200


if __name__ == "__main__":
    app.run(port=5000)
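
The webhook handler calls `save_result` without defining it, since the destination is up to you. As one possible sketch, the version below appends each result to a JSON Lines file. The file name `llm_results.jsonl`, the `path` parameter, and the record layout are assumptions for illustration; a database, spreadsheet, or CRM call would fit the same signature.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

RESULTS_PATH = Path("llm_results.jsonl")  # hypothetical output location


def save_result(task_id, result, path: Path = RESULTS_PATH) -> None:
    """Append one processed task to a JSON Lines file (one JSON object per line)."""
    record = {
        "task_id": task_id,
        "processed_at": datetime.now(timezone.utc).isoformat(),
        "result": result,
    }
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Appending one line per task keeps writes cheap and makes the file easy to load later for batch review.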

⚠️ Browse AI does not support webhook signature verification. To verify that webhook requests are from Browse AI, allowlist IP address 3.228.254.190. See our webhook IP allowlisting guide.
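
If you would rather enforce the allowlist in application code than at a firewall, a check along these lines is one option. This is a sketch under assumptions: `is_from_browse_ai` is a hypothetical helper, and the `forwarded_for` argument should only be used when a single reverse proxy that you control sets the `X-Forwarded-For` header, because clients can forge that header.

```python
import ipaddress
from typing import Optional

# IP address taken from the note above.
BROWSE_AI_IPS = {ipaddress.ip_address("3.228.254.190")}


def is_from_browse_ai(remote_addr: Optional[str], forwarded_for: Optional[str] = None) -> bool:
    """Return True if the request's client IP is on the allowlist.

    remote_addr is the socket peer address (request.remote_addr in Flask).
    forwarded_for is the raw X-Forwarded-For header value, if a trusted proxy sets it.
    """
    candidate = remote_addr
    if forwarded_for:
        # With one trusted proxy, the right-most entry is the address it saw.
        candidate = forwarded_for.split(",")[-1].strip()
    if not candidate:
        return False
    try:
        return ipaddress.ip_address(candidate) in BROWSE_AI_IPS
    except ValueError:
        # Not a valid IP address string
        return False
```

In the Flask handler, you could call `is_from_browse_ai(request.remote_addr)` before reading the body and return a 403 response when it is false.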

OpenAI (GPT-4o, GPT-4o mini)

# pip install openai
from openai import OpenAI
import json

client = OpenAI(api_key="your-openai-api-key")


def process_with_llm(data):
    response = client.chat.completions.create(
        model="gpt-4o",  # or "gpt-4o-mini" for faster/cheaper
        messages=[
            {"role": "system", "content": "You are a data analysis assistant. Respond with valid JSON only."},
            {"role": "user", "content": f"""Analyze this scraped data and return:
{{"summary": "one sentence", "category": "category", "sentiment": "positive|neutral|negative"}}

Data: {json.dumps(data)}"""}
        ],
        response_format={"type": "json_object"}  # Enforces JSON output
    )
    return json.loads(response.choices[0].message.content)

Key differences from Claude:

  • Uses response_format={"type": "json_object"} for structured output (must also mention JSON in the prompt)

  • Response is at response.choices[0].message.content

  • Get your API key from platform.openai.com

Google Gemini

# pip install google-generativeai
import google.generativeai as genai
import json

genai.configure(api_key="your-gemini-api-key")
model = genai.GenerativeModel("gemini-2.0-flash")


def process_with_llm(data):
    response = model.generate_content(
        f"""Analyze this scraped data and return a JSON object with:
- "summary": one sentence summary
- "category": primary category
- "sentiment": positive, neutral, or negative

Respond with valid JSON only, no other text.

Data: {json.dumps(data)}""",
        # Request a JSON response so json.loads does not fail on Markdown-wrapped output
        generation_config={"response_mime_type": "application/json"},
    )
    return json.loads(response.text)

Key differences:

  • Uses a model object rather than a client with messages

  • Simpler API surface (single string input for basic use)

  • Response is at response.text

  • Get your API key from aistudio.google.com
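
Whichever provider you use, a model can occasionally wrap its JSON in Markdown code fences or leave out a field, so it may be worth validating the response before passing it downstream. The helper below is a sketch, not part of any provider's SDK: `parse_llm_json` is a hypothetical name, and the required keys mirror the output shape used in the prompts in this article.

```python
import json
import re

REQUIRED_KEYS = {"summary", "category", "sentiment"}
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}

# Matches an opening fence (optionally tagged "json") or a closing fence.
_FENCE = re.compile(r"^\s*`{3}(?:json)?\s*|\s*`{3}\s*$", re.IGNORECASE)


def parse_llm_json(text: str) -> dict:
    """Parse an LLM response into a dict, raising ValueError if it is unusable."""
    cleaned = _FENCE.sub("", text.strip())
    parsed = json.loads(cleaned)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - parsed.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if parsed["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {parsed['sentiment']!r}")
    return parsed
```

To use it, replace the final `json.loads(...)` call in a provider example with `parse_llm_json(...)` and decide how your pipeline should handle the `ValueError` (retry, skip, or log).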

Mistral

# pip install mistralai
from mistralai import Mistral
import json

client = Mistral(api_key="your-mistral-api-key")


def process_with_llm(data):
    response = client.chat.complete(
        model="mistral-large-latest",  # or "mistral-small-latest" for faster/cheaper
        messages=[
            {"role": "system", "content": "You are a data analysis assistant. Respond with valid JSON only."},
            {"role": "user", "content": f"""Analyze this scraped data and return:
{{"summary": "one sentence", "category": "category", "sentiment": "positive|neutral|negative"}}

Data: {json.dumps(data)}"""}
        ],
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)

Key differences:

  • Very similar to OpenAI's API structure (messages array, response_format)

  • Uses client.chat.complete instead of client.chat.completions.create

  • Get your API key from console.mistral.ai

Llama (via Groq)

# pip install groq
from groq import Groq
import json

client = Groq(api_key="your-groq-api-key")


def process_with_llm(data):
    response = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system", "content": "You are a data analysis assistant. Respond with valid JSON only."},
            {"role": "user", "content": f"""Analyze this scraped data and return:
{{"summary": "one sentence", "category": "category", "sentiment": "positive|neutral|negative"}}

Data: {json.dumps(data)}"""}
        ],
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)

Key differences:

  • OpenAI-compatible API (same structure as OpenAI examples)

  • Extremely fast inference (Groq's LPU hardware)

  • Open-source models with no data retention

  • Get your API key from console.groq.com
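
Because the OpenAI, Mistral, and Groq examples all send the same `messages` array, you could factor the prompt into one shared helper and pass its output to whichever client you use. This is a sketch: `build_messages`, `SYSTEM_PROMPT`, and `OUTPUT_SHAPE` are hypothetical names, and the prompt text is copied from the examples above.

```python
import json
from typing import Any, Dict, List

SYSTEM_PROMPT = "You are a data analysis assistant. Respond with valid JSON only."
OUTPUT_SHAPE = (
    '{"summary": "one sentence", "category": "category", '
    '"sentiment": "positive|neutral|negative"}'
)


def build_messages(data: Dict[str, Any]) -> List[Dict[str, str]]:
    """Build the system and user messages shared by the chat-style providers."""
    user_prompt = (
        "Analyze this scraped data and return:\n"
        f"{OUTPUT_SHAPE}\n\n"
        f"Data: {json.dumps(data)}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]
```

With this in place, each provider's `process_with_llm` only differs in the client, the model name, and the method it calls, with `messages=build_messages(data)` in all three.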

Choosing a provider

| Provider | Strengths | Best for |
|---|---|---|
| OpenAI (GPT-4o) | Largest ecosystem, strong tool use, image understanding | General-purpose processing, multimodal data |
| Anthropic (Claude) | 200K context window, strong reasoning, reliable structured output | Large documents, complex analysis, detailed reports |
| Google (Gemini) | 1M+ context, multimodal, tight Google ecosystem integration | Very large datasets, Google Workspace users |
| Mistral | Strong multilingual support, EU data residency options | Multi-language data, European compliance requirements |
| Llama (via Groq) | Fastest inference, open-source, no data retention | High-volume processing, data privacy sensitive workflows |

✅ Tip: You can mix providers in the same pipeline. For example, use Llama via Groq for high-volume triage (fast and cheap), then send flagged items to Claude or GPT-4o for deeper analysis.
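
A two-stage setup like this can be sketched as a small router that takes both models as callables. The names `triage_then_analyze` and `is_negative` are hypothetical, and the flag rule (escalate negative sentiment) is only one example of what "flagged" might mean in your workflow.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]


def triage_then_analyze(
    data: Record,
    fast_llm: Callable[[Record], Record],
    deep_llm: Callable[[Record], Record],
    needs_deep_pass: Callable[[Record], bool],
) -> Record:
    """Run the cheap model first and escalate only the items it flags."""
    triage = fast_llm(data)
    if not needs_deep_pass(triage):
        return {"stage": "triage", "result": triage}
    return {"stage": "deep", "triage": triage, "result": deep_llm(data)}


def is_negative(triage: Record) -> bool:
    """Example flag rule: escalate anything the first pass marks negative."""
    return triage.get("sentiment") == "negative"
```

Here `fast_llm` could be the Groq `process_with_llm` function and `deep_llm` the OpenAI or Claude one, so the more expensive model only sees the items that need it.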

API polling: batch processing with any LLM

The polling pattern works identically across providers. Swap in the process_with_llm function from any example above:

import json
import time

import requests

BROWSE_AI_API_KEY = "your-browse-ai-api-key"
ROBOT_ID = "your-robot-id"


def get_recent_tasks(robot_id):
    response = requests.get(
        f"https://api.browse.ai/v2/robots/{robot_id}/tasks",
        headers={"Authorization": f"Bearer {BROWSE_AI_API_KEY}"},
        params={"page": 1},
        timeout=30,
    )
    response.raise_for_status()  # Fail loudly on auth or server errors
    return response.json().get("result", {}).get("robotTasks", {}).get("items", [])


def batch_process():
    tasks = get_recent_tasks(ROBOT_ID)
    results = []

    for task in tasks:
        # Skip tasks that failed or are still running
        if task.get("status") != "successful":
            continue

        captured = task.get("capturedTexts", {})
        if not captured:
            continue

        # Use any provider's process_with_llm function here
        result = process_with_llm(captured)
        results.append({"task_id": task["id"], "result": result})

        time.sleep(1)  # Respect rate limits

    return results


results = batch_process()
for r in results:
    print(f"Task {r['task_id']}: {r['result']}")
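
When the script runs on a schedule, each poll can return tasks that an earlier run already handled, which would mean paying for the same LLM call twice. One way to avoid that is to remember processed task IDs between runs. The sketch below stores them in a local JSON file; `load_seen`, `save_seen`, `select_new_tasks`, and the file-based storage are all assumptions for illustration, and a database table would serve the same purpose.

```python
import json
from pathlib import Path
from typing import Iterable, List, Set


def load_seen(path: Path) -> Set[str]:
    """Load the set of already-processed task IDs (empty on the first run)."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_seen(path: Path, seen: Iterable[str]) -> None:
    """Persist the processed task IDs as a sorted JSON array."""
    path.write_text(json.dumps(sorted(seen)), encoding="utf-8")


def select_new_tasks(tasks: List[dict], seen: Set[str]) -> List[dict]:
    """Keep successful tasks whose ID has not been processed yet."""
    return [
        task for task in tasks
        if task.get("status") == "successful" and task.get("id") not in seen
    ]
```

In `batch_process`, you could filter the fetched tasks through `select_new_tasks`, add each ID to the set after a successful LLM call, and call `save_seen` at the end of the run.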

Next steps
