This guide covers three ways to send your Browse AI scraped data into PostgreSQL, from no-code automation to direct database connections. Whether you're using a managed service like Supabase, Neon, or Amazon RDS, or running your own PostgreSQL instance, these methods apply to all setups.
π Prerequisites: You'll need an approved Browse AI robot with scraped data and a PostgreSQL database (self-hosted or managed). For API-based methods, you'll also need a Browse AI API key. If using Supabase, you'll need your project URL and API key.
Which method should I use?
Method | Best for | Technical level | Speed |
Zapier / Make | Teams without developers | No code | Near real-time |
Webhooks + direct insert | Real-time pipelines with custom logic | Intermediate | Real-time |
API polling + batch insert | Batch processing, scheduled syncs | Intermediate | On schedule |
Method 1: Zapier or Make (no code)
The fastest way to connect Browse AI to PostgreSQL. Both Zapier and Make support PostgreSQL as a destination.
Setting up with Zapier
Go to zapier.com and create a new Zap.
Trigger: Choose Browse AI as the trigger app, then select New Successful Task Finished.
Connect your Browse AI account and select the robot you want to sync data from.
Action: Choose PostgreSQL as the action app, then select Create Row.
Connect your PostgreSQL database (host, port, database name, username, password). If using Supabase, find these in Settings β Database.
Select your table and map each Browse AI captured field to the corresponding column.
Test the Zap, then turn it on.
π‘ Supabase users: You can also use the Supabase Zapier integration directly, which connects via Supabase's REST API instead of a direct database connection. This is simpler if your database doesn't allow external connections.
Setting up with Make (formerly Integromat)
Create a new scenario in Make.
Add a Webhooks β Custom webhook module as the trigger. Copy the webhook URL.
In Browse AI, go to your robot's Integrate tab and add the Make webhook URL under Webhooks. Select the
taskFinishedSuccessfullyevent.Add a PostgreSQL β Insert a Row module (or Supabase β Create a Row), connect your database, and map your fields.
Activate the scenario.
Method 2: Webhooks + direct insert (real-time)
Use Browse AI webhooks to push data directly into PostgreSQL as soon as a task finishes. This section covers two approaches: direct connection with psycopg2, and Supabase's REST API.
Step 1: Create your table
CREATE TABLE browse_ai_leads (
id SERIAL PRIMARY KEY,
task_id VARCHAR(255) UNIQUE,
robot_id VARCHAR(255),
first_name VARCHAR(255),
last_name VARCHAR(255),
email VARCHAR(255),
company_name VARCHAR(255),
phone VARCHAR(100),
website VARCHAR(500),
job_title VARCHAR(255),
origin_url TEXT,
scraped_at TIMESTAMPTZ DEFAULT NOW()
);
Option A: Direct PostgreSQL connection (psycopg2)
import psycopg2
from flask import Flask, request, jsonifyapp = Flask(__name__)DB_CONFIG = {
"host": "your-host.example.com",
"port": 5432,
"dbname": "your_database",
"user": "your_username",
"password": "your_password"
}@app.route("/browse-ai-webhook", methods=["POST"])
def handle_webhook():
payload = request.get_json() if payload.get("event") != "taskFinishedSuccessfully":
return jsonify({"status": "ignored"}), 200 task = payload["task"]
captured = task.get("capturedTexts", {}) conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor() cur.execute(
"""INSERT INTO browse_ai_leads
(task_id, robot_id, first_name, last_name, email,
company_name, phone, website, job_title, origin_url)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
ON CONFLICT (task_id) DO NOTHING""",
(
task.get("id", ""),
task.get("robotId", ""),
captured.get("first_name", ""),
captured.get("last_name", ""),
captured.get("email", ""),
captured.get("company_name", ""),
captured.get("phone", ""),
captured.get("website", ""),
captured.get("job_title", ""),
task.get("inputParameters", {}).get("originUrl", "")
)
) conn.commit()
cur.close()
conn.close() return jsonify({"status": "inserted"}), 200if __name__ == "__main__":
app.run(host="0.0.0.0", port=5000)
π‘ Install psycopg2: Run pip install psycopg2-binary. For production, consider using a connection pool with psycopg2.pool or SQLAlchemy to avoid opening a new connection per request.
Option B: Supabase REST API (no direct database connection needed)
If you're using Supabase, you can insert data via their REST API without exposing your database connection string:
import requests
from flask import Flask, request, jsonifyapp = Flask(__name__)SUPABASE_URL = "https://your-project.supabase.co"
SUPABASE_API_KEY = "your_supabase_anon_or_service_role_key"@app.route("/browse-ai-webhook", methods=["POST"])
def handle_webhook():
payload = request.get_json() if payload.get("event") != "taskFinishedSuccessfully":
return jsonify({"status": "ignored"}), 200 task = payload["task"]
captured = task.get("capturedTexts", {}) row = {
"task_id": task.get("id", ""),
"robot_id": task.get("robotId", ""),
"first_name": captured.get("first_name", ""),
"last_name": captured.get("last_name", ""),
"email": captured.get("email", ""),
"company_name": captured.get("company_name", ""),
"phone": captured.get("phone", ""),
"website": captured.get("website", ""),
"job_title": captured.get("job_title", ""),
"origin_url": task.get("inputParameters", {}).get("originUrl", "")
} resp = requests.post(
f"{SUPABASE_URL}/rest/v1/browse_ai_leads",
headers={
"apikey": SUPABASE_API_KEY,
"Authorization": f"Bearer {SUPABASE_API_KEY}",
"Content-Type": "application/json",
"Prefer": "resolution=merge-duplicates"
},
json=row
) return jsonify({"status": "inserted", "code": resp.status_code}), 200if __name__ == "__main__":
app.run(host="0.0.0.0", port=5000)
π‘ Automatic deduplication with Supabase: The Prefer: resolution=merge-duplicates header tells Supabase to upsert. It matches on the UNIQUE constraint (task_id in our schema) and updates the row if it already exists.
Step 2: Register the webhook in Browse AI
Via the dashboard:
Open your robot and go to the Integrate tab.
Under Webhooks, click Add webhook.
Paste your endpoint URL and select the
taskFinishedSuccessfullyevent.
Via the API:
curl -X POST "https://api.browse.ai/v2/robots/YOUR_ROBOT_ID/webhooks" \
-H "Authorization: Bearer YOUR_BROWSE_AI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://yourdomain.com/browse-ai-webhook",
"events": ["taskFinishedSuccessfully"]
}'
π‘ IP allowlisting: Browse AI sends webhooks from IP address 3.228.254.190. If your server has a firewall, add this to your allowlist. See Webhooks: IP address for allowlisting.
Method 3: API polling + batch insert (scheduled)
Poll the Browse AI API on a schedule and batch-insert results into PostgreSQL.
import psycopg2
import requests
from datetime import datetime, timedeltaBROWSE_AI_API_KEY = "your_browse_ai_api_key"
ROBOT_ID = "your_robot_id"
DB_CONFIG = {
"host": "your-host.example.com",
"port": 5432,
"dbname": "your_database",
"user": "your_username",
"password": "your_password"
}def get_recent_tasks(since_hours=1):
resp = requests.get(
f"https://api.browse.ai/v2/robots/{ROBOT_ID}/tasks",
headers={"Authorization": f"Bearer {BROWSE_AI_API_KEY}"},
params={"pageSize": 100}
)
tasks = resp.json().get("result", {}).get("robotTasks", {}).get("items", [])
cutoff = datetime.utcnow() - timedelta(hours=since_hours)
return [t for t in tasks if t.get("status") == "successful"
and datetime.fromisoformat(t["finishedAt"].replace("Z","")) > cutoff]def batch_insert(tasks):
conn = psycopg2.connect(**DB_CONFIG)
cur = conn.cursor() for task in tasks:
captured = task.get("capturedTexts", {})
cur.execute(
"""INSERT INTO browse_ai_leads (task_id, email, company_name)
VALUES (%s, %s, %s)
ON CONFLICT (task_id) DO NOTHING""",
(task.get("id"), captured.get("email", ""),
captured.get("company_name", ""))
) conn.commit()
cur.close()
conn.close()# Run on a schedule (e.g. cron job every hour)
π For full Browse AI API details, including pagination, bulk operations, and task filtering, see the API Guide: Getting started and API Guide: Bulk operations.
PostgreSQL-specific tips
Built-in deduplication with ON CONFLICT
PostgreSQL's ON CONFLICT clause is the simplest way to handle duplicates. With a UNIQUE constraint on task_id, you can choose to skip duplicates or update existing rows:
-- Skip duplicates
INSERT INTO browse_ai_leads (task_id, email, company_name)
VALUES ('abc123', '[email protected]', 'Acme Inc')
ON CONFLICT (task_id) DO NOTHING;-- Or update on conflict
INSERT INTO browse_ai_leads (task_id, email, company_name)
VALUES ('abc123', '[email protected]', 'Acme Inc')
ON CONFLICT (task_id) DO UPDATE SET
email = EXCLUDED.email,
company_name = EXCLUDED.company_name;
Indexing for performance
Add indexes on columns you'll query frequently:
CREATE INDEX idx_leads_email ON browse_ai_leads (email); CREATE INDEX idx_leads_scraped_at ON browse_ai_leads (scraped_at); CREATE INDEX idx_leads_company ON browse_ai_leads (company_name);
Connection pooling
For webhook-based real-time inserts, avoid opening a new connection per request. Use a connection pool instead:
from psycopg2 import pooldb_pool = pool.ThreadedConnectionPool(
minconn=2, maxconn=10,
host="your-host", dbname="your_db",
user="your_user", password="your_password"
)# In your webhook handler:
conn = db_pool.getconn()
# ... execute query ...
db_pool.putconn(conn)
Troubleshooting
Connection refused or timeout
Check that your PostgreSQL instance allows external connections. For cloud-hosted databases (Supabase, Neon, RDS), verify that your server's IP is in the allowed list. For Supabase specifically, enable the "Allow connections from outside" option in your project settings.
SSL connection errors
Most managed PostgreSQL services require SSL. Add sslmode=require to your connection string or pass "sslmode": "require" in your connection config.
Webhook isn't firing
Make sure the webhook URL is publicly accessible and that your server responds with a 200 status code. See the Webhooks: Set up guide for detailed debugging steps.
