After running your first API task, you need to understand how to access and work with the data your robots extract. This guide covers everything from understanding response formats to filtering large datasets and handling errors.
Understanding task response structure
When your robot successfully extracts data, the API returns it in a structured format. Here's what a typical successful task response looks like:
{
"statusCode": 200,
"messageCode": "success",
"result": {
"id": "f6fb62b6-f06a-4bf7-a623-c6a35c2e70b0",
"status": "successful",
"robotId": "4f5cd7ff-6c98-4cac-8cf0-d7d0cb050b06",
"capturedTexts": {
"title": "Product Name Here",
"price": "$99.99",
"description": "Product description text",
"product_list": [
{
"name": "Item 1",
"price": "$29.99"
},
{
"name": "Item 2",
"price": "$39.99"
}
]
},
"capturedScreenshots": [
{
"name": "full_page",
"url": "https://screenshot-url-here.jpg"
}
],
"createdAt": 1678795867879,
"startedAt": 1678795867879,
"finishedAt": 1678795867879,
"videoUrl": "https://video-debug-url.mp4"
}
}
Key fields explained:
capturedTexts: Your extracted data, with the field names you configured
capturedScreenshots: Any screenshots your robot captured
status: Current task status (running, successful, failed)
videoUrl: Debug video showing exactly what your robot did
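As a quick sanity check, here is a minimal Python sketch that reads these fields from a parsed response; the data variable is assumed to hold the JSON above as a dict:
# "data" is assumed to hold the parsed JSON response shown above
result = data["result"]

if result["status"] == "successful":
    texts = result["capturedTexts"]          # your configured fields
    screenshots = result["capturedScreenshots"]
    print(texts["title"], [s["url"] for s in screenshots])
else:
    # The debug video is the fastest way to see what went wrong
    print("Status:", result["status"], "video:", result.get("videoUrl"))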
Getting task results
Option 1: Get a specific task
If you know the task ID (from when you started the task):
curl -X GET "https://api.browse.ai/v2/robots/ROBOT_ID/tasks/TASK_ID" \
-H "Authorization: Bearer YOUR_SECRET_API_KEY"
Option 2: List recent tasks
Get your robot's latest tasks:
curl -X GET "https://api.browse.ai/v2/robots/ROBOT_ID/tasks" \
-H "Authorization: Bearer YOUR_SECRET_API_KEY"
Working with different data types
Your capturedTexts field contains different types of data depending on how you configured your robot.
Single text fields
"capturedTexts": {
"title": "Single piece of text",
"price": "$99.99"
}
Lists of items
"capturedTexts": {
"product_list": [
{"name": "Item 1", "price": "$29.99"},
{"name": "Item 2", "price": "$39.99"}
]
}
Processing lists in your code
import requests

# Placeholders: substitute your own values
api_key = "YOUR_SECRET_API_KEY"
robot_id = "YOUR_ROBOT_ID"
task_id = "YOUR_TASK_ID"

# Get task results
response = requests.get(
    f"https://api.browse.ai/v2/robots/{robot_id}/tasks/{task_id}",
    headers={"Authorization": f"Bearer {api_key}"},
)
response.raise_for_status()
data = response.json()

# Only read captured data once the task has finished successfully
if data["result"]["status"] == "successful":
    products = data["result"]["capturedTexts"]["product_list"]
    for product in products:
        print(f"Product: {product['name']}, Price: {product['price']}")
Filtering and pagination for large datasets
When you have many tasks, use filters to find exactly what you need.
Filter by status
curl -X GET "https://api.browse.ai/v2/robots/ROBOT_ID/tasks?status=successful" \
-H "Authorization: Bearer YOUR_SECRET_API_KEY"
Filter by date range
curl -X GET "https://api.browse.ai/v2/robots/ROBOT_ID/tasks?fromDate=1678795867879&toDate=1678885867879" \
-H "Authorization: Bearer YOUR_SECRET_API_KEY"
Pagination for large results
curl -X GET "https://api.browse.ai/v2/robots/ROBOT_ID/tasks?page=1&pageSize=10" \
-H "Authorization: Bearer YOUR_SECRET_API_KEY"
Available filters (a combined example follows this list):
status: successful, failed, in-progress
fromDate / toDate: Unix timestamps (in milliseconds) for a date range
page / pageSize: Pagination (1-10 items per page)
sort: Sort by creation date (-createdAt for newest first)
robotBulkRunId: Filter by a specific bulk run
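Putting the filters together, here is a minimal Python sketch that pages through all successful tasks (again assuming the result.robotTasks.items path; page and pageSize are the parameters from the curl examples above):
import requests

api_key = "YOUR_SECRET_API_KEY"
robot_id = "YOUR_ROBOT_ID"
url = f"https://api.browse.ai/v2/robots/{robot_id}/tasks"
headers = {"Authorization": f"Bearer {api_key}"}

page = 1
all_tasks = []
while True:
    resp = requests.get(url, headers=headers, params={
        "status": "successful",
        "sort": "-createdAt",
        "page": page,
        "pageSize": 10,
    })
    resp.raise_for_status()
    items = resp.json()["result"]["robotTasks"]["items"]
    if not items:
        break  # past the last page
    all_tasks.extend(items)
    page += 1

print(f"Fetched {len(all_tasks)} successful tasks")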
Handling failed tasks and debugging
When tasks fail, you get valuable debugging information:
{
"statusCode": 200,
"messageCode": "success",
"result": {
"id": "task-id-here",
"status": "failed",
"userFriendlyError": "The page took too long to load",
"videoUrl": "https://debug-video-url.mp4",
"triedRecordingVideo": true
}
}
Debugging failed tasks:
Check userFriendlyError: Plain-English explanation of what went wrong
Watch the debug video: See exactly what your robot encountered
Verify input parameters: Make sure URLs and parameters are correct
Check website changes: Sites may have changed since robot training
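In code, the first two checks might look like this (a sketch that assumes data holds a parsed task response, as in the earlier examples):
result = data["result"]  # parsed task response

if result["status"] == "failed":
    print("Reason:", result.get("userFriendlyError"))
    # A debug video is only available if recording was attempted
    if result.get("triedRecordingVideo"):
        print("Debug video:", result.get("videoUrl"))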
Common failure reasons:
Website is down or loading slowly
Login credentials expired
Website structure changed
Network connectivity issues
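Since several of these causes are transient, simple retry logic with backoff often recovers. Below is a hedged sketch; it assumes tasks are started with POST /v2/robots/ROBOT_ID/tasks and an inputParameters body (the originUrl parameter is a hypothetical example), so check the run-task reference for your robot's actual parameters:
import time
import requests

api_key = "YOUR_SECRET_API_KEY"
robot_id = "YOUR_ROBOT_ID"

def start_task_with_retries(input_parameters, max_attempts=3):
    for attempt in range(1, max_attempts + 1):
        resp = requests.post(
            f"https://api.browse.ai/v2/robots/{robot_id}/tasks",
            headers={"Authorization": f"Bearer {api_key}"},
            json={"inputParameters": input_parameters},
        )
        if resp.ok:
            return resp.json()["result"]
        time.sleep(2 ** attempt)  # back off before retrying
    raise RuntimeError(f"Could not start task after {max_attempts} attempts")

# Hypothetical input parameter; use whatever your robot expects
task = start_task_with_retries({"originUrl": "https://example.com"})
print("Started task:", task["id"])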
Working with monitoring data
When robots run on monitoring schedules, you can identify them in the task list:
{
"result": {
"id": "task-id",
"runByTaskMonitorId": "monitor-id-here",
"runByAPI": false,
"status": "successful"
}
}
Monitoring-specific filtering:
# Get only monitoring tasks (exclude manual and API runs)
curl -X GET "https://api.browse.ai/v2/robots/ROBOT_ID/tasks" \
-H "Authorization: Bearer YOUR_SECRET_API_KEY" \
| jq '.result.robotTasks.items[] | select(.runByTaskMonitorId != null)'
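The same filtering in Python, if you would rather not shell out to jq (a sketch, with the same assumed response path):
import requests

api_key = "YOUR_SECRET_API_KEY"
robot_id = "YOUR_ROBOT_ID"

resp = requests.get(
    f"https://api.browse.ai/v2/robots/{robot_id}/tasks",
    headers={"Authorization": f"Bearer {api_key}"},
)
items = resp.json()["result"]["robotTasks"]["items"]

# Keep only tasks started by a monitoring schedule
monitor_tasks = [t for t in items if t.get("runByTaskMonitorId")]
print(f"{len(monitor_tasks)} of {len(items)} tasks came from monitors")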
Best practices for data management
Use specific date ranges instead of fetching all tasks
Filter by status to focus on successful extractions only
Implement retry logic for temporary failures
Use webhooks instead of polling for real-time updates (see the sketch after this list)
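For the last point, a minimal webhook receiver sketch in Python with Flask is shown below. The payload shape is an assumption based on the task objects shown earlier in this guide, so verify the exact structure against the webhooks documentation before relying on it:
from flask import Flask, request

app = Flask(__name__)

@app.route("/browse-ai-webhook", methods=["POST"])
def handle_finished_task():
    payload = request.get_json(force=True)
    # Assumption: the payload contains a finished task object like the
    # examples above, either at the top level or under a "task" key
    task = payload.get("task", payload)
    if task.get("status") == "successful":
        print("New data:", task.get("capturedTexts"))
    else:
        print("Task failed:", task.get("userFriendlyError"))
    return "", 200

if __name__ == "__main__":
    app.run(port=8000)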
