12 Essential Browse AI Concepts
Browse AI empowers you to extract and monitor data from any website in minutes, without writing a single line of code. Our intelligent robots interact with websites just like humans, ensuring accurate data extraction with minimal effort.
This article introduces 12 key concepts to help you unlock the full potential of Browse AI.
📑 Table of Contents:
I. Robot 🤖
II. Types of Robots: Prebuilt and Custom
III. Origin URL 🌎
IV. Task 💼
V. Monitor 👓
VI. Input Parameters 🔡🔢
VII. Bulk Run 🚀
VIII. Deep Scraping 🧬
IX. Integrations 🧩
X. Google Sheets and Airtable 📊
XI. Connector Integrations — Zapier, Make.com, Pabbly ⚙
XII. API & Webhooks 🛠️
I. Robot 🤖
A robot automates actions you'd typically perform manually on a website. Think of it as your digital butler, capable of the following:
- Opening webpages
- Robots are highly adaptable: you can adjust their settings to work with different pages on a website, or even across multiple websites with similar layouts.
- Robots also handle a lot in the background without you noticing: they solve CAPTCHAs, use geolocated residential IP addresses, emulate human behavior to avoid detection, and automatically adapt to website changes, essentially maintaining themselves.
- See: How do you make sure my data is secure?
- Logging in to accounts
- If your data is behind a login, a robot can securely log in to your account, and you can train it to extract the data you need.
- See: Can my robot login to websites? (Cookies vs. Credentials)
- Clicking buttons and filling forms
- A robot has dynamic input parameters that allow you to adjust the page URL (a.k.a. Origin URL) or what it enters in text inputs every time you run it. This allows you to use a single robot to extract or monitor data from an unlimited number of pages on a site that have a similar layout.
- See: Can my robot fill out a form or perform an action before extracting data?
- Extracting data into spreadsheets
- You can choose to download the task result, or integrate your robot with a spreadsheet (e.g., Google Sheets).
- See: How can I enable my Google Sheet integration?
- Navigating through pages
- For example, clicking "Next page" or "Load more" buttons to extract more data.
- See: How do I handle pagination and scrolling?
- Taking screenshots
- Monitoring website changes
- Bulk running tasks
- Most robots that users build simply open a webpage and extract data from it. These robots can also run hundreds or thousands of tasks all at once on similar pages on that site to quickly extract the full data set.
II. Types of Robots: Prebuilt and Custom
A. Prebuilt Robots
- These are ready-made robots for popular use cases like extracting data from Yelp, TripAdvisor, or LinkedIn.
- These robots have a few input parameters (e.g., webpage address, data limit) that you can configure per run.
- See the list of our Prebuilt Robots
B. Custom Robots
- These are the robots you train using Browse AI's click-and-extract interface. They are tailored to your specific needs and can automate unique workflows.
- Over 90% of the robots on Browse AI are Custom Robots that users build for their specific use cases.
- For example, some realtors monitor building permits issued by their county government (on the county website) and connect that robot to a sales CRM or spreadsheet to automatically email every builder that receives a building permit.
- See: How can I build a robot?
III. Origin URL 🌎
The Origin URL is the web address where your robot begins its task.
- Every Custom Robot comes with an Origin URL input parameter, which defaults to the page the robot was trained on.
- By adjusting this parameter, you can use a single robot to work on multiple pages with similar layouts.
- For example, if you want to monitor Walmart's product prices, you can train a custom robot on one Walmart product page, then monitor 100 different product pages by adjusting the Origin URL for each monitor.
IV. Task 💼
Each robot is trained for a specific task. Every time you run it, it completes that task, and the results are stored in the History tab of the respective robot.
Tasks can be created in a few different ways:
- You can choose a robot on your dashboard, go to its Run Task tab, and run a normal task.
- In the Run Task tab, you also have the option to Bulk Run up to 50,000 tasks at once by uploading a CSV file.
- See: What does the "Bulk Run" feature do?
- If you configure monitors, every time a monitoring check needs to be done, a new monitoring task is automatically created.
- For example, if you set up a monitor to check a webpage for changes daily, it will run one task every day, or about 30 tasks per month.
- See: How can I set up monitoring for changes?
- If you integrate Browse AI with another software or use the API, new tasks can be created using the API.
- See: API Documentation
- Sometimes, the system creates tasks to make sure the robot is healthy, or to optimize the robot and make it faster or more reliable. These tasks are marked as run by "the system".
V. Monitor 👓
One of the most useful features of Browse AI is the built-in monitoring system.
- Every robot can have an unlimited number of monitors configured, each tracking one page or one search query on that site.
- Monitors automatically track changes on web pages and notify you when updates occur.
- This can also be configured to send the data to another software automatically when a change is detected.
- For example, you can monitor all products on an e-commerce site with a single robot and receive a notification when the prices change or a product becomes available.
- See: How can I set up monitoring for changes?
VI. Input Parameters 🔡🔢
Every robot comes with certain input parameters you can adjust for every task and monitor, so you do not have to create new robots for every page or search keyword on a site.
- The most common input parameter is the Origin URL, which refers to the first page the robot opens.
- When you train a custom robot, if you interact with any text inputs, what you enter will become an input parameter that you can adjust later.
VII. Bulk Run 🚀
The Bulk Run feature, which can be found in the robot dashboard under the Run Task tab, allows you to upload a CSV containing up to 50,000 different sets of Input Parameters and immediately create a task for each one of them.
- The tasks will then be processed in a queue and you will get the full data set they have extracted once they are finished.
- You can get a holistic view of these results, and export them, via our Tables feature.
- See: How to Use the Tables Feature
- For example, you can upload a CSV containing links to 50,000 company pages on LinkedIn and get back a spreadsheet of all the data extracted from those 50,000 pages.
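As a rough illustration, a Bulk Run CSV has a header row naming each input parameter and one row per task. The sketch below assumes a robot whose only input parameter is originUrl; the column header must match your robot's actual input parameter names, and the URLs are placeholders:

```csv
originUrl
https://www.linkedin.com/company/example-company-1/
https://www.linkedin.com/company/example-company-2/
https://www.linkedin.com/company/example-company-3/
```

Uploading this file would create one task per row, three in this case.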
VIII. Deep Scraping 🧬
Deep scraping extracts data from multiple levels of a website, like going from a product list to each product's details page. This allows you to collect comprehensive data from complex websites.
- See: How can I extract data from lists and their associated details pages? (Deep scraping)
- One of the most useful features for deep scraping is Workflows: you can chain two robots together so data flows seamlessly from one robot to the other, automating tasks that require multiple steps.
IX. Integrations 🧩
Most of the time, you want to transfer the data you find on a website to another software you use, like Google Sheets or your CRM.
- Browse AI comes with 5,000+ integrations to make it possible to create a data pipeline from any website into the tools you already use.
- See: What Integrations are available on Browse AI?
The last three key definitions are related to Integrations 👇
X. Google Sheets and Airtable 📊
These two are native spreadsheet integrations on Browse AI.
- Once you configure them on a robot, every time the robot runs a task, the data it extracts is inserted into your spreadsheet immediately.
Browse AI for Google Sheets add-on:
- This add-on offers extra features in Google Sheets, such as:
- Run robots from within your Google Sheet by highlighting a set of Input Parameters and pressing a button,
- Automatically delete old data in your Google Sheet,
- Automatically delete duplicate data in your Google Sheet.
- See: How can I use the Google Sheets add-on?
XI. Connector Integrations — Zapier, Make.com, Pabbly ⚙
These native integrations allow you to connect Browse AI to 5,000+ other apps with a few clicks through third-party integration platforms:
- Zapier is the easiest one to use, but it can be costly at large volumes.
- Make.com costs much less, but is harder to use.
- Pabbly Connect is typically used by people who have purchased their one-time payment lifetime deal to save on costs.
XII. API & Webhooks 🛠️
If you have a development team, you can leverage Browse AI's API and Webhooks to extend the platform's capabilities. You can automate tasks, integrate with your existing systems, and even build custom solutions tailored to your specific needs. This programmatic access allows you to offload repetitive data extraction tasks and focus on your core business logic.
- Many startups are already leveraging our API to streamline their operations and gain a competitive edge.
- Note: Creating new robots is currently not supported via the API.
- See: API documentation
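As a minimal sketch of what creating a task programmatically could look like, the Python snippet below assembles the pieces of an HTTP request. The base URL, endpoint path, and payload shape are assumptions based on the v2 API, and the robot ID and API key are placeholders; always verify the exact endpoints against the API documentation before use.

```python
import json

# Assumed v2 base URL; confirm in the official API documentation.
API_BASE = "https://api.browse.ai/v2"

def build_create_task_request(robot_id: str, api_key: str, input_parameters: dict):
    """Assemble the URL, headers, and JSON body for creating a task on a robot.

    The endpoint path and payload shape here are assumptions; check the
    API documentation for the authoritative request format.
    """
    url = f"{API_BASE}/robots/{robot_id}/tasks"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Input parameters mirror what you would set in the dashboard,
    # e.g. the Origin URL the robot should open for this run.
    body = json.dumps({"inputParameters": input_parameters})
    return url, headers, body

# Example with a hypothetical robot ID and placeholder API key:
url, headers, body = build_create_task_request(
    "my-robot-id",
    "YOUR_API_KEY",
    {"originUrl": "https://www.example.com/some-product-page"},
)
```

You could then send this request with any HTTP client; the response would include the new task, which you can also receive via a Webhook once it finishes.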