One Robot to Extract Them All 🤖

Have you built a Browse AI robot to extract data from a specific category of a website, but now need to gather similar information from other sections with the same structure? You don't need to create a new robot for each category!

Browse AI robots are designed to be flexible. When a website uses the same structure across different categories or sections (e.g., product pages, blog posts, directory listings), you can often use a single robot to extract data from all of them, which saves you the time and effort of building and maintaining a separate robot for each one.

Here are three ways to use your existing robot to extract data from different pages that share the same structure:

I. Using the Origin URL Field

This method is ideal for quick extractions from a different category or section.

  • Step 1: Open your robot and navigate to the "Run Task" tab.
  • Step 2: In the "Origin URL" field, paste the URL of the new category or section page you want to scrape.
    • To keep your robot's original settings: Make sure "Save as default" is not selected. The new URL then applies to this run only.
    • To update your robot's target URL: Select "Save as default." This permanently changes the robot's target URL to the one you entered.
  • Step 3: Run the task by pressing "Enter" or clicking the purple "Run Task" button.
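
Before pasting a new URL into the "Origin URL" field, it can help to confirm that it points to the same website the robot was built on. The following is a minimal sketch of such a check, not part of Browse AI itself; the function name `same_site` and the example URLs are hypothetical, and a matching host does not guarantee that the page layout is identical.

```python
from urllib.parse import urlsplit


def same_site(original_url: str, new_url: str) -> bool:
    """Return True when the new URL is http(s) and shares the original URL's host.

    Hypothetical helper for illustration only: it compares hosts and does not
    inspect the page, so it cannot confirm the two pages share a layout.
    """
    original, new = urlsplit(original_url), urlsplit(new_url)
    return (
        new.scheme in ("http", "https")
        and original.hostname is not None
        and original.hostname == new.hostname
    )


if __name__ == "__main__":
    # Placeholder URLs; substitute your robot's original URL and the new one.
    print(same_site("https://example.com/shoes", "https://example.com/hats"))
```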

II. Using Bulk Run

For extracting data from multiple categories or sections simultaneously, the bulk run feature is your go-to solution.

  • Step 1: Open your robot and go to the "Run Task" tab.
  • Step 2: Click the "Bulk run tasks" button in the lower right corner.

  • Step 3: Download the sample CSV file. This file is a template showing how to structure your input data, including URLs and any other variables.

  • Step 4: Open the CSV file and add the URLs of all the categories or sections you want to scrape.

  • Step 5: Upload the modified CSV file.
  • Step 6: Select the "Origin URL" column header for your links.
    • Once you choose the Origin URL column header, all your category links are displayed and ready for extraction.
  • Step 7: Optionally, you can choose your desired integration (e.g., Google Sheets) for exporting the extracted data.
  • Step 8: Start the bulk run task.
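
If you have many category links, the CSV from Step 4 can be generated rather than typed by hand. The sketch below assumes the sample CSV uses a single column headed "Origin URL"; check the header row of the file you downloaded in Step 3 and adjust the `header` argument if it differs. The function name `build_bulk_run_csv` and the example URLs are hypothetical.

```python
import csv
import io


def build_bulk_run_csv(urls, header="Origin URL"):
    """Return CSV text with one URL per row under a single header column.

    The header name is an assumption; copy the exact header from the sample
    CSV. Blank entries and repeated URLs are skipped.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])
    for url in urls:
        cleaned = url.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            writer.writerow([cleaned])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder category URLs; replace with the pages you want to scrape.
    categories = [
        "https://example.com/category/shoes",
        "https://example.com/category/hats",
    ]
    with open("bulk_run.csv", "w", newline="", encoding="utf-8") as handle:
        handle.write(build_bulk_run_csv(categories))
```

Running the script writes `bulk_run.csv` to the current folder, which you can then upload in Step 5. If your sample CSV contains additional columns for other variables, this sketch would need to be extended to fill them in.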

III. Using Monitors with Different Origin URLs

This method allows you to automate data extraction from multiple categories or sections on a schedule.

  • Step 1: Create a separate monitor for each category or section you want to track.
  • Step 2: In each monitor's settings, specify the Origin URL for that category or section.
  • Step 3: Configure the monitoring frequency (e.g., daily, hourly) and any other settings specific to your use case.
  • Step 4: Click "Save."

This approach ensures that your robot automatically extracts data from different parts of the website based on your defined schedule and keeps your data up-to-date across all categories.

If you have any questions or need assistance with adapting your robots for different datasets, our support team is ready to help!

💪
