
How can I extract data from lists and their associated details pages? (aka Deep scraping)

A common data scraping use case is to capture a list (e.g., product names and links on an e-commerce site) and then capture each list item's details from its dedicated page (product availability at a store, for example). There are a few ways to do that.

Common Mistake: Doing It All At Once

Use cases like this are challenging at first because people naturally think of them this way: I need a robot that goes to a website showing a list, clicks on each list item to open its details, captures the details, goes back to the list, and repeats this for every other list item.

This workflow might be what a person would typically do; however, it has many issues at scale, such as the following:

  1. What if the list changes while you are on a details page, such as new items being added or the order shifting? It would be hard to continue while ensuring you don't miss any data.
  2. What if it is an infinite-scroll list (like a Twitter feed)? Clicking an item and going back resets you to the top, and you have to scroll for a while to get back to where you were.
  3. What if it is a long list (say, 10,000 items) and you click through the items so fast that the site assumes you are trying to take down its server with a DDoS attack and blocks you in the middle of your workflow?

Because of these issues, we intentionally did not provide a way to automate a workflow like this during the recording experience.

A Better Solution: Use Two Separate Robots

If each list item has a link, which is usually the case, you can avoid all these issues by taking a different two-step approach:

Step 1: Extract links to all list items using Robot A

First, collect the links to all detail pages. You just need to build a Robot like "Extract product links from walmart.com" (let's call it Robot A). Then you can download the list of links as a CSV from your Browse AI dashboard.
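To make the hand-off between the two steps concrete, here is a minimal sketch of what the downloaded links CSV might contain and how you could load it for inspection. The column name `product_link` and the URLs are made-up examples; the actual header depends on what you named the extracted field in Robot A.

```python
import csv
import io

# Hypothetical contents of the CSV downloaded from Robot A's results.
# In practice you would open the downloaded file instead of this string.
links_csv = """product_link
https://www.walmart.com/ip/item-one/111
https://www.walmart.com/ip/item-two/222
"""

# Each row holds one detail-page URL; Robot B will later run once per row.
links = [row["product_link"] for row in csv.DictReader(io.StringIO(links_csv))]
print(links)
```

Checking that every row contains a distinct, well-formed URL at this point saves you from wasted runs in step 2.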

Step 2: Extract item details from all links in step 1 using Robot B

Build a data extraction Robot like "Extract a single product's details on walmart.com" on an item's details page (we'll call it Robot B).

If some item details are optional and do not always exist, we recommend recording the task on an item's page that contains all the possible information you need to extract.

Then go to Robot B's Run Task tab on your dashboard and click on Bulk Run.

Upload the Links CSV from step 1. Map Robot B's variable columns to the columns imported from the CSV.

Review the imported links and make sure each row contains a different link.

Then scroll down and make sure you have synced this Robot with a Google Sheet so that all its extracted data can easily be retrieved there.

Once you're ready, press the Run Task button. Robot B then runs for every detail page link you extracted with Robot A, and you can see the results gradually added to the Google Sheet. How long this takes depends on how many links you provided.

See this video on how to bulk run tasks.


Bulk Running Tasks FAQ

Why does the bulk run take so long?

By default, each Robot you build has a concurrency limit of 10 active task executions at any given time. That means if you bulk run a task 20 times, the first 10 times start immediately, but the 11th one will begin after one of the previous tasks is finished. This way, we avoid putting too much pressure on the task's origin website.
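The effect of this limit can be sketched with a small Python simulation: a worker pool capped at 10 concurrent executions, where the 11th "task" only starts once an earlier one finishes. The sleep is a stand-in for a real task execution; nothing here calls Browse AI.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

CONCURRENCY_LIMIT = 10  # the default per-Robot limit described above

active = 0  # tasks currently "running"
peak = 0    # highest concurrency observed
lock = threading.Lock()

def run_task(link):
    """Stand-in for one Robot B task execution."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)  # simulated work
    with lock:
        active -= 1
    return link

# 20 queued executions, but at most 10 run at any given time.
links = [f"https://example.com/item/{i}" for i in range(20)]
with ThreadPoolExecutor(max_workers=CONCURRENCY_LIMIT) as pool:
    results = list(pool.map(run_task, links))

print(peak)  # never exceeds CONCURRENCY_LIMIT
```

The cap throttles load on the origin website the same way regardless of how many rows your bulk run contains.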

If you believe this concurrency limit is unnecessary for your task, please send us your use case and the desired concurrency limit.

How many credits does a bulk run take?

It depends on how many rows there are in the bulk run. For example, if there are 90 rows, the bulk run will take 90 credits from your monthly quota.

If any task executions fail, they will not count towards your quota.

Can I bulk run a task using a single API call?

Not yet! But it is something we are actively working on adding to the API.
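In the meantime, a workaround is to trigger one task per link yourself. The sketch below only builds the per-link requests rather than sending them; the base URL, endpoint path, and payload shape (`inputParameters`, `originUrl`) are assumptions modeled on a typical REST task API, not a confirmed Browse AI contract, so check the current API documentation before using them.

```python
import json

API_ROOT = "https://api.browse.ai/v2"  # assumed base URL; verify in the API docs
ROBOT_ID = "robot-b-id"                # placeholder for Robot B's actual ID

def build_run_request(link):
    # One request per detail-page link. The endpoint path and body shape
    # here are illustrative assumptions, not the documented API.
    return {
        "method": "POST",
        "url": f"{API_ROOT}/robots/{ROBOT_ID}/tasks",
        "body": json.dumps({"inputParameters": {"originUrl": link}}),
    }

links = [
    "https://www.walmart.com/ip/item-one/111",
    "https://www.walmart.com/ip/item-two/222",
]
requests_to_send = [build_run_request(l) for l in links]
print(len(requests_to_send))
```

Sending these sequentially (or with modest client-side concurrency) keeps you within the per-Robot concurrency limit discussed above.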