This guide focuses on training robots to search, handle results, and extract exactly what you need from search result pages.
Understanding search-based extraction
Search-based extraction involves three key steps:
Entering a search query into a website's search box
Retrieving the results that match your query
Extracting the data from those search results
This is different from browsing directly to a page: you're using the site's own search to filter and find specific information.
When to use search extraction
| Use search extraction for: | Use direct navigation when: |
| Finding specific products in large catalogs | You have direct URLs to pages |
| Researching companies in directories | Content is on static pages |
| Locating articles or documents | You need all items, not filtered results |
| Filtering data by criteria | The site has no search function |
| Accessing data that's only findable via search | You're monitoring specific pages |
Training your search robot
Step 1: Start with your search
Start training your robot and navigate to the website with the search function.
Locate the search box (usually in the header or prominently displayed).
Click into the search field to focus it
Type your search term (e.g., "laptop"). This creates an input parameter automatically.
Type naturally; don't paste.
You can train a robot to use multiple search fields if needed.
Press Enter or click Search to submit
💡 Your robot will be trained to follow the same steps you take, so fill out the search naturally, as if you were searching for this information yourself.
Step 2: Wait for results to load
Let the results fully load before proceeding. Watch for:
Loading spinners to disappear
Result count to appear ("Showing 1-50 of 235 results")
All result elements to render
Any filters or sorting options to load
Step 3: Extract your data
Before extracting, understand what you're working with so you can figure out how to structure and extract the data you need.
| Result type | When to use | Extraction method | What to extract |
| Multiple results | Lists of search results | | • Title/Name |
| Single result | One featured result | | • Result count |
| Mixed results | Combination of lists | Both | • List: Main results |
💡 You can also train a second robot to extract details from a detail page. Using workflows, you can connect these robots together to create a database of all search results and their detail pages.
Search term ➡️ Search results ➡️ Detail page
Common search result patterns
Standard list results
Pattern: Google-style search results
Search: "data extraction tools"
Results:
1. Title | Description | URL
2. Title | Description | URL
3. Title | Description | URL
[Pagination: 1 2 3 4 ... Next]
Extraction approach:
Capture List on all result items
Extract title, description, URL
Configure pagination
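When you configure pagination, it helps to know roughly how many pages a search should produce. The sketch below reads a result count line such as "Showing 1-50 of 235 results" and estimates the page count. The helper name and the exact wording of the count line are assumptions for illustration; real sites phrase this differently, and nothing here is part of the robot itself.

```python
import math
import re

# Assumed format: "Showing <first>-<last> of <total> results".
RESULT_COUNT = re.compile(
    r"Showing\s+(\d+)\s*-\s*(\d+)\s+of\s+([\d,]+)\s+results",
    re.IGNORECASE,
)


def estimate_pages(result_count_text: str) -> int:
    """Return how many result pages the count line implies (hypothetical helper)."""
    match = RESULT_COUNT.search(result_count_text)
    if match is None:
        raise ValueError(f"Unrecognized result count: {result_count_text!r}")
    first, last, total = (int(group.replace(",", "")) for group in match.groups())
    page_size = last - first + 1  # items shown on the current page
    return math.ceil(total / page_size)


print(estimate_pages("Showing 1-50 of 235 results"))  # 5
```

If the number of pages your robot walks through is far below this estimate, the pagination setting is worth rechecking.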
Grid/Card results
Pattern: E-commerce product search
Search: "wireless headphones"
Results:
[Product Card] [Product Card] [Product Card]
[Product Card] [Product Card] [Product Card]
[Load More button]
Extraction approach:
Capture List on product cards
Extract image, name, price, rating
Use "Load More" pagination
Table results
Pattern: Business directory search
Search: "restaurants Austin"
Results:
| Name | Address | Phone | Rating | Website |
| Name | Address | Phone | Rating | Website |
Extraction approach:
Capture List on table rows
Extract all columns
Often has traditional pagination
Mixed content results
Pattern: Knowledge base or documentation
Search: "installation guide"
Results:
- Articles (title, excerpt, date)
- Videos (thumbnail, title, duration)
- PDFs (filename, size, download link)
Extraction approach:
Focus on one content type, or
Extract common elements across all types
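If you take the second approach, you may want to bring the different content types into one shape after extraction. Here is a minimal sketch, assuming each extracted row is a dictionary with a hypothetical "type" field and the field names shown (title, filename, url, download_link); your own column names will differ.

```python
from typing import Dict


def to_common_fields(item: Dict[str, str]) -> Dict[str, str]:
    """Reduce an article, video, or PDF row to the fields they all share.

    Field names are illustrative assumptions, not names produced by a robot.
    """
    # PDFs have a filename rather than a title in the example above.
    title = item.get("title") or item.get("filename") or ""
    # PDFs expose a download link; other types are assumed to carry a page URL.
    link = item.get("url") or item.get("download_link") or ""
    return {"type": item.get("type", "unknown"), "title": title, "link": link}


rows = [
    {"type": "article", "title": "Install on Linux", "url": "https://example.com/a"},
    {"type": "pdf", "filename": "guide.pdf", "download_link": "https://example.com/guide.pdf"},
]
common = [to_common_fields(row) for row in rows]
```

Type-specific fields such as duration or file size are dropped here on purpose; keep them in a separate robot if you need them.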
Handling special scenarios
Dynamic filters and facets
Some sites update results as you filter:
Perform your search first
Apply filters (price range, category, date)
Wait for results to update
Extract the filtered results
Filters become part of your extraction process
Instant search (results as you type)
For sites with live search:
Type your complete search term
Wait for suggestions/results to stabilize
Press Enter for full results page
Search within search
Some sites allow refined searching:
Perform initial search ("electronics")
Use secondary search within results ("laptop")
Extract final filtered results
Scaling search data extraction
Single search → Multiple searches
Once your search robot is trained and approved, you can automatically scrape the results for up to 50,000 search terms at once.
Monitoring search results
Set up monitors to track how results change over time. Monitors can check for updates, alert you when something changes, keep your data up to date, and automatically build a historical database of results.
Popular use cases include:
Daily searches for competitive intelligence
Weekly searches for new content
Monthly searches for market research
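To make "how results change" concrete, the sketch below compares two snapshots of the same search and reports what was added, removed, or changed. Monitors handle this for you; the function and the "url" key are hypothetical and only illustrate the idea if you ever compare exported data yourself.

```python
from typing import Dict, List


def diff_results(previous: List[dict], current: List[dict], key: str = "url") -> Dict[str, List[dict]]:
    """Compare two snapshots of search results, matching rows by `key`."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}
    added = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    removed = [row for k, row in prev_by_key.items() if k not in curr_by_key]
    # A row counts as changed when the key matches but any other field differs.
    changed = [
        row for k, row in curr_by_key.items()
        if k in prev_by_key and row != prev_by_key[k]
    ]
    return {"added": added, "removed": removed, "changed": changed}


monday = [{"url": "/p/1", "price": "99"}, {"url": "/p/2", "price": "49"}]
tuesday = [{"url": "/p/1", "price": "89"}, {"url": "/p/3", "price": "19"}]
changes = diff_results(monday, tuesday)
```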
Combining with workflows
Use workflows to extract search results together with the content of each result's detail page.
Robot A: Search and get result URLs
Robot B: Extract detailed data from each URL
Complete dataset from search to details
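The combined dataset can be pictured as a join on the result URL. Below is a minimal sketch, assuming both robots' outputs are lists of dictionaries that share a hypothetical "url" column; workflows connect the robots for you, so this only shows what the final rows look like.

```python
from typing import List


def merge_search_and_details(search_rows: List[dict], detail_rows: List[dict], key: str = "url") -> List[dict]:
    """Attach Robot B's detail fields to Robot A's search rows, matched by URL."""
    details_by_url = {row[key]: row for row in detail_rows}
    merged = []
    for row in search_rows:
        # Search rows with no matching detail page are kept unchanged.
        detail = details_by_url.get(row[key], {})
        # On a column name clash, the detail page value wins.
        merged.append({**row, **detail})
    return merged


robot_a = [{"url": "/p/1", "title": "Laptop A"}, {"url": "/p/2", "title": "Laptop B"}]
robot_b = [{"url": "/p/1", "specs": "16 GB RAM"}]
dataset = merge_search_and_details(robot_a, robot_b)
```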
Advanced techniques
Multi-parameter searches
Train with multiple search inputs:
Keywords: [laptop]
Location: [New York]
All text inputs become input parameters.
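When a robot has more than one input parameter, a bulk run needs one row per combination you want to search. The sketch below builds every keyword and location pair; the column names "keywords" and "location" mirror the example above but are assumptions about how your own parameters are named.

```python
from itertools import product
from typing import Dict, List


def build_input_rows(keywords: List[str], locations: List[str]) -> List[Dict[str, str]]:
    """Return one input row per keyword/location combination."""
    return [
        {"keywords": keyword, "location": location}
        for keyword, location in product(keywords, locations)
    ]


rows = build_input_rows(["laptop", "tablet"], ["New York", "Austin"])
print(len(rows))  # 4
```

The number of rows grows multiplicatively, so check the total before starting a large run.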
Search result variations
Handle different result types:
Sponsored/Ad results → Usually skip
Featured snippets → Capture separately
Regular results → Main extraction
Related searches → Optional capture
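If your extracted rows carry some marker of their result type, the handling above can be applied after extraction. This is a sketch under that assumption: the "kind" field and its values (sponsored, featured, regular, related) are hypothetical labels, not something a robot adds on its own.

```python
from typing import Dict, List


def split_results(rows: List[dict]) -> Dict[str, List[dict]]:
    """Drop sponsored rows and group the rest by result type."""
    buckets: Dict[str, List[dict]] = {"featured": [], "regular": [], "related": []}
    for row in rows:
        kind = row.get("kind", "regular")  # unlabeled rows count as regular
        if kind == "sponsored":
            continue  # usually skipped
        buckets.setdefault(kind, []).append(row)
    return buckets


extracted = [
    {"kind": "sponsored", "title": "Ad"},
    {"kind": "featured", "title": "Quick answer"},
    {"title": "Result 1"},
    {"kind": "related", "title": "Similar search"},
]
grouped = split_results(extracted)
```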
Save search URLs
Some sites create shareable search URLs. Instead of training a robot to fill out a search form, you can train the robot to extract data directly from this results page.
Example:
site.com/search?q=laptop&sort=price&filter=new
Once you train a robot to do this, you can also generate the Origin URL for up to 50,000 searches and automatically extract the results using Bulk Runs.
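Generating those URLs is mostly string building. Here is a minimal sketch using Python's standard urllib.parse.urlencode, assuming the site uses the q, sort, and filter query parameters from the example above and is served over https; the function name is hypothetical, and the 50,000 cap simply mirrors the limit mentioned in this guide.

```python
from typing import List
from urllib.parse import urlencode

MAX_SEARCHES = 50_000  # limit stated in this guide


def build_search_urls(base_url: str, terms: List[str], **fixed_params: str) -> List[str]:
    """Return one search URL per term, URL-encoding the query string."""
    if len(terms) > MAX_SEARCHES:
        raise ValueError(f"Too many search terms: {len(terms)} > {MAX_SEARCHES}")
    return [
        f"{base_url}?{urlencode({'q': term, **fixed_params})}"
        for term in terms
    ]


urls = build_search_urls(
    "https://site.com/search",
    ["laptop", "gaming mouse"],
    sort="price",
    filter="new",
)
print(urls[0])  # https://site.com/search?q=laptop&sort=price&filter=new
```

Spaces and special characters in a term are encoded for you, so "gaming mouse" becomes q=gaming+mouse.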
