How to extract data based on a search query

Search-based extraction lets you find and extract specific data by using a website's search functionality.

Written by Melissa Shires
Updated today

This guide focuses on training robots to search, handle results, and extract exactly what you need from search result pages.

Understanding search-based extraction

Search-based extraction involves three key steps:

  1. Entering a search query into a website's search box

  2. Retrieving the results that match your query

  3. Extracting the data from those search results

This differs from browsing directly to a page: you use the site's own search to filter and find specific information.

When to use search extraction

Use search extraction for:

  • Finding specific products in large catalogs

  • Researching companies in directories

  • Locating articles or documents

  • Filtering data by criteria

  • Accessing data that's only findable via search

Use direct navigation when:

  • You have direct URLs to pages

  • Content is on static pages

  • You need all items, not filtered results

  • The site has no search function

  • You're monitoring specific pages

Training your search robot

Step 1: Start with your search

  1. Start training your robot and navigate to the website with the search function.

  2. Locate the search box (usually in the header or otherwise prominently displayed).

  3. Click into the search field to focus it.

  4. Type your search term (e.g., "laptop").

    • Type naturally; don't paste.

    • You can train a robot to use multiple search fields if needed.

    • This creates an input parameter automatically.

  5. Press Enter or click Search to submit

💡 Your robot will be trained to follow the same steps you take, so fill out the search naturally, as if you were searching for this information yourself.

Step 2: Wait for results to load

Let the results fully load before proceeding. Watch for:

  • Loading spinners to disappear

  • Result count to appear ("Showing 1-50 of 235 results")

  • All result elements to render

  • Any filters or sorting options to load
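The result-count text can also be put to work once it has been captured. The sketch below is not part of the product; it assumes the count appears in the "Showing 1-50 of 235 results" form shown above and that the text was taken from the first results page. The function name `parse_result_count` is hypothetical.

```python
import math
import re

# Matches "<first>-<last> of <total>", allowing thousands separators.
COUNT_PATTERN = re.compile(r"(\d[\d,]*)\s*-\s*(\d[\d,]*)\s+of\s+(\d[\d,]*)")


def parse_result_count(text):
    """Return first/last/total and an estimated page count, or None if no match."""
    match = COUNT_PATTERN.search(text)
    if match is None:
        return None
    first, last, total = (int(group.replace(",", "")) for group in match.groups())
    per_page = last - first + 1
    if per_page <= 0:
        return None
    # Assumes the text comes from a full first page, so per_page is the page size.
    pages = math.ceil(total / per_page)
    return {"first": first, "last": last, "total": total, "pages": pages}


summary = parse_result_count("Showing 1-50 of 235 results")
print(summary)
```

An estimate like this can help you check that pagination captured as many pages as the site reports.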

Step 3: Extract your data

Before extracting, understand what you're working with so you can figure out how to structure and extract the data you need.

Result type: Multiple results

  • When to use: Lists of search results (most common)

  • Extraction method: Capture List

  • What to extract: Title/Name, Description, Price, Link to detail page, Ratings, Any visible data

Result type: Single result

  • When to use: One featured result or summary data

  • Extraction method: Capture Text

  • What to extract: Result count, Best match, Featured snippet, Summary stats

Result type: Mixed results

  • When to use: Combination of lists and single elements

  • Extraction method: Both

  • What to extract: Main results (List), Result count (Text), Filters applied (Text)

💡 You can also train a second robot to extract details from a detail page. Using workflows, you can connect these robots to create a database of all search results and their detail pages.

Search term ➡️ Search results ➡️ Detail page

Common search result patterns

Standard list results

Pattern: Google-style search results

Search: "data extraction tools"
Results:
1. Title | Description | URL
2. Title | Description | URL
3. Title | Description | URL
[Pagination: 1 2 3 4 ... Next]

Extraction approach:

  • Capture List on all result items

  • Extract title, description, URL

  • Configure pagination

Grid/Card results

Pattern: E-commerce product search

Search: "wireless headphones"
Results:
[Product Card] [Product Card] [Product Card]
[Product Card] [Product Card] [Product Card]
[Load More button]

Extraction approach:

  • Capture List on product cards

  • Extract image, name, price, rating

  • Use "Load More" pagination

Table results

Pattern: Business directory search

Search: "restaurants Austin"
Results:
| Name | Address | Phone | Rating | Website |
| Name | Address | Phone | Rating | Website |

Extraction approach:

  • Capture List on table rows

  • Extract all columns

  • Often has traditional pagination

Mixed content results

Pattern: Knowledge base or documentation

Search: "installation guide"
Results:
- Articles (title, excerpt, date)
- Videos (thumbnail, title, duration)
- PDFs (filename, size, download link)

Extraction approach:

  • Focus on one content type, or

  • Extract common elements across all types
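If you take the second approach, the extracted rows may need to be brought into one shape afterwards. The following is only a sketch of that idea: the field names (`title`, `filename`) mirror the example pattern above, while `TITLE_FIELD` and `to_common_record` are hypothetical names used for illustration.

```python
# Hypothetical mapping: which field holds the "title" for each content type.
TITLE_FIELD = {"article": "title", "video": "title", "pdf": "filename"}


def to_common_record(content_type, row):
    """Reduce a type-specific row to a shared shape, keeping the rest as extras."""
    title_key = TITLE_FIELD[content_type]
    extras = {key: value for key, value in row.items() if key != title_key}
    return {"type": content_type, "title": row[title_key], "extras": extras}


record = to_common_record("pdf", {"filename": "guide.pdf", "size": "2 MB"})
print(record)
```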

Handling special scenarios

Dynamic filters and facets

Some sites update results as you filter:

  1. Perform your search first

  2. Apply filters (price range, category, date)

  3. Wait for results to update

  4. Extract the filtered results

  5. Filters become part of your extraction process

Instant search (results as you type)

For sites with live search:

  1. Type your complete search term

  2. Wait for suggestions/results to stabilize

  3. Press Enter for full results page

Search within search

Some sites allow refined searching:

  1. Perform initial search ("electronics")

  2. Use secondary search within results ("laptop")

  3. Extract final filtered results

Scaling search data extraction

Single search → Multiple searches

Once your search robot is trained and approved, you can automatically scrape the results for up to 50,000 search terms at once.
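Before submitting a long list of terms, it can help to clean it and keep each batch within the 50,000-term limit. This is a minimal sketch that runs outside the product; `prepare_batches` is a hypothetical helper, and treating terms that differ only in case or spacing as duplicates is an assumption you may not want.

```python
MAX_TERMS_PER_RUN = 50_000  # limit stated above


def prepare_batches(terms, batch_size=MAX_TERMS_PER_RUN):
    """Trim terms, drop blanks and case-insensitive duplicates, then split into batches."""
    seen = set()
    cleaned = []
    for term in terms:
        normalized = " ".join(term.split())  # collapse repeated whitespace
        key = normalized.casefold()
        if normalized and key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return [cleaned[i:i + batch_size] for i in range(0, len(cleaned), batch_size)]


batches = prepare_batches(["laptop", " Laptop ", "", "wireless  headphones"], batch_size=1)
print(batches)
```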

Monitoring search results

Set up monitors to track how results change. Monitors can check for updates, alert you when something changes, keep your data up to date, and automatically build a historical database of results.

Popular use cases include:

  • Daily searches for competitive intelligence

  • Weekly searches for new content

  • Monthly searches for market research
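To illustrate what tracking changes between two runs means in practice, here is a rough sketch that compares two exported result sets. It is not how monitors are implemented; it assumes each row is a dictionary with a unique `url` field, and `diff_snapshots` is a hypothetical name.

```python
def diff_snapshots(previous, current, key="url"):
    """Compare two lists of result rows and report added, removed, and changed rows."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}
    added = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    removed = [row for k, row in prev_by_key.items() if k not in curr_by_key]
    changed = [
        (prev_by_key[k], row)
        for k, row in curr_by_key.items()
        if k in prev_by_key and prev_by_key[k] != row
    ]
    return {"added": added, "removed": removed, "changed": changed}


yesterday = [{"url": "a", "price": "10"}, {"url": "b", "price": "20"}]
today = [{"url": "a", "price": "12"}, {"url": "c", "price": "5"}]
changes = diff_snapshots(yesterday, today)
print(changes)
```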

Combining with workflows

Use workflows to extract search results together with the content of each result's detail page.

  1. Robot A: Search and get result URLs

  2. Robot B: Extract detailed data from each URL

  3. Complete dataset from search to details
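The combined dataset in step 3 amounts to joining the two robots' outputs on the result URL. The sketch below shows that join on exported rows; the `url` field and the `merge_results` helper are assumptions for illustration, not the workflow feature itself.

```python
def merge_results(search_rows, detail_rows, key="url"):
    """Attach each detail row to the search row that shares its URL."""
    details_by_url = {row[key]: row for row in detail_rows}
    # Search rows with no matching detail row are kept unchanged.
    return [{**row, **details_by_url.get(row[key], {})} for row in search_rows]


robot_a = [{"url": "/p/1", "title": "Laptop A"}, {"url": "/p/2", "title": "Laptop B"}]
robot_b = [{"url": "/p/1", "specs": "16 GB RAM"}]
dataset = merge_results(robot_a, robot_b)
print(dataset)
```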

Advanced techniques

Multi-parameter searches

Train with multiple search inputs:

Keywords: [laptop] 
Location: [New York]

All text inputs become input parameters.
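When a robot takes several inputs, a list of input combinations is usually prepared ahead of a bulk run. As a sketch only, the code below builds every keyword and location pair and writes them as CSV text; the column names `keywords` and `location` are hypothetical and should match whatever your robot's input parameters are actually called.

```python
import csv
import io
import itertools


def build_input_rows(keywords, locations):
    """Return one row per keyword/location combination."""
    return [
        {"keywords": keyword, "location": location}
        for keyword, location in itertools.product(keywords, locations)
    ]


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["keywords", "location"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_input_rows(["laptop", "tablet"], ["New York"])
csv_text = to_csv(rows)
print(csv_text)
```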

Search result variations

Handle different result types:

  • Sponsored/Ad results → Usually skip

  • Featured snippets → Capture separately

  • Regular results → Main extraction

  • Related searches → Optional capture
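If sponsored items end up in the same list as regular results, they can be separated after extraction. This sketch assumes you also captured a badge or label text for each row (here a hypothetical `label` field) and that the site marks ads with "Sponsored" or "Ad"; both are assumptions to adjust per site.

```python
# Assumed badge texts for paid placements; adjust to what the site shows.
SKIP_LABELS = {"sponsored", "ad"}


def split_by_kind(rows, label_field="label"):
    """Return (regular_rows, sponsored_rows) based on each row's label text."""
    kept, skipped = [], []
    for row in rows:
        label = (row.get(label_field) or "").strip().casefold()
        (skipped if label in SKIP_LABELS else kept).append(row)
    return kept, skipped


results = [
    {"title": "Tool A", "label": "Sponsored"},
    {"title": "Tool B", "label": ""},
    {"title": "Tool C"},
]
regular, sponsored = split_by_kind(results)
print(regular, sponsored)
```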

Save search URLs

Some sites create shareable search URLs. Instead of training a robot to fill out a search form, you can train it to extract data directly from the results page.

Example:

site.com/search?q=laptop&sort=price&filter=new

Once you train a robot to do this, you can also generate Origin URLs for up to 50,000 searches and automatically extract the results using Bulk Runs.
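Generating those URLs is mostly a matter of encoding each search term into the query string. Here is a minimal sketch using Python's standard library; it assumes the search term goes in a `q` parameter as in the example above, and the `https://` scheme and the `build_search_urls` helper are added for illustration.

```python
from urllib.parse import urlencode


def build_search_urls(terms, base_url="https://site.com/search", extra_params=None):
    """Build one results-page URL per search term, URL-encoding each value."""
    extra = extra_params or {}
    return [f"{base_url}?{urlencode({'q': term, **extra})}" for term in terms]


urls = build_search_urls(
    ["laptop", "wireless headphones"],
    extra_params={"sort": "price", "filter": "new"},
)
for url in urls:
    print(url)
```

Encoding matters for terms with spaces or special characters; `urlencode` turns the space in "wireless headphones" into `+`, which is valid in a query string.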
