
How to extract data 'From a list'

Learn how to extract data 'From a list'. This option is typically used to scrape data that repeats on a page in a pattern, e.g., a list of search results.

Written by Melissa Shires
Updated this week

'From a list' is best for extracting repeating information like product listings or search results. When you train your robot to extract data from a list, it automatically structures the data into a table and triggers the pagination options.

Note that when training a robot on a web page, a single robot can extract data "From a list" and as "Just text", and can also take a screenshot.

Common examples include:

  • Product category pages with multiple items in a grid

  • Search results pages showing multiple listings

  • Directory pages listing multiple businesses

Scraping data 'From a list' is great for extracting and structuring listed data from a page including:

  • Product names

  • Prices

  • Image URLs

  • Star ratings

  • Short descriptions

  • Product URLs
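To make the idea of "structuring a list into a table" concrete, here is a minimal Python sketch of the general technique: each repeating element becomes one row, and each piece of data inside it becomes a column. It is only an illustration, not how the robot works internally. The sample markup, the class names (`product`, `name`, `price`) and the `ListExtractor` helper are all assumptions made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical markup: each <li class="product"> is one repeating list item.
SAMPLE_HTML = """
<ul>
  <li class="product"><a class="name" href="/p/1">Desk Lamp</a><span class="price">$19.99</span></li>
  <li class="product"><a class="name" href="/p/2">Floor Lamp</a><span class="price">$49.00</span></li>
</ul>
"""


class ListExtractor(HTMLParser):
    """Collects one dict (row) per repeating <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # row being built, or None when outside an item
        self._field = None  # column that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if tag == "li" and css_class == "product":
            self._row = {}
        elif self._row is not None and css_class in ("name", "price"):
            self._field = css_class
            if tag == "a":
                # The link target becomes its own column.
                self._row["url"] = attrs.get("href")

    def handle_data(self, data):
        if self._row is not None and self._field:
            self._row[self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        if tag == "li" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    parser = ListExtractor()
    parser.feed(html)
    return parser.rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row["name"], row["price"], row["url"])
```

Each dictionary in `rows` corresponds to one row of the resulting table, with `name`, `price` and `url` as the columns.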

How to scrape data from a page 'From a list'

  1. Click on Capture Text, and select From a list.

  2. Hover over the list of items on the page until you see a dotted outline around the elements you want to capture.

  3. Click to select the list when the outline matches your desired data set.

  4. Robot Studio will automatically structure that data into a recommended dataset (you can customize this if needed, see below).

  5. Give your list a descriptive name.

  6. Select the number of items you'd like the robot to capture.

  7. Configure the pagination settings to capture additional list items. The options are:

    1. Click through 'next' buttons.

    2. Click "load more items".

    3. Infinite scroll (i.e., scroll up or down to load additional items).

    4. No more items to load.

  8. Click 'Save Captured List'.

  9. Click 'Finish' to finish recording your robot if you've captured all of the data you need, or keep capturing text or screenshots.

  10. Name your robot to run it, then review the data and approve it.
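The interaction between the item limit (step 6) and 'next' button pagination (step 7) can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not the robot's actual implementation: `collect_items`, `fake_fetch` and the in-memory `FAKE_SITE` are hypothetical names invented for this example.

```python
def collect_items(fetch_page, max_items, first_page=1):
    """Follow 'next' links until max_items are collected or pages run out."""
    items = []
    page_id = first_page
    while page_id is not None and len(items) < max_items:
        page_items, page_id = fetch_page(page_id)
        # Take only as many items as are still needed to reach the limit.
        items.extend(page_items[: max_items - len(items)])
    return items


# Hypothetical stand-in for a paginated site: page id -> (items, next page id).
# A next page id of None plays the role of "no more items to load".
FAKE_SITE = {
    1: (["item-1", "item-2", "item-3"], 2),
    2: (["item-4", "item-5", "item-6"], 3),
    3: (["item-7"], None),
}


def fake_fetch(page_id):
    return FAKE_SITE[page_id]


first_five = collect_items(fake_fetch, 5)
everything = collect_items(fake_fetch, 100)
print(first_five)
print(len(everything))
```

With a limit of 5 the loop stops partway through the second page. With a limit larger than the list, it stops when there is no next page.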

[arcade id="PiYeOnjDEaRTDTGGEzid" title="How to extract data 'as a list'" padding-bottom="0"]

How to customize what and how the robot structures the list data

If you're not happy with how the robot automatically structured the list data, you can customize it.

  1. From the extracted list, click 'Select Manually Instead'.

  2. Click 'Cancel Edits'.

  3. Hover over each item you'd like to extract, and click to select them.

  4. When finished, click Confirm.

  5. Label each data point (press Enter after each one to move to the next).

  6. Give your list a descriptive name.

  7. Select the number of items you'd like the robot to capture.

  8. Configure the pagination settings to capture additional list items. The options are:

    1. Click through 'next' buttons.

    2. Click "load more items".

    3. Infinite scroll (i.e., scroll up or down to load additional items).

    4. No more items to load.

  9. Click 'Save Captured List'.

  10. Click 'Finish' to finish recording your robot if you've captured all of the data you need, or keep capturing text or screenshots.

  11. Name your robot to run it, then review the data and approve it.
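As a rough illustration of step 5, the labels you give each data point act like column headers in the resulting table. The Python sketch below shows that relationship by writing labeled rows as CSV. The labels, the sample values and the `rows_to_csv` helper are hypothetical and chosen only for this example; they do not describe the product's export format.

```python
import csv
import io


def rows_to_csv(labels, raw_rows):
    """Write a header row of labels followed by one line per captured item."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(labels)
    for values in raw_rows:
        writer.writerow(values)
    return buffer.getvalue()


# Hypothetical labels typed in for each selected data point.
labels = ["Product name", "Price", "Product URL"]

# Hypothetical values captured for two list items, in the same order.
raw_rows = [
    ["Desk Lamp", "$19.99", "/p/1"],
    ["Floor Lamp", "$49.00", "/p/2"],
]

table = rows_to_csv(labels, raw_rows)
print(table)
```

Clear, distinct labels make the table easier to read and to use in downstream tools.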

[arcade id="xSu63FmvvmdiXmjKuzgO" title="Extracting data 'From a list' - customizing the list output" padding-bottom="0%"]
