How do I handle pagination and scrolling?
When scraping large amounts of data from websites, you'll often encounter pagination, where the data is spread across multiple pages. Browse AI makes handling pagination a breeze, allowing you to easily extract data from extensive lists and tables.
Understanding Pagination 📃
Pagination is a common technique used by websites to display large amounts of data in smaller, more manageable chunks. It typically involves dividing the data into multiple pages and providing navigation elements (like "Next Page" buttons or numbered links) to access those pages.
Browse AI supports several pagination methods to accommodate different website designs:
- Click on "next" to navigate to the next page: This method involves clicking a button or link that clearly indicates the next page, such as a "Next" button or an arrow pointing to the right.
- Click on "load more" to load more items: This method involves clicking a button that loads more items onto the current page without navigating to a separate page.
- Scroll down to load more items: This method involves scrolling down the page to trigger the loading of more items. This is common on websites with infinite scrolling.
- Scroll up to load more items: Similar to scrolling down, this method involves scrolling up to load more items. This is less common but can be found on some websites.
- No more items to load: This option indicates that there are no more items to load on the current page or in the entire list.
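To make the idea concrete, here is a minimal Python sketch of the "next page" pattern described above. It is only an illustration under stated assumptions: the `PAGES` dictionary is a made-up, in-memory stand-in for a website, and `extract_with_next_button` is a hypothetical helper, not part of Browse AI. It shows the two ways extraction ends: the row limit is reached, or there are no more items to load.

```python
from typing import List, Optional

# Hypothetical in-memory "website": each page holds some items and
# optionally points at the next page. This is NOT Browse AI's implementation.
PAGES = {
    1: {"items": ["A", "B", "C"], "next": 2},
    2: {"items": ["D", "E", "F"], "next": 3},
    3: {"items": ["G"], "next": None},  # last page: no "Next" button
}


def extract_with_next_button(start_page: int, max_rows: int) -> List[str]:
    """Collect items page by page until max_rows is reached or pages run out."""
    rows: List[str] = []
    page: Optional[int] = start_page
    while page is not None and len(rows) < max_rows:
        data = PAGES[page]
        rows.extend(data["items"])
        page = data["next"]  # "click Next"; None means nothing left to load
    return rows[:max_rows]


print(extract_with_next_button(1, max_rows=5))   # ['A', 'B', 'C', 'D', 'E']
print(extract_with_next_button(1, max_rows=50))  # all seven items, A through G
```

"Load more" and scroll-based pagination follow the same loop in spirit; the difference is that new items are appended to the current page instead of appearing on a separate one.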
Browse AI's Pagination Handling
Browse AI offers intuitive ways to handle pagination, whether you're using the Chrome Extension or Robot Studio:
Using the Chrome Extension
- When building your robot, use the "Capture List" action to select the list of items you want to extract. (See: Capture List vs. Capture Text)
- Once you have labelled your last data point, a summary of the list appears for you to review.
- Name your list, and configure how many rows of data you want to extract.
- Choose the method that matches how the website displays its pagination (e.g., Scroll down, Next page button, Load more button, etc.).
- If you choose the "Next page" or "Load more" pagination method, carefully select the correct button or link that advances to the next page.
- Make sure to choose the element or arrow that dynamically moves to the next page, not just a static link to page 2.
- Once you've configured the pagination, click the "Capture List" button on the lower right of the preview window to complete the action.
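The advice above about selecting the element that dynamically moves to the next page, rather than a static link to page 2, can be illustrated with a small simulation. This is a sketch under assumptions: the four-page site and the helper names (`click_static_page_2_link`, `click_dynamic_next`, `pages_visited`) are hypothetical and are not Browse AI functions.

```python
from typing import Callable, List

# Hypothetical simulation; the page count and helper names are made up.
LAST_PAGE = 4


def click_static_page_2_link(current_page: int) -> int:
    """A fixed 'page 2' link always lands on page 2, wherever you are."""
    return 2


def click_dynamic_next(current_page: int) -> int:
    """A 'Next' arrow advances relative to the current page (stays put on the last page)."""
    return min(current_page + 1, LAST_PAGE)


def pages_visited(click: Callable[[int], int], max_clicks: int = 5) -> List[int]:
    """Start on page 1 and keep clicking until the page stops changing."""
    visited = [1]
    for _ in range(max_clicks):
        next_page = click(visited[-1])
        if next_page == visited[-1]:  # no progress, so stop
            break
        visited.append(next_page)
    return visited


print(pages_visited(click_static_page_2_link))  # [1, 2]
print(pages_visited(click_dynamic_next))        # [1, 2, 3, 4]
```

With the static link, the simulated run never gets past page 2, while the dynamic "Next" element reaches every page.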
Using Robot Studio
Handling pagination in Robot Studio is similar to the Chrome Extension.
- You can opt to manually capture your list items, or let Robot Studio automatically detect the data points on the page for you. (See: How Do Recommended Datasets Work in Robot Studio?)
- Select the pagination method:
- ✅ In many cases, Robot Studio will automatically detect the correct pagination method for you. Simply review the suggested method and confirm it if it's accurate.
- ❎ If Robot Studio doesn't automatically detect the pagination, or if the suggested method is incorrect, you'll need to manually select the appropriate method from the options provided.
You can also check out our YouTube video on how to handle pagination in Robot Studio.
If you're unsure which pagination method to choose for a specific website, or if you encounter a website with a pagination method not listed here, please don't hesitate to contact our support team. We're happy to help you troubleshoot any issues and ensure your robots are extracting data correctly.
🤖💪