The Pagination selector is used to navigate through all pagination pages or to
load all items with the Load more button. The Pagination selector is always
recursive, so all pagination pages are discovered. To extract data from
pagination pages, data extraction selectors have to be set as child selectors
of the Pagination selector.
For example, an e-commerce site has multiple categories. Each category has a
list of items and pagination links. Some pages are not directly available
from the category but are available from pagination pages (pagination links 1-5
are visible but 6-8 are not). You can start by building a sitemap that
visits each category and extracts items from the category page. This sitemap
will extract items only from the first pagination page. To extract items from
all pagination links, including the ones that are not visible at the beginning,
you need to create a Pagination selector that selects the pagination links.
Figure 1 shows how the pagination selector should be created in the sitemap.
When the scraper opens a category link, it will extract items that are available
on the page. After that, it will find the pagination links and also extract
data from those. Figure 2 shows a selector graph where you can see how
pagination links discover more pagination links and more data.
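
The recursive discovery described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how the scraper is implemented: the eight-page category, the `visible_links` window, and the `items_on` and `crawl` helpers are all hypothetical names made up for this example. It shows why following pagination links from every visited page eventually reaches pages 6-8, even though only links 1-5 are visible on the first page.

```python
from collections import deque

LAST_PAGE = 8  # hypothetical category with 8 pagination pages


def visible_links(page, window=4):
    """Pagination links shown on a page: a window around the current page.

    On page 1 this yields links 1-5, so pages 6-8 are hidden at first.
    """
    first = max(1, page - window)
    last = min(LAST_PAGE, page + window)
    return list(range(first, last + 1))


def items_on(page):
    """Hypothetical items listed on one pagination page."""
    return [f"item-{page}-{i}" for i in (1, 2)]


def crawl(start_page):
    """Visit every page reachable through pagination links, extracting items.

    Each visited page is searched for pagination links again, which is what
    makes the discovery recursive.
    """
    visited = set()
    queue = deque([start_page])
    items = []
    while queue:
        page = queue.popleft()
        if page in visited:
            continue
        visited.add(page)
        items.extend(items_on(page))  # child selectors extract the data
        for link in visible_links(page):
            if link not in visited:
                queue.append(link)  # newly discovered pagination page
    return sorted(visited), items


if __name__ == "__main__":
    pages, items = crawl(1)
    print(pages)
    print(len(items), "items extracted")
```

Extracting items inside the loop mirrors the rule that data extraction selectors are children of the Pagination selector: data is collected on each page the pagination step reaches, not only on the first one.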