Web Scraper 0.6.0 - Pagination Selector and More!

Feature, Web Scraper Cloud, Release, Tutorial, Pagination

Web-Scraper-0.6.0-release-blog-image

The moment has come: we are ready to release our most requested feature of all time! Creating pagination correctly has been one of the most frequently reported problems, but that is about to change. The pagination selector is here, along with other updates that make scraping with Web Scraper easier.


What's new in this release:

  • Pagination selector;
  • Automated element attribute selector;
  • Element count in data preview;
  • Additional data preview;
  • Cloud synchronization;
  • Sitemap search bar.

Pagination selector.

The Pagination selector is here for good. After a long period of trial and error, there is no longer any need to puzzle over which selector to use when creating pagination for a sitemap. From now on there is exactly one selector to use for pagination: the pagination selector.

Here is how it works:

When creating a selector, there is now a dedicated selector type called “pagination”.

pagination-selector-amazon-example-new-release-blog

Once you have chosen that type, pagination works just by selecting the pagination buttons on the page.

pagination-selector-buttons-beneath-amazon-example-new-release-blog

Important notes:

  • Previously, the pagination selector also had to be designated as its own child selector. Now this happens automatically.
  • Because of that, it is crucial to create the pagination selector first when you begin a sitemap, and to create all other selectors under that pagination selector. The selector tree should look like this:

pagination-selector-tree-example-new-release-blog

Also note that the pagination type is detected automatically; however, for individual cases and extra flexibility, it can be set manually from the “Pagination Type” drop-down menu.

pagination-beta-drop-down-selector-menu-new-release-blog
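
To make the selector tree described above more concrete, here is a minimal Python sketch of what such a sitemap could look like as data, plus a small check that every other selector sits under the pagination selector. The field names (`type`, `parentSelectors`, `paginationType`) and the type name `SelectorPagination` are assumptions modeled on the general shape of an exported sitemap, and the URL, ids, and CSS selectors are hypothetical placeholders. Treat it as an illustration, not as the extension's exact format.

```python
# Hypothetical sitemap; the URL, ids, and CSS selectors are placeholders,
# and the field names are assumptions, not a confirmed export format.
sitemap = {
    "_id": "example-shop",
    "startUrl": ["https://example.com/products"],
    "selectors": [
        {
            # Created first. Per the post, the self-parent link is now
            # added automatically, so only "_root" is listed here.
            "id": "pagination",
            "type": "SelectorPagination",
            "parentSelectors": ["_root"],
            "selector": "ul.pagination a",
            "paginationType": "auto",
        },
        {
            # Created under the pagination selector, as recommended.
            "id": "product-title",
            "type": "SelectorText",
            "parentSelectors": ["pagination"],
            "selector": "h2.title",
            "multiple": True,
        },
        {
            # Misplaced on purpose: attached to the root, not to pagination.
            "id": "product-price",
            "type": "SelectorText",
            "parentSelectors": ["_root"],
            "selector": "span.price",
            "multiple": True,
        },
    ],
}


def selectors_outside_pagination(sitemap: dict) -> list[str]:
    """Return ids of selectors that are not children of a pagination selector."""
    pagination_ids = {
        s["id"] for s in sitemap["selectors"] if s["type"] == "SelectorPagination"
    }
    return [
        s["id"]
        for s in sitemap["selectors"]
        if s["id"] not in pagination_ids
        and not pagination_ids.intersection(s["parentSelectors"])
    ]


print(selectors_outside_pagination(sitemap))  # ['product-price']
```

A selector reported by this check would presumably only run on the first page, which is the kind of mistake that creating the pagination selector first is meant to prevent.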

Automated attribute selector.

Previously, to select an element attribute correctly, you had to inspect the HTML of the page and find the attribute you were after. Not everyone knows what to look for and copy, so the element attribute selector now automatically lists all attributes available on the selected element. All you have to do is pick the one you need.

The process of selecting the elements is the same as before.

previous-process-element-selection-automatized-attribute-selector

Before, you had to manually type the name of the attribute you wanted to extract.

With the new update, once the elements have been selected, a drop-down menu displays all the attributes available on the element. Click the one you need and you are done. If you are not sure which attribute to select, use the ‘Data preview’ to check whether the correct information is being returned.

automatized-element-attribute-selector-example-new-release-blog
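
As a rough illustration of what "deriving all possible attributes from the selected element" means, the following Python sketch lists every attribute of an HTML element using only the standard library. It is not the extension's implementation; the HTML snippet and the helper names `FirstElementAttributes` and `list_attributes` are made up for this example.

```python
from html.parser import HTMLParser


class FirstElementAttributes(HTMLParser):
    """Collect the attributes of the first start tag in an HTML snippet."""

    def __init__(self) -> None:
        super().__init__()
        self.attributes = {}
        self._seen = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; keep only the first element's.
        if not self._seen:
            self.attributes = dict(attrs)
            self._seen = True


def list_attributes(element_html: str) -> dict:
    """Return a name -> value mapping of the element's attributes."""
    parser = FirstElementAttributes()
    parser.feed(element_html)
    parser.close()
    return parser.attributes


# Hypothetical element, similar to what you might select on a product page.
snippet = '<a href="/item/42" title="Blue mug" data-price="9.99">Blue mug</a>'
for name, value in list_attributes(snippet).items():
    print(f"{name}: {value}")
```

For this snippet the loop prints `href`, `title`, and `data-price` with their values, which corresponds to the kind of list the new drop-down menu offers to choose from.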

Data preview.

The 'Data preview' function has also received some 'nice-to-have' changes. Once you select the necessary elements and click on the element preview, you can see how many elements have been selected on the left side, above the developer tools. This helps you quality-check your scraping jobs. For example, if you are scraping a large website and know that the selector should match a lot of elements, but the preview shows only a few, that is a sign the selector may be faulty.

new-data-preview-element-count-feature-new-release-blog

Additional data preview.

Another feature that makes it easy to check whether your selectors are accurate and working is the selector data preview button. It is a new button in the upper right corner of the developer tools.

additional-data-preview-button-new-release-blog

The 'Data preview' button lets you quickly check whether your selectors work correctly across multiple items while you build a sitemap. For example, suppose you have created several selectors to extract data from an item page and need to make sure they all behave the same on every item. Instead of previewing each selector individually, which can be time-consuming when the selector list is long, you can check them all with a single click. Navigate to each product, click the 'Data preview' button, and quickly spot selectors that work for some items but return no data for others.

under-link-selectors-additional-data-preview-feature-new-release-blog

In this situation, clicking the data preview button displays a table of the data selected by all selectors under the specific link, like this:

data-preview-under-link-data-additional-preview-feature-new-release-selector-blog
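
The same kind of check can be pictured in code. The Python sketch below takes preview rows, one per item, and reports which selectors returned no data and for which rows. The row layout, the sample values, and the function name `selectors_missing_data` are assumptions made for illustration; the extension shows this information visually rather than through any such function.

```python
def selectors_missing_data(rows: list[dict]) -> dict[str, list[int]]:
    """Map each selector id to the indexes of rows where it returned no data."""
    missing: dict[str, list[int]] = {}
    for index, row in enumerate(rows):
        for selector_id, value in row.items():
            # Treat None and blank strings as "no data returned".
            if value is None or str(value).strip() == "":
                missing.setdefault(selector_id, []).append(index)
    return missing


# Hypothetical preview data for three product pages.
preview_rows = [
    {"title": "Blue mug", "price": "9.99", "rating": "4.5"},
    {"title": "Red mug", "price": "", "rating": "4.1"},
    {"title": "Green mug", "price": "7.50", "rating": None},
]

print(selectors_missing_data(preview_rows))  # {'price': [1], 'rating': [2]}
```

Here `price` is empty for the second item and `rating` for the third, which would suggest that those two selectors need adjusting for some page layouts.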

Cloud sync.

Another big addition in this release is Cloud synchronization. You can now sync your sitemaps with your Cloud account. With a single click on “Cloud Sync” in the upper right corner of the developer tools, the sitemaps are synced in a few seconds.

The days of copying and pasting sitemaps between the extension and the Cloud to update existing sitemaps are over! Now you can make your edits locally and sync the edited sitemap with the Cloud.

One way to connect the extension to the Cloud is to click the “Log in to Cloud” button in the upper right corner of the developer tools. Another way is to log in to your Web Scraper Cloud account and open the extension. The website should then display a connection section, where you can press “Connect to cloud with extension”.

connect-cloud-with-extension-feature-button-new-release-blog

You can disconnect the two at any time, either from the extension or from the Cloud.

reconnect-cloud-with-extension-button-feature-new-release-blog

Once the extension and the Cloud are synchronized, the developer tools let you:

  • Delete sitemaps locally,
  • Download sitemaps from the Cloud,
  • Upload sitemaps to the Cloud,
  • Overwrite a sitemap that has been synced and then edited.

download-overwrite-delete-locally-feature-buttons-new-release-blog

Sitemap search bar.

Last but not least, to make the extension more convenient to use, we have added a sitemap search bar. No more endless scrolling through your sitemap list to find the one you are looking for. Just start typing the sitemap's name in the search bar.

sitemap-search-bar-feature-example-new-release-blog

That is all for now. Update the extension and try out the new updates for yourself!

Happy scraping!


