Web Scraper Cloud Data Quality and Notification Feature
April 30, 2021
Feature, Web Scraper Cloud, Release, Update
As part of our ongoing development and continued growth, we are happy to release a new notification system and a data quality feature that open up further automation options.
The new update consists of 2 main features:
- Data quality control;
- Notifications.
Both play an important role in keeping track of your Web Scraper Cloud account and the data returned within it.
Data Quality Control
To minimize the time spent manually adjusting each scraping job or programmatically verifying that the extracted information is up to par, the newly released Data Quality feature allows users to predefine expected values for each selector within a sitemap.
Combined with the notification system, the Data Quality feature becomes a powerful tool: it gives users the assurance that the scraped data meets the predefined quality criteria and, in cases where it does not, allows for quick troubleshooting. Users are notified via their preferred notification channel of any issues that a scraping job has encountered, as well as the exact selector that no longer functions as expected.
The expected values for each selector are easily predefined and adjusted by moving the slider controls left or right.
If the scraping job has been executed before, the “Fill with suggested values” button sets the Data Quality values to match the results of the previous scraping job, so you can skip adjusting each data quality field manually.
When the scraped value matches or exceeds the predefined data quality value, the number above the slider appears in green; when a value does not meet the data quality requirements, the number is shown in red.
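The green/red comparison described above can be illustrated with a small sketch. It assumes, purely for illustration, that the data quality value of a selector is the percentage of scraped records in which that selector returned a non-empty value; the function names `fill_rates` and `check_quality` and the sample selectors are hypothetical and are not part of Web Scraper Cloud.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def fill_rates(records: List[Record], selectors: List[str]) -> Dict[str, float]:
    """Return, per selector, the percentage of records with a non-empty value.

    Assumption: "data quality" is measured as a fill percentage.
    """
    total = len(records)
    rates: Dict[str, float] = {}
    for selector in selectors:
        # Treat missing keys, None and whitespace-only strings as empty.
        filled = sum(1 for r in records if (r.get(selector) or "").strip())
        rates[selector] = 100.0 * filled / total if total else 0.0
    return rates


def check_quality(records: List[Record], thresholds: Dict[str, float]) -> Dict[str, bool]:
    """True ("green") when the rate matches or exceeds the threshold, else False ("red")."""
    rates = fill_rates(records, list(thresholds))
    return {s: rates[s] >= minimum for s, minimum in thresholds.items()}


# Hypothetical scraped records and per-selector minimum percentages.
records = [
    {"title": "Item A", "price": "10.00"},
    {"title": "Item B", "price": ""},
    {"title": "Item C", "price": "12.50"},
    {"title": "Item D", "price": None},
]
thresholds = {"title": 100.0, "price": 75.0}
result = check_quality(records, thresholds)
```

In this sample, `title` is filled in every record and passes, while `price` is filled in only half of the records and falls below its 75% threshold, so it would be the selector flagged in a notification.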
Scraping Jobs Data Quality Panel
A new addition can also be found in the “Jobs” section: a “Data quality” column that visually informs the user about the data quality of each job.
- Successful job - green;
- Failed job - red;
- Failed with default data quality settings - yellow.
This gives users an at-a-glance overview of the jobs run within their account and whether each job has met its respective data quality settings.
If any issues have arisen, they can be inspected further by clicking on the data quality icon.
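The color coding of the “Data quality” column can be summarized as a simple mapping. This is only a sketch of the rule as listed above; the function name `data_quality_color` and its two boolean inputs are assumptions made for illustration, not the actual implementation.

```python
def data_quality_color(passed: bool, uses_default_settings: bool) -> str:
    """Map a job's data quality outcome to the color shown in the "Jobs" list."""
    if passed:
        return "green"  # successful job
    # A failed check is yellow when the job still uses the default
    # data quality settings, and red otherwise.
    return "yellow" if uses_default_settings else "red"
```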
When working with a large number of scraping jobs, it might not be feasible to scan through multiple pages of scraping jobs each day to ensure that all jobs have met the data quality criteria. We have therefore also added a notification feature that automatically sends you a notification in-app or via e-mail.
The notification settings can be configured so that the user receives notifications via a preferred channel.
The types of notifications that the user can receive:
- Page credits spent;
- Scraping job fails;
- Data Quality does not meet the predefined criteria;
- API misuse notice.
The frequency of e-mail notifications can also be configured:
- Every time;
- Once an hour;
- Once in 6 hours;
- Once in 12 hours;
- Once a day;
- Once a week.
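One way to think about the frequency options above is as a minimum interval between two e-mails. The sketch below is an assumption about how such throttling could work, not a description of how Web Scraper Cloud implements it; the option keys and the function `should_send_email` are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical keys for the six frequency options listed above.
MIN_INTERVAL = {
    "every_time": timedelta(0),
    "hourly": timedelta(hours=1),
    "6_hours": timedelta(hours=6),
    "12_hours": timedelta(hours=12),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def should_send_email(frequency: str, last_sent: Optional[datetime], now: datetime) -> bool:
    """Return True when enough time has passed since the last e-mail."""
    if last_sent is None:
        # Nothing has been sent yet, so the first e-mail always goes out.
        return True
    return now - last_sent >= MIN_INTERVAL[frequency]
```

For example, with the hourly option, a second issue that occurs 30 minutes after the last e-mail would not trigger a new one, whereas with “Every time” it would.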
When a notification is received, it appears in the top-right corner of your Web Scraper Cloud account.
Clicking on the notification icon opens a drop-down menu showing the latest notifications with a brief overview of their content.
From the drop-down, you can either take action on an individual notification or access all notifications by clicking on the “View All Notification” button.
Viewing all notifications opens a separate notification page with a detailed view of all recent notifications.
Receiving notifications via e-mail allows you to be notified of any issues that might have occurred as soon as the job has been completed, without needing to log in to your account to manually check up on the completed jobs.
The e-mail notification has a similar feel to the one shown in the Cloud and allows you to directly inspect the job in question. In addition, if the returned values are still acceptable, you can automatically adjust and override the existing data quality configuration for the scraping job without having to log in to your account.
That is all for now!
Every feature mentioned here is now available within all Web Scraper Cloud accounts and can be utilized to the full extent.
Adjust your Data Quality and notification system!