Web Scraper Cloud Data Quality and Notification Feature
April 30, 2021
As part of our continuous development and growth, we are happy to have released a new notification system as well as a data quality feature, both of which open up further automation options.
The new update consists of 2 main features:
- Data quality control;
- Notifications.
Both play an important role in keeping track of your Web Scraper Cloud account and the data returned within it.
Data Quality Control
To minimise the time spent manually checking each scraping job, or writing code to ensure that the extracted information is up to par, the newly released Data Quality feature allows users to predefine the expected values for each selector within a sitemap.
Combined with the notification system, the Data Quality feature becomes a powerful tool: it gives users the assurance that the scraped data meets the predefined quality criteria and, in cases where it does not, allows for quick troubleshooting. Users are notified via their preferred notification channel of any issues that a scraping job might have encountered, as well as the exact selector that no longer functions as expected.
The expected values for each selector are easily predefined and adjusted by moving the sliders left or right.
If the scraping job has been executed before, the “Fill with suggested values” button sets the Data Quality values to match those of the previous scraping job, allowing you to skip manually adjusting each data quality field.
When the scraped value matches or exceeds the predefined data quality value, the number above the slider will appear in green; when a value does not meet the data quality requirements, the number will be shown in red.
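The comparison described above can be sketched in a few lines of Python. This is only an illustration, not Web Scraper Cloud's actual implementation: it assumes that the value measured for a selector is the percentage of records in which that selector returned a non-empty value, and the names `fill_rates`, `check_quality`, and `suggest_thresholds` are hypothetical.

```python
from typing import Dict, List, Optional

# One scraped record: selector name -> extracted value (None or "" if missing).
Record = Dict[str, Optional[str]]


def fill_rates(records: List[Record], selectors: List[str]) -> Dict[str, float]:
    """Assumed metric: percentage of records with a non-empty value per selector."""
    total = len(records)
    rates: Dict[str, float] = {}
    for name in selectors:
        filled = sum(1 for record in records if record.get(name) not in (None, ""))
        rates[name] = 100.0 * filled / total if total else 0.0
    return rates


def check_quality(rates: Dict[str, float], thresholds: Dict[str, float]) -> Dict[str, str]:
    """Green when the scraped value matches or exceeds the predefined value, else red."""
    return {
        name: "green" if rates.get(name, 0.0) >= minimum else "red"
        for name, minimum in thresholds.items()
    }


def suggest_thresholds(previous_rates: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical stand-in for "Fill with suggested values": reuse the previous job's values."""
    return dict(previous_rates)


records = [
    {"title": "A", "price": "1"},
    {"title": "B", "price": ""},
    {"title": "C", "price": None},
    {"title": "D", "price": "4"},
]
rates = fill_rates(records, ["title", "price"])
thresholds = {"title": 90.0, "price": 80.0}
print(check_quality(rates, thresholds))  # {'title': 'green', 'price': 'red'}
```

In this example the `price` selector is filled in only half of the records, so it falls below its threshold and would be the selector flagged in red.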
Scraping Jobs Data Quality Panel
A new addition can also be found in the “Jobs” section. We have added another column, “Data quality”, that visually informs the user about the data quality of each job:
- Successful job - green;
- Failed job - red;
- Failed with default data quality settings - yellow.
This allows users to, at a quick glance, have an overview of the jobs conducted within their account and whether the data quality for said jobs has met their respective data quality settings.
If any issues have arisen, this can be further inspected by clicking on the data quality icon.
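For readers who track job results in their own tooling, the colour scheme of the “Data quality” column can be mirrored with a small helper. This is a sketch under assumptions: the function name `data_quality_color` and its two boolean inputs are hypothetical and not part of any Web Scraper Cloud API.

```python
def data_quality_color(passed: bool, uses_default_settings: bool) -> str:
    """Map a job's data quality result to the colour shown in the Jobs list.

    Assumption: a job that fails the check while still using the default
    data quality settings is shown in yellow; a job that fails against
    customised settings is shown in red.
    """
    if passed:
        return "green"
    return "yellow" if uses_default_settings else "red"


print(data_quality_color(passed=False, uses_default_settings=True))  # yellow
```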
Notifications
When working with a large number of scraping jobs, it might not be possible to scan through multiple pages of scraping jobs each day to ensure that all jobs have met the data quality criteria; therefore, we have also added a notification feature that automatically sends you a notification in-app or via e-mail.
Notifications can be configured so that the user receives them via a preferred channel:
- Cloud;
- E-mail.
The types of notifications that the user can receive:
- Page credits spent;
- Scraping job fails;
- Data Quality does not meet the predefined criteria;
- API misuse notice.
The frequency of e-mail notifications can also be configured:
- Every time;
- Once an hour;
- Once in 6 hours;
- Once in 12 hours;
- Once a day;
- Once a week.
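One way to think about the frequency options above is as a minimum interval between e-mails. The following Python sketch illustrates that idea only; how Web Scraper Cloud actually batches or delays e-mails is not described here, and the option keys and the function `should_send_email` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed mapping from each frequency option to a minimum gap between e-mails.
# "Every time" maps to a zero interval, i.e. no throttling at all.
FREQUENCY_INTERVALS = {
    "every_time": timedelta(0),
    "hourly": timedelta(hours=1),
    "every_6_hours": timedelta(hours=6),
    "every_12_hours": timedelta(hours=12),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def should_send_email(frequency: str, last_sent: Optional[datetime], now: datetime) -> bool:
    """Return True when enough time has passed since the last e-mail was sent."""
    if last_sent is None:
        # Nothing has been sent yet, so the first notification always goes out.
        return True
    return now - last_sent >= FREQUENCY_INTERVALS[frequency]


now = datetime(2021, 4, 30, 12, 0)
print(should_send_email("hourly", now - timedelta(minutes=30), now))  # False
print(should_send_email("hourly", now - timedelta(hours=2), now))     # True
```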
In-App
When a notification is received, it can be seen in the top-right corner of your Web Scraper Cloud account.
Clicking on the notification icon opens a drop-down menu showing the latest notifications with a brief overview of their content.
From the drop-down, you can either take action on an individual notification or access all notifications by clicking on the “View All Notifications” button.
Viewing all notifications opens a separate notification page with a detailed view of all recent notifications.
Receiving notifications via e-mail allows you to be notified of any issues that might have occurred as soon as the job has been completed, without needing to log in to your account to manually check up on the completed jobs.
The e-mail notification has a similar feel to the one sent via the Cloud and allows you to directly inspect the job in question. In addition, if the returned values are still acceptable, you can automatically adjust and override the existing data quality configuration for the scraping job from the e-mail, without having to log in to your account.
That is all for now!
Every feature mentioned here is now available within all Web Scraper Cloud accounts and can be utilised to the full extent.
Adjust your Data Quality and notification settings!
Happy scraping!







