Scraping E-commerce through Brands

In the previous blog post, we explained how to retrieve the necessary data the classical way: by going through the categories, then the sub-categories, then the products. It is the most basic and intuitive way of gathering data with the Web Scraper extension. However, a problem can arise when the website's layout changes or the placement of products is altered: sitemaps previously created for that website may break or stop working properly.


There is, however, a way to build scrapers for e-commerce websites that are, in many cases, less likely to break when the website changes. Most e-commerce sites have a visible section called “Brands”, “A-Z”, or similar, where products are categorised by a label such as the brand. This section lists all the products on the site, and it tends to present them in the same layout, even after the main, most visible categories have been rearranged.

This section is a huge relief for the scraper because, in most cases, all the products can be retrieved easily and reliably, since the layouts of the brand categories match.

On the demonstrated website, the section is easy to find.

We start the scraping process as always - by creating a sitemap and designating the starting point - the “Start URL”.
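
As a rough illustration, a freshly created sitemap can be thought of as a small JSON document. The sketch below builds one in Python, assuming the `_id`, `startUrl` and `selectors` fields as they typically appear in a sitemap exported from the extension. The sitemap name and the URL are hypothetical placeholders, not the demonstrated website.

```python
import json

# Hypothetical values: neither the sitemap name nor the URL comes from
# the demonstrated website. Point "startUrl" at the brands (or A-Z) page.
sitemap = {
    "_id": "shop-brands",
    "startUrl": ["https://shop.example.com/brands"],
    "selectors": [],  # selectors are added in the following steps
}

# The JSON text printed here is the kind of document the import dialog expects.
print(json.dumps(sitemap, indent=2))
```

It is worth comparing the output with an export from your own version of the extension, since field names may differ between releases.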

From here, the process is very similar to classical scraping; the only difference is that we now select the brand categories.

Then visit the first brand and create the pagination and the product URL selectors.

And lastly, create the three text selectors that will retrieve the titles, the prices, and the colors of the products.
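
Put together, the steps above could produce a sitemap along the following lines. This is only a sketch: the CSS selectors, the sitemap name and the URL are hypothetical, and the selector type names (`SelectorLink`, `SelectorText`) are assumed to match what an exported sitemap contains. Pagination is modelled here as a link selector that lists itself as a parent, which is one common way of setting it up. Newer versions of the extension also offer a dedicated pagination selector.

```python
import json


def make_selector(sel_id, sel_type, css, parents, multiple):
    """Return one selector entry in the shape assumed for exported sitemaps."""
    return {
        "id": sel_id,
        "type": sel_type,
        "parentSelectors": parents,
        "selector": css,  # hypothetical CSS, adjust to the real page
        "multiple": multiple,
    }


sitemap = {
    "_id": "shop-brands",
    "startUrl": ["https://shop.example.com/brands"],
    "selectors": [
        # 1. One link per brand on the brands page.
        make_selector("brand", "SelectorLink", "ul.brands a", ["_root"], True),
        # 2. Pagination is its own parent, so every following page is visited.
        make_selector("pagination", "SelectorLink", "a.next",
                      ["brand", "pagination"], True),
        # 3. Product links are collected from the first page and from each next page.
        make_selector("product", "SelectorLink", "a.product-link",
                      ["brand", "pagination"], True),
        # 4. Three text selectors on the product page.
        make_selector("title", "SelectorText", "h1", ["product"], False),
        make_selector("price", "SelectorText", "span.price", ["product"], False),
        make_selector("color", "SelectorText", "span.color", ["product"], False),
    ],
}

print(json.dumps(sitemap, indent=2))
```

In practice the selectors are created by clicking elements in the extension, so a structure like this is more useful for reviewing or sharing a finished sitemap than for writing one by hand.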

And this is what our selector graph looks like. Compared to the one for the classical way of scraping, it is visibly shorter; the process itself does not differ much, there are just fewer steps.
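
To make "shorter" a little more concrete, the sketch below counts the selector levels between the root and a text selector in two graphs. Both graphs are hypothetical: the brands graph mirrors the steps described in this post, and the classical one assumes a category, sub-category, pagination and product chain. The selector names are illustrative only.

```python
def depth(node, parents, visiting=frozenset()):
    """Count selector levels from _root down to node, skipping cyclic edges."""
    if node == "_root":
        return 0
    visiting = visiting | {node}
    # Pagination lists itself as a parent; ignore nodes already on the path.
    candidates = [
        depth(parent, parents, visiting)
        for parent in parents[node]
        if parent not in visiting
    ]
    return 1 + max(candidates, default=0)


# Each selector maps to its parent selectors (hypothetical graphs).
brands_graph = {
    "brand": ["_root"],
    "pagination": ["brand", "pagination"],
    "product": ["brand", "pagination"],
    "title": ["product"],
}
classical_graph = {
    "category": ["_root"],
    "subcategory": ["category"],
    "pagination": ["subcategory", "pagination"],
    "product": ["subcategory", "pagination"],
    "title": ["product"],
}

print("brands:", depth("title", brands_graph), "levels")
print("classical:", depth("title", classical_graph), "levels")
```

Under these assumptions the brands graph comes out one level shorter, because the sub-category step disappears.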

This approach might not be suitable when it is necessary to know which subcategories the products come from. It is also very important to check whether there are any pagination limits. Bigger e-commerce websites, such as AliExpress, have pagination limits that allow only a limited number of products to be shown, even though they are known to have thousands of products. If this kind of problem arises, scraping all the products becomes more complicated. For a website with pagination limits, it is usually necessary to come up with a filtering strategy and create scrapers based on those filters. As mentioned previously, though, it really depends on the individual e-commerce website and the way it displays its products.
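
One possible way to reason about a pagination limit and a filtering strategy is sketched below. All numbers, the `min_price` and `max_price` query parameters, and the URL are hypothetical. A real website may expose entirely different filters, or none at all.

```python
import math


def is_truncated(total_products, per_page, max_pages):
    """True if the listing needs more pages than the site is willing to show."""
    return math.ceil(total_products / per_page) > max_pages


def price_bands(low, high, step):
    """Split the range from low to high into consecutive (start, end) bands."""
    return [(start, min(start + step, high)) for start in range(low, high, step)]


def filtered_start_urls(base_url, bands):
    """Build one start URL per band, using hypothetical query parameters."""
    return [f"{base_url}?min_price={lo}&max_price={hi}" for lo, hi in bands]


# Hypothetical figures: 5000 products, 60 per page, at most 50 pages shown.
if is_truncated(total_products=5000, per_page=60, max_pages=50):
    bands = price_bands(0, 100, 40)
    for url in filtered_start_urls("https://shop.example.com/brand/acme", bands):
        print(url)
```

Each generated URL could then be used as a separate start URL, provided that every filtered listing stays under the page limit. If one band is still too large, it would need to be split further.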

Scraping through brands is more of a tip than a separate method of scraping. It does not guarantee that a sitemap will never break; however, by scraping through the label page, there is a significantly higher probability that the scraper will not be affected if the products of the main categories are altered.









