Scraping E-commerce through Brands

Data, E-commerce, Tutorial


In the previous blog post, we explained how to retrieve the necessary data the classical way: by going through the categories, then the sub-categories, then the products, and so on. It is the most basic and intuitive way of gathering data with the Web Scraper extension; however, a problem sometimes arises when the website layout changes or the placement of products is altered. When that happens, previously created sitemaps for that website may break or stop working properly.


However, there is a tip for building e-commerce scrapers that, in many cases, are less likely to break when the website changes. On most e-commerce sites you will find a section called “Brands”, “A-Z”, or something similar: a section where products are grouped by a label, such as the brand. This section lists all the products on the website, and the products are presented there in the same layout even after the main, most visible categories have been changed from their previous layouts.

This section is a huge relief for anyone building a scraper: because the layout is the same for every brand, in most cases all the products can be retrieved easily and reliably.

On the demonstrated website, the section is easy to find.

[Image: Scraping through brands]

We start the scraping process as always - by creating a sitemap and designating the starting point - the “Start URL”.

[Image: Create sitemap]

From here the process is very similar to classical scraping; the only difference is that we now select the brand categories.

[Image: Scraping brands]

Then visit the first brand and create the pagination and the product URL selectors.

[Image: Selecting the product URL]

And lastly, create the three text selectors that will retrieve the titles, the prices, and the colors of the products.

[Image: Creating the product information selectors]

And this is what our selector graph looks like. Compared with the one for the classical way of scraping, it is visibly shorter; the process itself does not differ much, there are simply fewer steps.

[Image: Selector graph]
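For readers who prefer to see the structure written out, the selector graph above could correspond to a sitemap roughly like the one built below. This is only a sketch: the start URL, the sitemap name, and every CSS selector are hypothetical placeholders, and the field names (`_id`, `startUrl`, `selectors`, `parentSelectors`, and so on) follow the shape of a sitemap exported from Web Scraper as we understand it, so compare them with an export from your own extension before importing anything. The helper names `link`, `text`, `build_sitemap`, and `children_of` are made up for this illustration.

```python
import json

# Hypothetical values: replace the URL and every CSS selector with ones
# that match the website you are scraping.
START_URL = "https://example.com/brands"


def link(selector_id, css, parents):
    """Link selector: follows every matching link to a new page."""
    return {
        "id": selector_id,
        "type": "SelectorLink",
        "parentSelectors": parents,
        "selector": css,
        "multiple": True,
        "delay": 0,
    }


def text(selector_id, css, parents):
    """Text selector: extracts one text value from the current page."""
    return {
        "id": selector_id,
        "type": "SelectorText",
        "parentSelectors": parents,
        "selector": css,
        "multiple": False,
        "regex": "",
        "delay": 0,
    }


def build_sitemap():
    """Brands -> pagination -> product page -> title, price, color."""
    return {
        "_id": "brands-example",
        "startUrl": [START_URL],
        "selectors": [
            link("brand", "ul.brands a", ["_root"]),
            # Pagination lists itself as a parent so each next page is followed too.
            link("pagination", "a.next-page", ["brand", "pagination"]),
            link("product", "a.product-link", ["brand", "pagination"]),
            text("title", "h1.product-title", ["product"]),
            text("price", "span.price", ["product"]),
            text("color", "span.color", ["product"]),
        ],
    }


def children_of(sitemap, parent_id):
    """Return the ids of the selectors that hang directly under parent_id."""
    return [
        s["id"] for s in sitemap["selectors"] if parent_id in s["parentSelectors"]
    ]


if __name__ == "__main__":
    # Print the JSON so it can be inspected before trying to import it.
    print(json.dumps(build_sitemap(), indent=2))
```

Calling `children_of(build_sitemap(), "brand")` lists the selectors directly under the brand link, which is a quick way to check that the graph has the intended shape.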

This approach may not be suitable when you need to know which subcategory each product comes from. It is also very important to check whether there are any pagination limits. Bigger e-commerce websites, such as AliExpress, limit pagination so that only a limited number of products can be shown, even though they are known to have thousands of products. If this kind of problem arises, scraping all the products becomes more complicated. For websites with pagination limits, it is usually necessary to come up with a filtering strategy and create scrapers based on those filters. Even then, the details depend on each individual e-commerce website and the way it displays its products.
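The pagination check described above comes down to simple arithmetic, which can be sketched as follows. All numbers here are invented for illustration: the page size, the page limit, and the product counts are assumptions, not figures from any real website, and the function names are hypothetical. The sketch flags the brand listings whose product count exceeds what the pagination can show and gives a lower bound on how many filter slices (for example, price ranges) would be needed to reach every product.

```python
import math


def pagination_cap(page_size, max_pages):
    """Largest number of products a single listing can display."""
    return page_size * max_pages


def brands_over_cap(product_counts, page_size, max_pages):
    """Return the brands whose listings are cut off by the pagination limit."""
    cap = pagination_cap(page_size, max_pages)
    report = {}
    for brand, total in product_counts.items():
        if total > cap:
            report[brand] = {
                # Products that cannot be reached through pagination alone.
                "hidden": total - cap,
                # Lower bound only: real filters rarely split products evenly.
                "min_filter_slices": math.ceil(total / cap),
            }
    return report


if __name__ == "__main__":
    # Hypothetical counts, e.g. read from the "N results" text on each brand page.
    counts = {"Brand A": 350, "Brand B": 5000}
    print(brands_over_cap(counts, page_size=60, max_pages=50))
```

In practice, each filter slice would become its own start URL or its own sitemap, and the counts should be checked again per slice, since one slice may still exceed the limit.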

Scraping through brands is more of a tip than a separate method of scraping. It does not guarantee that a sitemap will never break; however, by scraping through the label page, there is a significantly higher probability that the scraper will not be affected if the products in the main categories are altered.









