Web Scraper To Generate Product Database from eCommerce Site and Exporting to .xlsx Format

This project implements basic web scraping with Python's BeautifulSoup library to build a dataset of available products. The e-commerce website targeted in the notebook is laptopsdirect.uk, with a primary focus on the laptops on sale. The project is divided into two parts: the first scrapes only the first results page for the product category, while the second fetches all available results across multiple pages. The scraped data is formatted with the pandas library and exported in .xlsx format.

The following data points were fetched in this project:

Product Name
Price
Rating
Review Count
Product Details
Relative URL
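
As a rough illustration of how these six fields might be pulled from a single product listing, the sketch below parses a small hand-written HTML fragment. The markup, the class names (product-card, product-link, price and so on) and the base URL are hypothetical; they are not taken from the target site or from the notebook, and the real selectors have to be read from the live page source.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup: the class names below are illustrative only and
# do not reflect the structure of the real site.
SAMPLE_HTML = """
<div class="product-card">
  <a class="product-link" href="/laptops/example-model-15">Example Model 15</a>
  <span class="price">499.97</span>
  <span class="rating">4.5</span>
  <span class="review-count">12 reviews</span>
  <ul class="details">
    <li>8GB RAM</li>
    <li>256GB SSD</li>
  </ul>
</div>
"""

BASE_URL = "https://www.example.com"  # placeholder, not the real site


def parse_card(card):
    """Extract the six data points from one product card element."""
    link = card.find("a", class_="product-link")
    details = [li.get_text(strip=True) for li in card.select("ul.details li")]
    return {
        "Product Name": link.get_text(strip=True),
        "Price": card.find("span", class_="price").get_text(strip=True),
        "Rating": card.find("span", class_="rating").get_text(strip=True),
        "Review Count": card.find("span", class_="review-count").get_text(strip=True),
        "Product Details": ", ".join(details),
        "Relative URL": link["href"],
    }


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
products = [parse_card(card) for card in soup.find_all("div", class_="product-card")]

# A relative URL can be turned into an absolute one with urljoin.
absolute_url = urljoin(BASE_URL, products[0]["Relative URL"])
print(products[0])
print(absolute_url)
```

Keeping the extraction in one small function per product card means a change in the site's markup only has to be handled in one place.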

Libraries Required

The project requires the BeautifulSoup, requests and pandas libraries to be installed. urllib is part of the Python standard library and needs no separate installation.

To install the libraries, run the following commands in Command Prompt:
pip install beautifulsoup4
pip install requests
pip install pandas

Note: If you are using the Anaconda Distribution, pandas and urllib will be pre-installed with the package.

Importing the Libraries

from bs4 import BeautifulSoup
import requests
import pandas as pd
import urllib.parse

Using the Scraper

The notebook is divided into two parts, depending on whether you want to scrape a single page of results or all of them. Part 1 scrapes the first page of the target results, and Part 2 scrapes all pages. Run each cell in order, starting with the library imports and then continuing down through your chosen part.
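
The difference between the two parts can be sketched as a loop over result pages. The version below is an assumption about how such a loop might look, not the notebook's actual code: the `page` query parameter, the `scrape_pages` and `build_page_url` helpers, and the stop-on-empty-page rule are all hypothetical, and a fake fetcher stands in for the real download so the sketch runs offline.

```python
from urllib.parse import urlencode


def build_page_url(base_url, page):
    """Return the URL of one result page (the `page` parameter is assumed)."""
    return f"{base_url}?{urlencode({'page': page})}"


def scrape_pages(base_url, fetch_products, max_pages=None):
    """Collect products page by page until a page comes back empty.

    fetch_products is any callable that takes a URL and returns a list of
    product dicts; in a real run it would download and parse the page.
    """
    results = []
    page = 1
    while max_pages is None or page <= max_pages:
        products = fetch_products(build_page_url(base_url, page))
        if not products:  # no more results, stop paginating
            break
        results.extend(products)
        page += 1
    return results


# Offline stand-in for the network call: two fake pages, then nothing.
FAKE_SITE = {
    "https://www.example.com/laptops?page=1": [
        {"Product Name": "A"},
        {"Product Name": "B"},
    ],
    "https://www.example.com/laptops?page=2": [{"Product Name": "C"}],
}


def fake_fetch(url):
    """Return the fake products for a URL, or an empty list past the end."""
    return FAKE_SITE.get(url, [])


# Part 1 style: first page only. Part 2 style: every page.
first_page_only = scrape_pages("https://www.example.com/laptops", fake_fetch, max_pages=1)
all_pages = scrape_pages("https://www.example.com/laptops", fake_fetch)
print(len(first_page_only), len(all_pages))
```

Passing the fetcher in as an argument keeps the paging logic separate from the network and parsing code, so the loop can be checked without hitting the site.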

Exporting the Results

We use the pandas library to format all fetched results into a data frame. In this notebook, the results were exported to a Microsoft Excel file (.xlsx format) using the following line of code:

product_overview.to_excel("ResultAll.xlsx", index = False)
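
For context, here is a minimal sketch of how scraped rows might end up in `product_overview` before the export. The sample rows are made up for illustration, the column names simply mirror the data points listed earlier, and `export_results` is a hypothetical helper; the notebook may build its data frame differently. Writing .xlsx files with pandas also needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd

# Made-up rows standing in for scraped results.
rows = [
    {
        "Product Name": "Example Model 15",
        "Price": "499.97",
        "Rating": "4.5",
        "Review Count": "12",
        "Product Details": "8GB RAM, 256GB SSD",
        "Relative URL": "/laptops/example-model-15",
    },
    {
        "Product Name": "Example Model 17",
        "Price": "649.00",
        "Rating": "4.0",
        "Review Count": "3",
        "Product Details": "16GB RAM, 512GB SSD",
        "Relative URL": "/laptops/example-model-17",
    },
]

# One dict per product becomes one row; dict keys become column headers.
product_overview = pd.DataFrame(rows)


def export_results(frame, path):
    """Write the frame to an Excel file without the numeric index column."""
    frame.to_excel(path, index=False)


# Example call (requires an Excel engine such as openpyxl):
# export_results(product_overview, "ResultAll.xlsx")
```

Passing `index=False` leaves out the row index, so the spreadsheet contains only the product columns.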

Improvements in the code and enhancement of the notebook are always welcome! ❤

anubhab1710/LaptopsDirect-Data-Scraper
