web-scraping-with-beautiful-soup

https://www.linkedin.com/posts/medha-agarwal-01b33725a_internship-pythonprogramming-webscraping-activity-7214991432367976448-8iL1?utm_source=share&utm_medium=member_desktop

𝗗𝗲𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻:

𝗪𝗲𝗯𝘀𝗶𝘁𝗲 𝗦𝗲𝗹𝗲𝗰𝘁𝗶𝗼𝗻: Chose BigBasket, a website with publicly accessible product listings.
𝗗𝗮𝘁𝗮 𝗘𝘅𝘁𝗿𝗮𝗰𝘁𝗶𝗼𝗻: Used the Beautiful Soup library to scrape HTML content and extract relevant information such as product titles, prices, quantities, and discounts.
𝗗𝗮𝘁𝗮 𝗦𝘁𝗼𝗿𝗮𝗴𝗲: Stored the extracted data in a structured format (CSV file) for further analysis and use.
𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲𝘀: Handled issues like dynamic content loading, ensuring accurate and complete data extraction.
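
The CSV storage step could look roughly like the sketch below. It is not taken from scrape.py: the column names in `FIELDNAMES`, the helper `write_products`, and the sample rows are all assumptions made for illustration, and the real header of product_details.csv may differ.

```python
import csv
import io

# Hypothetical column names; the actual header of product_details.csv may differ.
FIELDNAMES = ["title", "price", "quantity", "discount"]


def write_products(rows, out):
    """Write product dicts to an open text file object as CSV.

    Keys missing from a row are filled with "N/A" via `restval`.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, restval="N/A")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Made-up sample rows for illustration, not data scraped from the site.
sample = [
    {"title": "Example Rice", "price": "120", "quantity": "1 kg", "discount": "10% OFF"},
    {"title": "Example Salt", "price": "25", "quantity": "500 g"},  # no discount listed
]

# An in-memory buffer keeps the example free of side effects.
buffer = io.StringIO()
write_products(sample, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of a buffer, the same function can be passed `open("product_details.csv", "w", newline="", encoding="utf-8")`; `newline=""` is what the `csv` module documentation recommends to avoid blank lines on Windows.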

𝗜𝗺𝗽𝗹𝗲𝗺𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻 𝗛𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝘀:

Utilized Selenium for navigating and interacting with the dynamic website.
Leveraged Beautiful Soup for parsing HTML content and extracting product details.
Implemented a scrolling mechanism to handle infinite scrolling and ensure all products were captured.
Ensured data integrity by handling missing or unavailable data gracefully.
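
The highlights above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in scrape.py: the CSS selectors (`div.product`, `.title`, `.price`, `.qty`, `.discount`), the helper names, and `SAMPLE_HTML` are hypothetical and do not reflect BigBasket's real markup. The only driver method used is Selenium's `execute_script`.

```python
import time

from bs4 import BeautifulSoup


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll until the page height stops growing or max_rounds is reached.

    `driver` is expected to be a Selenium WebDriver; only `execute_script`
    is called. Returns the last observed page height.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded products time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so assume the end of the list
        last_height = new_height
    return last_height


def text_or_default(card, selector, default="N/A"):
    """Return the stripped text of the first match, or a default if absent."""
    node = card.select_one(selector)
    return node.get_text(strip=True) if node else default


def parse_products(html):
    """Extract one dict per product card from the page HTML."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("div.product"):  # hypothetical card selector
        products.append({
            "title": text_or_default(card, ".title"),
            "price": text_or_default(card, ".price"),
            "quantity": text_or_default(card, ".qty"),
            "discount": text_or_default(card, ".discount"),
        })
    return products


# Made-up markup standing in for a rendered product listing.
SAMPLE_HTML = """
<div class="product"><h3 class="title">Example Rice</h3>
  <span class="price">120</span><span class="qty">1 kg</span>
  <span class="discount">10% OFF</span></div>
<div class="product"><h3 class="title">Example Salt</h3>
  <span class="price">25</span></div>
"""
```

With a live browser session, one would call `scroll_to_bottom(driver)` first and then `parse_products(driver.page_source)`. Comparing the page height before and after each scroll is one common stopping rule for infinite scrolling; a site that loads content on a timer or behind a "load more" button would need a different check.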

𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲𝘀 𝗙𝗮𝗰𝗲𝗱:

Managing dynamic content loading and ensuring the scraper captures all products as the page scrolls.
Handling website structure changes and ensuring the scraper adapts accordingly.
Optimizing the scraper to efficiently process and store large amounts of data.
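
One way to soften the impact of website structure changes is to try several candidate selectors in order and fall back to a placeholder when none matches. The sketch below illustrates that idea only; it is not the approach confirmed by the source, and `TITLE_SELECTORS` and `first_match` are hypothetical names with made-up selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical selectors, newest layout first; none are taken from the real site.
TITLE_SELECTORS = ["h3.product-title", "div.prod-name", "a[data-role='title']"]


def first_match(card, selectors, default="N/A"):
    """Return the text of the first selector that matches with non-empty text."""
    for selector in selectors:
        node = card.select_one(selector)
        if node is not None:
            text = node.get_text(strip=True)
            if text:
                return text
    return default  # no known layout matched


# Made-up fragments representing an older layout and an unrecognized one.
old_layout = BeautifulSoup('<div><div class="prod-name">Example Tea</div></div>', "html.parser")
unknown_layout = BeautifulSoup('<div><p class="name">Example Tea</p></div>', "html.parser")

old_title = first_match(old_layout, TITLE_SELECTORS)
unknown_title = first_match(unknown_layout, TITLE_SELECTORS)
```

Keeping the selectors in one list means a layout change is handled by editing data rather than parsing logic, and a sudden rise in placeholder values in the output is a cheap signal that the page structure has changed again.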
