The goal of this project is to retrieve the HTML of a webpage and extract useful data from it using Python. It scrapes book information from Books to Scrape, a mock online bookstore.
The script collects each book's title, price, and availability status from Books to Scrape and saves the results to a CSV file for further analysis.
- Scrapes book titles, prices, and availability status from Books to Scrape.
- Iterates through multiple pages to collect data from the entire catalog.
- Saves the scraped data into a structured CSV file for further use or analysis.
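
The extraction and pagination features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the CSS selectors (`article.product_pod`, `p.price_color`, `p.availability`, `li.next a`) are assumptions about the site's markup, `SAMPLE_PAGE` is a hand-written stand-in for a real catalog page, and the function names `parse_books` and `next_page_url` are hypothetical.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hand-written sample that mimics the assumed markup of one catalog page.
SAMPLE_PAGE = """
<section>
  <article class="product_pod">
    <h3><a href="book/index.html" title="A Light in the Attic">A Light in the Attic</a></h3>
    <p class="price_color">£51.77</p>
    <p class="instock availability">
        In stock
    </p>
  </article>
  <ul class="pager"><li class="next"><a href="page-2.html">next</a></li></ul>
</section>
"""


def parse_books(html):
    """Return one dict per book card found in the page HTML."""
    soup = BeautifulSoup(html, "html.parser")
    books = []
    for card in soup.select("article.product_pod"):
        books.append({
            # The full title is assumed to live in the link's title attribute.
            "Title": card.h3.a["title"],
            "Price": card.select_one("p.price_color").get_text(strip=True),
            "Availability": card.select_one("p.availability").get_text(strip=True),
        })
    return books


def next_page_url(html, current_url):
    """Return the absolute URL of the next page, or None on the last page."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one("li.next a")
    if link is None:
        return None
    # The link is assumed to be relative, so resolve it against the current URL.
    return urljoin(current_url, link["href"])


print(parse_books(SAMPLE_PAGE))
print(next_page_url(SAMPLE_PAGE, "https://books.toscrape.com/catalogue/page-1.html"))
```

A crawl over the whole catalog would then repeat these two steps, fetching each URL returned by `next_page_url` until it returns `None`.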
- BeautifulSoup: To parse the HTML content and extract useful information.
- Requests: To fetch the HTML content from web pages.
- Pandas: To store the extracted data and save it into a CSV file.
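
As a rough sketch of how the three libraries divide the work, the example below gives each one a single small function. The names `fetch_html`, `extract_titles`, `save_titles`, and `run` are hypothetical, the `BASE_URL` value and the `article.product_pod h3 a` selector are assumptions about the site, and only titles are handled to keep the example short.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://books.toscrape.com/"  # assumed entry point of the site


def fetch_html(url, timeout=10.0):
    """Requests: download a page and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    # Assumption: the page is UTF-8; set it explicitly so "£" decodes correctly.
    response.encoding = "utf-8"
    return response.text


def extract_titles(html):
    """BeautifulSoup: parse the HTML and collect each book's title attribute."""
    soup = BeautifulSoup(html, "html.parser")
    return [link["title"] for link in soup.select("article.product_pod h3 a")]


def save_titles(titles, path):
    """Pandas: put the titles in a DataFrame and write it to a CSV file."""
    frame = pd.DataFrame({"Title": titles})
    frame.to_csv(path, index=False, encoding="utf-8")
    return frame


def run(url, path):
    """Chain the three steps: fetch, parse, save."""
    return save_titles(extract_titles(fetch_html(url)), path)
```

Calling `run(BASE_URL, "titles_only.csv")` would fetch the first page and write its titles to a file; that call needs network access, so it is left out of the block itself.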
The output CSV file, books_data.csv, will contain data in the following structure:
Title,Price,Availability
A Light in the Attic,£51.77,In stock
Tipping the Velvet,£53.74,In stock
Soumission,£50.10,In stock
...,...,...
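
For further analysis, a file with this structure could be loaded back into Pandas and the price text converted to numbers. The sketch below is only an illustration under assumptions: it reads the three sample rows shown above from an in-memory string rather than the real books_data.csv, and the `load_books` helper and the `PriceGBP` column name are hypothetical.

```python
import io

import pandas as pd

# The three sample rows shown above, used in place of the real file.
SAMPLE_CSV = """Title,Price,Availability
A Light in the Attic,£51.77,In stock
Tipping the Velvet,£53.74,In stock
Soumission,£50.10,In stock
"""


def load_books(source):
    """Read the CSV and add a numeric price column (hypothetical name)."""
    frame = pd.read_csv(source)
    # Strip the currency symbol so the price can be treated as a float.
    frame["PriceGBP"] = frame["Price"].str.replace("£", "", regex=False).astype(float)
    return frame


books = load_books(io.StringIO(SAMPLE_CSV))
print(books[["Title", "PriceGBP"]])
print("Average price:", round(books["PriceGBP"].mean(), 2))
```

With the real output, `load_books("books_data.csv")` would take the place of the in-memory sample.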