Skincare Website Scraper

Overview

This project is a web scraper designed to extract product data from four different skincare e-commerce websites. The collected data includes Product Name, Product Description, Product Info, Product Variation Data (if any), Product Image URL, Product Benefits, and Instructions/Application info.

Websites to Scrape

Getting Started

Installation

Clone this repository to your local machine:

git clone https://github.com/yourusername/skincare-scraper.git

Navigate to the project directory:

cd skincare-scraper

Install the required dependencies:

pip install -r requirements.txt

Usage

Run the main script:

python scraper.py

The script will save the collected data at checkpoints to minimize the risk of data loss in case of interruptions.

Data Structure

The scraped data will be stored in a CSV file named skincare_data.csv. Each row corresponds to a product, and columns include Product Name, Product Description, Product Info, Product Variation Data, Product Image URL, Product Benefits, and Instructions/Application info.
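
As an illustration of the row layout described above, here is a minimal sketch of one possible in-memory record. The class name `ProductRecord`, its attribute names, and the helper `to_csv_row` are hypothetical and are not taken from `scraper.py`; only the column headers come from this README.

```python
from dataclasses import asdict, dataclass


@dataclass
class ProductRecord:
    """One scraped product; attribute names are illustrative assumptions."""
    name: str
    description: str = ""
    info: str = ""
    variation_data: str = ""  # left empty when a product has no variations
    image_url: str = ""
    benefits: str = ""
    instructions: str = ""


# Maps each attribute to the column header listed in this README.
CSV_HEADERS = {
    "name": "Product Name",
    "description": "Product Description",
    "info": "Product Info",
    "variation_data": "Product Variation Data",
    "image_url": "Product Image URL",
    "benefits": "Product Benefits",
    "instructions": "Instructions/Application info",
}


def to_csv_row(record: ProductRecord) -> dict:
    """Return the record as a dict keyed by the CSV column headers."""
    return {CSV_HEADERS[key]: value for key, value in asdict(record).items()}
```

Keeping the header mapping in one place means the column order in the CSV file follows the order of the dataclass fields.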

Work Outline

Overview

This project aims to scrape product data from four skincare e-commerce websites. The following tasks outline the step-by-step process for achieving this goal.

Tasks

0. Prepare development environment

Create a virtual environment and install dependencies.
Create a dev branch.

1. Identify the Structure

In a notebook, load the page locally and start parsing its elements.
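
One way to start exploring a page is to count which tag and class combinations repeat, since repeated combinations often mark product cards or field containers. The sketch below uses only the standard library `html.parser`; the project itself may use a different parsing library, and the names `StructureTally` and `tally_structure` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class StructureTally(HTMLParser):
    """Count (tag, class) pairs to reveal which elements repeat on a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # An element with several classes is counted once per class;
        # an element without a class is counted under the empty string.
        classes = dict(attrs).get("class") or ""
        for css_class in classes.split() or [""]:
            self.counts[(tag, css_class)] += 1


def tally_structure(html: str) -> Counter:
    """Parse an HTML string and return the (tag, class) counts."""
    parser = StructureTally()
    parser.feed(html)
    parser.close()
    return parser.counts
```

In a notebook, the HTML string could come from a saved copy of the page, and `tally_structure(html).most_common(20)` would list the most frequent combinations.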

2. Scrape Data

Implement the scraping logic to collect Product Name.
Implement the scraping logic to collect Description.
Implement the scraping logic to collect Product Info.
Implement the scraping logic to collect Variation Data.
Implement the scraping logic to collect Image URL.
Implement the scraping logic to collect Benefits.
Implement the scraping logic to collect Instructions.
Implement the scraping logic to collect all product URLs.
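
The extraction steps above could be sketched as follows for two of the fields plus link collection. This is not the project's actual logic: it assumes, purely for illustration, that the product name sits in the page's `h1` element and that the first `img` is the product image. Real sites will need site-specific selectors, and `ProductPageParser` and `parse_page` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ProductPageParser(HTMLParser):
    """Collect the h1 text, the first image URL, and all link targets."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.name = ""
        self.image_url = ""
        self.links = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "h1":
            self._in_heading = True
        elif tag == "img" and not self.image_url and attributes.get("src"):
            # Resolve relative paths against the page URL.
            self.image_url = urljoin(self.base_url, attributes["src"])
        elif tag == "a" and attributes.get("href"):
            self.links.append(urljoin(self.base_url, attributes["href"]))

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.name += data


def parse_page(html: str, base_url: str) -> dict:
    """Return the extracted fields for one page as a plain dict."""
    parser = ProductPageParser(base_url)
    parser.feed(html)
    parser.close()
    return {
        "Product Name": parser.name.strip(),
        "Product Image URL": parser.image_url,
        "links": parser.links,
    }
```

On a listing page, the `links` entry would typically be filtered down to product URLs before each product page is visited.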

3. Organize Data

Store the collected data in a CSV format.
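
A minimal sketch of the CSV step, using the standard library `csv` module and the column names listed under Data Structure. The function names `to_csv_text` and `save_products` are hypothetical, and the input is assumed to be a list of dicts keyed by column header.

```python
import csv
import io

COLUMNS = [
    "Product Name",
    "Product Description",
    "Product Info",
    "Product Variation Data",
    "Product Image URL",
    "Product Benefits",
    "Instructions/Application info",
]


def write_products(rows, handle):
    """Write a header plus one line per product to an open text handle."""
    writer = csv.DictWriter(
        handle,
        fieldnames=COLUMNS,
        restval="",             # missing fields become empty cells
        extrasaction="ignore",  # keys outside COLUMNS are dropped
    )
    writer.writeheader()
    writer.writerows(rows)


def to_csv_text(rows) -> str:
    """Return the CSV content as a string, which is handy for inspection."""
    buffer = io.StringIO()
    write_products(rows, buffer)
    return buffer.getvalue()


def save_products(rows, path="skincare_data.csv"):
    """Write the rows to a CSV file; newline='' avoids blank lines on Windows."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        write_products(rows, handle)
```

`DictWriter` quotes values that contain commas or line breaks, so free-text fields such as descriptions stay in a single cell.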

4. Testing

Test your scraping script to ensure it works correctly and captures the necessary information.
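
One way to test extraction logic without hitting the live websites is to run it against a small fixed HTML snippet. The sketch below uses `unittest` with a stand-in extractor; `extract_product_name` is a hypothetical function that assumes the name is in an `h1`, and the real tests would import the project's own functions instead.

```python
import unittest
from html.parser import HTMLParser


class _HeadingParser(HTMLParser):
    """Gather the text found inside h1 elements."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text.append(data)


def extract_product_name(html: str) -> str:
    """Stand-in extractor: returns the h1 text, or '' when there is none."""
    parser = _HeadingParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.text).strip()


SAMPLE_PAGE = "<html><body><h1>Vitamin C Serum</h1></body></html>"


class ExtractProductNameTest(unittest.TestCase):
    def test_reads_name_from_heading(self):
        self.assertEqual(extract_product_name(SAMPLE_PAGE), "Vitamin C Serum")

    def test_missing_heading_returns_empty_string(self):
        self.assertEqual(extract_product_name("<p>No heading</p>"), "")
```

Saved in a test module, these cases can be run with `python -m unittest`.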

5. Save Data at Checkpoints

Implement checkpoints to periodically save the collected data.
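
A minimal sketch of periodic checkpointing, assuming the scraper processes one product URL at a time. The interval `every`, the callback `scrape_one`, and both function names are hypothetical. Each checkpoint writes to a temporary file and then swaps it into place with `os.replace`, so an interruption during a save does not leave a half-written CSV.

```python
import csv
import os
import tempfile


def save_checkpoint(rows, path, columns):
    """Write all rows to a temporary file, then replace the target file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    # The temporary file is in the same directory, so this is a rename.
    os.replace(tmp_path, path)


def scrape_with_checkpoints(urls, scrape_one, path, columns, every=25):
    """Scrape each URL and save the collected rows every `every` products."""
    rows = []
    for index, url in enumerate(urls, start=1):
        rows.append(scrape_one(url))
        if index % every == 0:
            save_checkpoint(rows, path, columns)
    save_checkpoint(rows, path, columns)  # final save covers the remainder
    return rows
```

Rewriting the whole file at each checkpoint is simple and safe for small catalogs; appending rows instead would reduce I/O for larger runs.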
