A simple Python web scraping tool built with bs4 that navigates to sub-pages.

VitorCoelho/web-scrapper-w-navigation

web-scrapper-w-navigation

A simple Python web scraping tool built with bs4 that navigates to sub-pages. It visits each company URL and collects specific information, saving it to a CSV file. This can be useful for gathering data for research or analysis. In one run, it pulled data for nearly 200 companies in under two minutes.

Features

  • Follows each company URL found on the listing page and extracts relevant information.
  • Collects data such as company name, address, city, state, postal code, phone number, email, and website.
  • Saves the extracted data to a CSV file for further analysis.
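
The navigation step in the feature list can be sketched as follows. This is a minimal illustration, not the code in crawler.py: the function name `extract_company_links`, the `a.company-link` selector, and the example.com URLs are all assumptions, and the selector in particular has to be adapted to the markup of the site being scraped.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_company_links(listing_html, url_base):
    """Return absolute URLs for every company link on the listing page."""
    soup = BeautifulSoup(listing_html, "html.parser")
    links = []
    # "a.company-link" is a hypothetical selector; adjust it to the target site.
    for anchor in soup.select("a.company-link[href]"):
        # Resolve relative hrefs against the listing page URL.
        links.append(urljoin(url_base, anchor["href"]))
    return links


# Inline sample markup so the sketch runs without network access.
sample_listing = """
<ul>
  <li><a class="company-link" href="/companies/acme">Acme</a></li>
  <li><a class="company-link" href="/companies/globex">Globex</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""

company_links = extract_company_links(sample_listing, "https://example.com/companies")
```

Resolving each href with `urljoin` keeps the sketch working whether the listing page uses relative or absolute links.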

Prerequisites

  • Python 3.x installed on your machine.
  • Required Python libraries: requests, beautifulsoup4.

Installation

  1. Clone the repository or download the source code.
  2. Install the required Python dependencies by running the following command:

$ pip install requests beautifulsoup4

Usage

  1. Open crawler.py in a text editor.
  2. Set the url_base variable to the URL of the page that lists the companies.
  3. Run the Python script by executing the following command:

$ python crawler.py

  4. The script scrapes the company data by visiting each company URL and saves the results to a CSV file named company_data.csv.
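
For readers curious about what happens on each company page, here is a hedged sketch of the extraction step. The `SELECTORS` mapping, the `parse_company` function, and the sample markup are hypothetical and only illustrate the idea. The real selectors depend on the target site and may differ from what crawler.py uses.

```python
from bs4 import BeautifulSoup

# Hypothetical mapping from output field to CSS selector.
SELECTORS = {
    "name": "h1.company-name",
    "phone": "span.phone",
    "email": "a.email",
    "website": "a.website",
}


def parse_company(detail_html):
    """Extract one record from a company page; missing fields become ''."""
    soup = BeautifulSoup(detail_html, "html.parser")
    record = {}
    for field, selector in SELECTORS.items():
        node = soup.select_one(selector)
        record[field] = node.get_text(strip=True) if node else ""
    return record


# Inline sample markup so the sketch runs without network access.
sample_detail = """
<h1 class="company-name"> Acme Corp </h1>
<span class="phone">555-0100</span>
<a class="email" href="mailto:info@acme.example">info@acme.example</a>
"""

record = parse_company(sample_detail)
```

In a live run, the HTML would come from something like `requests.get(url, timeout=10).text` for each company URL. Falling back to an empty string keeps the columns aligned when a page lacks a field.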

Customization

  • You can modify the CSV file name and the specific fields being extracted by modifying the code in crawler.py.
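
As an illustration of that kind of customization, the sketch below keeps the output file name and the field list in two constants. The names `OUTPUT_FILE`, `FIELDS`, `write_companies`, and `save_companies` are assumptions made for this example, and crawler.py may organize things differently.

```python
import csv
import io

# Hypothetical settings: change these to rename the file or pick other columns.
OUTPUT_FILE = "company_data.csv"
FIELDS = ["name", "address", "city", "state", "postal_code", "phone", "email", "website"]


def write_companies(records, handle, fields=FIELDS):
    """Write records (dicts) as CSV rows to an open text handle."""
    writer = csv.DictWriter(
        handle,
        fieldnames=fields,
        restval="",             # fill fields missing from a record with ''
        extrasaction="ignore",  # drop keys that are not in the field list
    )
    writer.writeheader()
    writer.writerows(records)


def save_companies(records, path=OUTPUT_FILE, fields=FIELDS):
    """Write records to a CSV file on disk."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        write_companies(records, handle, fields)


# In-memory demonstration with a reduced field list.
buffer = io.StringIO()
write_companies(
    [{"name": "Acme Corp", "phone": "555-0100", "fax": "n/a"}],
    buffer,
    fields=["name", "phone", "email"],
)
csv_text = buffer.getvalue()
```

With this layout, adding or removing a column only requires editing the field list, and `csv.DictWriter` handles records that have extra or missing keys.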

Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, please feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License.
