Running this project locally requires Docker, Docker Compose, and Serverless Offline.
This is a web scraper that collects Massachusetts housing data from the Massachusetts Assessor’s Online Database using cheerio.
Currently, the scraper only prints the heating data for each household. The goal of the project is to collect as much information as possible and store it in a Mongo database, so that it can be made publicly accessible through an API.
The web scraper starts at the Massachusetts Assessor’s Online Database and visits every town listed there. On each town's individual page, streets are indexed alphabetically, so the scraper goes through each letter from A to Z and visits every street listed under that letter. Each street page then shows which homes are on that street.
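The traversal described above can be sketched as a set of nested loops. This is only an illustration, not the project's actual code: `walkHomes`, `collectHomes`, and the `site` adapter with its `listTowns`, `listStreets`, and `listHomes` methods are hypothetical names, and each method is assumed to fetch one page and parse it (for example with cheerio) into a plain array. The stub data is made up.

```javascript
// Letters used to index streets on each town page.
const LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');

/**
 * Walk towns -> letters -> streets -> homes and yield one record per home.
 * `site` is a hypothetical adapter; each method is assumed to fetch a page
 * and return the parsed entries as an array.
 */
async function* walkHomes(site) {
  for (const town of await site.listTowns()) {
    for (const letter of LETTERS) {
      for (const street of await site.listStreets(town, letter)) {
        for (const home of await site.listHomes(town, street)) {
          yield { town, street, home };
        }
      }
    }
  }
}

// In-memory stub standing in for real network requests (sample data only).
const stubSite = {
  listTowns: async () => ['Amherst'],
  listStreets: async (town, letter) => (letter === 'M' ? ['Main St'] : []),
  listHomes: async (town, street) => ['12 Main St', '14 Main St'],
};

// Drain the generator into an array of { town, street, home } records.
async function collectHomes(site) {
  const homes = [];
  for await (const record of walkHomes(site)) homes.push(record);
  return homes;
}
```

Keeping the page fetching behind an adapter like this means the traversal order can be exercised against stub data without touching the real site.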
The pages are not consistently formatted (inconsistent spelling and table layouts, missing data, etc.).
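One way to cope with inconsistent labels and missing values is to map every label variant to a canonical field name and fill absent fields with `null`. The sketch below assumes the table rows have already been extracted as `[label, value]` pairs. The label variants, the field names `heatType` and `heatFuel`, and the function names are all hypothetical and would need to be replaced with whatever the real pages contain.

```javascript
// Hypothetical label variants mapped to one canonical field name.
const LABEL_ALIASES = new Map([
  ['heat type', 'heatType'],
  ['heating type', 'heatType'],
  ['heat fuel', 'heatFuel'],
  ['heating fuel', 'heatFuel'],
]);

// Canonical fields every record should carry, even when the page omits them.
const FIELDS = ['heatType', 'heatFuel'];

/** Lowercase a label, drop a trailing colon, and collapse whitespace. */
function cleanLabel(label) {
  return label.toLowerCase().replace(/:\s*$/, '').replace(/\s+/g, ' ').trim();
}

/**
 * Turn [label, value] pairs scraped from a table into a record with a fixed
 * set of keys. Unknown labels are ignored; missing or empty values stay null.
 */
function normalizeRows(rows) {
  const record = Object.fromEntries(FIELDS.map((field) => [field, null]));
  for (const [label, value] of rows) {
    const field = LABEL_ALIASES.get(cleanLabel(label));
    const text = (value ?? '').trim();
    if (field && text !== '') record[field] = text;
  }
  return record;
}
```

Because every record ends up with the same keys, later steps such as writing to the database do not need to special-case pages with missing tables.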
Traversing the site page by page and collecting the data is also extremely slow.
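A possible mitigation, not something the project is stated to do, is to fetch several pages at once while capping the number of requests in flight so the remote server is not overwhelmed. The helper below is a minimal sketch of that idea; `mapWithLimit` is a hypothetical name, and `worker` stands for whatever function fetches and parses a single page.

```javascript
/**
 * Run `worker` over `items` with at most `limit` calls in flight at once.
 * Results are returned in the same order as `items`.
 */
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start at least one runner, and never more than there are items.
  const count = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: count }, () => runner()));
  return results;
}
```

For example, `mapWithLimit(streetUrls, 4, fetchStreet)` would process a list of street pages four at a time, where `streetUrls` and `fetchStreet` are placeholders for the caller's own data and page handler.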