ferxohn/document_segmentation

Document Segmentation

By: Fernando Gomez Perera

An academic project to segment approximately 1,000 academic documents collected by web scraping with BeautifulSoup4 in Python. The segmentation process uses K-Means clustering and NLP techniques, implemented with Scikit-Learn and spaCy in Python.

The project is written in Spanish.
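
To illustrate the K-Means step described above, here is a minimal sketch with Scikit-Learn. It is not the project's actual pipeline: the toy corpus, the `cluster_documents` helper, the choice of TF-IDF features, and the English stop-word list are all assumptions made for illustration, and the spaCy preprocessing is left out so the example runs without a language model.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the scraped academic documents.
documents = [
    "neural networks learn image features",
    "deep neural networks classify image data",
    "soil nutrients affect crop yield",
    "crop yield depends on soil moisture",
]


def cluster_documents(texts, n_clusters=2, seed=0):
    """Return one K-Means cluster label per text.

    TF-IDF vectors are an assumed feature representation; the project
    may build its features differently.
    """
    # Turn each text into a sparse TF-IDF vector, dropping English stop words.
    vectorizer = TfidfVectorizer(stop_words="english")
    features = vectorizer.fit_transform(texts)

    # Run K-Means from 10 initializations and keep the lowest-inertia result.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(features).tolist()


labels = cluster_documents(documents)
print(labels)
```

Documents that share vocabulary should receive the same label. The label numbers themselves are arbitrary, so only the grouping is meaningful. For a Spanish corpus, the stop-word list and any lemmatization would need to come from a Spanish resource such as a spaCy Spanish model.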