CarlosLopezRoa/SHAN

SHAN

## Scalable Search and Web Crawling

The objective of this work was to apply information retrieval concepts to build a scalable framework for crawling unstructured documents on the web, indexing them, and retrieving them. Our case study used Wikipedia as the crawl and index target. After crawling and indexing, a GUI deployed in the cloud displays the results and lets the user run personalised queries. Shan (山) is the Chinese character for mountain. The name can also be read as an acronym formed from the first letters of the components: Solr, Hadoop, Apache Nutch.
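
To illustrate the index-then-query idea behind the pipeline, here is a minimal sketch of an inverted index in plain Python. It is not the project's code and it does not use Solr, Hadoop, or Nutch. The function names (`tokenize`, `build_index`, `search`) and the sample pages are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


# Hypothetical stand-ins for crawled pages (assumption, not real crawl output).
pages = {
    "Mountain": "A mountain is an elevated portion of the Earth's crust.",
    "Web_crawler": "A web crawler systematically browses the web for indexing.",
    "Search_engine": "A search engine uses a web crawler and an index.",
}

index = build_index(pages)
print(search(index, "web crawler"))
```

A production search server would add ranking, text analysis, and distributed storage on top of this idea; the sketch only shows the term-to-document lookup that makes retrieval fast.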
