
A full-text search engine with content-based ranking and the PageRank algorithm.


mandymchu/search-engine


Search Engine and Page Rank

Applied the PageRank algorithm in a full-text search engine that lets people search a collection of up to 100,000 pages for a list of words and ranks the results by how relevant each document is to those words.

• Built a crawler that collects documents and follows their links to other documents. (Beautiful Soup, urllib used)

• Set up a database for building the full-text index. The index lists every distinct word along with the documents it appears in and its locations within those documents. (SQLite, sqlite3 used)

• Returned a ranked list of documents for a query, using content-based ranking and the PageRank algorithm.

• Created a neural network that reorders results by learning to associate searches with results based on people’s clicking habits.
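
To illustrate the PageRank step mentioned above, here is a minimal sketch of the standard iterative formulation. It is not taken from this repository: the function name `pagerank`, the `links` dictionary layout, the damping factor of 0.85, and the handling of pages without outgoing links are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=20):
    """Estimate PageRank scores by repeated redistribution of rank.

    `links` (hypothetical layout) maps each page to the list of pages
    it links to. Returns a dict mapping each page to its score; the
    scores sum to 1.
    """
    # Collect every page, including ones that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the rank spread evenly over all pages.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Assumption: rank held by pages with no outgoing links is
        # shared evenly among all pages so that no rank is lost.
        dangling = sum(ranks[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {page: base for page in pages}

        # Each page passes an equal share of its rank along every link.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * ranks[source] / len(targets)
            for target in targets:
                new_ranks[target] += share

        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    # A tiny hand-made link graph, for illustration only.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

In a search engine like the one described here, a score of this kind would typically be combined with the content-based scores before sorting the results. How the two are weighted is a design choice that this sketch does not cover.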
