Web Crawler project

This project uses the jsoup library to extract data from HTML with the help of CSS selectors and HTML tags.
The library also provides a function that simplifies making HTTP requests.
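
A minimal sketch of both uses follows, assuming jsoup is on the classpath. The class name `LinkExtractor`, its method names, and the example.com URLs are hypothetical and are not taken from this repository's source. `extractLinks` parses an HTML string and selects anchors with the CSS selector `a[href]`; `fetchLinks` does the same after downloading a page with `Jsoup.connect`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
public class LinkExtractor {

    // Collect the absolute URL of every <a> element that has an href attribute.
    static List<String> collectLinks(Document doc) {
        List<String> links = new ArrayList<>();
        for (Element anchor : doc.select("a[href]")) {
            // absUrl resolves relative hrefs against the document's base URI.
            links.add(anchor.absUrl("href"));
        }
        return links;
    }

    // Parse HTML that is already in memory; baseUrl is used to resolve relative links.
    static List<String> extractLinks(String html, String baseUrl) {
        return collectLinks(Jsoup.parse(html, baseUrl));
    }

    // Download a page over HTTP with jsoup, then extract its links.
    // Not called in main, so the example runs without network access.
    static List<String> fetchLinks(String url) throws IOException {
        Document doc = Jsoup.connect(url).get();
        return collectLinks(doc);
    }

    public static void main(String[] args) {
        String html = "<html><body>"
                + "<a href=\"/about\">About</a>"
                + "<a href=\"https://example.org/x\">X</a>"
                + "</body></html>";
        // prints [https://example.com/about, https://example.org/x]
        System.out.println(extractLinks(html, "https://example.com/"));
    }
}
```

Resolving links to absolute URLs at extraction time keeps the list of pages to visit unambiguous, since relative links only make sense in the context of the page they came from.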

Research Info links

Web crawler:

Web crawler videos:

Crawling, Indexing, and Ranking
Web Crawling vs. Web Scraping | Oxylabs
Googlebot: SEO Mythbusting
What is a search index? [Basic search indexing]
How Google searches one document among Billions of documents quickly? [In depth search indexing]

Robots.txt file:

The robots.txt file explained and illustrated
Robots.txt File [2020 Examples]

Proxy:

Proxy videos:

What is a Proxy?
Proxy Types | Oxylabs [Basic explanation]

Libraries:

jsoup Java HTML Parser, with the best of HTML5 DOM methods and CSS selectors.
jsoup API overview
//----------------------------------------
OpenCSV

//---------------------

zick2/web-crawler
