- This project uses the Jsoup library to extract content from HTML with CSS selectors and HTML tags
- The library also provides a helper that simplifies making HTTP requests
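
As a rough illustration of the two points above, here is a minimal sketch of extracting links with a CSS selector and fetching a page with Jsoup. The class name `JsoupSketch`, the `div.listing` selector, and the sample HTML are assumptions made up for this example; they are not taken from the project code.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class JsoupSketch {

    // Parses an HTML string and collects the href of every link found
    // inside elements matching the (hypothetical) "div.listing" selector.
    static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> links = new ArrayList<>();
        for (Element a : doc.select("div.listing a[href]")) {
            links.add(a.attr("href"));
        }
        return links;
    }

    // Fetches and parses a live page in one call; Jsoup issues the HTTP GET itself.
    // The user agent string and timeout are illustrative values only.
    static Document fetch(String url) throws IOException {
        return Jsoup.connect(url)
                .userAgent("example-crawler")
                .timeout(10_000)
                .get();
    }

    public static void main(String[] args) {
        String html = "<html><body><div class=\"listing\">"
                + "<a href=\"/page-1\">One</a><a href=\"/page-2\">Two</a>"
                + "</div><a href=\"/ignored\">Other</a></body></html>";
        // Only the two links inside div.listing are collected.
        System.out.println(extractLinks(html));
    }
}
```

The same `select` call works on the `Document` returned by `fetch`, so parsing a local string and parsing a downloaded page share one extraction path.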
- What Is a Web Crawler? | How Web Spiders Work
- Web Scraping Without Getting Blocked by Anti Scraping Tools
- Crawling, Indexing, and Ranking
- Web Crawling vs. Web Scraping | Oxylabs
- Googlebot: SEO Mythbusting
- What is a search index? [Basic search indexing]
- How Google searches one document among Billions of documents quickly? [In depth search indexing]
- The robots.txt file explained and illustrated
- Robots.txt File [2020 Examples]
- What is a Proxy?
- Proxy Types | Oxylabs [Basic explanation]
- jsoup Java HTML Parser, with the best of HTML5 DOM methods and CSS selectors.
- Jsoup api overview
- OpenCSV
- Page content is loaded with JavaScript and Jsoup doesn't see it (Stack Overflow)
- Parsing JavaScript Generated Page with Jsoup and HtmlUnit
- HtmlUnit library
- HtmlUnit vs JSoup
- Getting Started
- Editor.md - Open source online Markdown editor. [An online editor for .md files; also includes some documentation]
- Mastering Markdown