Seed Sites for Search Engine Web Crawler
The lists below are ZIP archives. Extract them with any unzip tool, e.g. https://extract.me/, and have fun!
Top 1 Million Sites
http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip
https://ak.quantcast.com/quantcast-top-sites.zip
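If you would rather read a list straight from the downloaded archive instead of unzipping it by hand, a minimal Python sketch could look like the one below. It assumes the archive holds one CSV file whose rows look like `rank,domain`; this is an assumption, not something checked against every list above, so adjust the column handling if a file is laid out differently. The function name `load_seed_urls` and the `http://` prefix are illustrative choices only.

```python
import csv
import io
import zipfile


def load_seed_urls(zip_source, limit=None):
    """Return seed URLs built from the first CSV file inside a ZIP archive.

    Assumption: each data row is `rank,domain`. Rows whose first column
    is not a number (for example a header row) are skipped.
    """
    seeds = []
    with zipfile.ZipFile(zip_source) as archive:
        # Pick the first member that looks like a CSV file.
        csv_name = next(n for n in archive.namelist() if n.endswith(".csv"))
        with archive.open(csv_name) as raw:
            reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8"))
            for row in reader:
                if limit is not None and len(seeds) >= limit:
                    break
                if len(row) < 2 or not row[0].strip().isdigit():
                    continue
                # Turn the bare domain into a URL a crawler can fetch.
                seeds.append("http://" + row[1].strip() + "/")
    return seeds
```

For example, `load_seed_urls("top-1m.csv.zip", limit=1000)` would return the first 1000 entries of a local copy as crawler seeds.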
Top 10 Million Sites
https://www.domcop.com/files/top/top10milliondomains.csv.zip
Use https://pinetools.com/split-files to split the ZIP file into parts smaller than 25 MB for GitHub 😄
Later, use https://pinetools.com/join-files to join the parts back together!
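The same split and join steps can also be done locally. The sketch below is one possible way to do it in Python, not what the pinetools pages do internally. The names `split_file`, `join_files`, the `.partNNN` suffix, and the 24 MB chunk size are all assumptions chosen to stay under the 25 MB limit.

```python
from pathlib import Path

# Assumed chunk size: 24 MB, to stay under the 25 MB limit.
CHUNK_SIZE = 24 * 1024 * 1024


def split_file(path, chunk_size=CHUNK_SIZE):
    """Write <name>.part000, <name>.part001, ... next to the file.

    Returns the list of part paths in order.
    """
    path = Path(path)
    parts = []
    with path.open("rb") as source:
        index = 0
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            part = path.with_name(f"{path.name}.part{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_files(parts, output):
    """Concatenate the parts, in the given order, into the output file."""
    output = Path(output)
    with output.open("wb") as target:
        for part in parts:
            target.write(Path(part).read_bytes())
    return output
```

Calling `split_file("top10milliondomains.csv.zip")` on a local copy produces the numbered parts, and `join_files(sorted(parts), "top10milliondomains.csv.zip")` rebuilds the archive from them.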