forked from gocolly/colly

Elegant Scraper and Crawler Framework for Golang

lingdor/colly

This branch is 1 commit ahead of, 17 commits behind gocolly/colly:master.

Latest commit: f1303d0 · Sep 17, 2023

Colly

Lightning Fast and Elegant Scraping Framework for Gophers

Colly provides a clean interface to write any kind of crawler/scraper/spider.

With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.

Features

  • Clean API
  • Fast (>1k requests/sec on a single core)
  • Manages request delays and maximum concurrency per domain
  • Automatic cookie and session handling
  • Sync/async/parallel scraping
  • Caching
  • Automatic encoding of non-unicode responses
  • Robots.txt support
  • Distributed scraping
  • Configuration via environment variables
  • Extensions
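
To make the "maximum concurrency per domain" feature more concrete, the following is a minimal standard-library sketch of the idea: each host gets a fixed number of slots, and a request proceeds only if a slot is free. This is not Colly's implementation. The names `domainLimiter`, `tryAcquire`, `release`, and `hostOf` are hypothetical and exist only for this illustration; in Colly itself, limits are configured on the collector rather than written by hand.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
)

// domainLimiter caps the number of in-flight requests per host.
// Hypothetical helper for illustration only; not part of Colly.
type domainLimiter struct {
	mu    sync.Mutex
	limit int
	slots map[string]chan struct{}
}

func newDomainLimiter(limit int) *domainLimiter {
	return &domainLimiter{limit: limit, slots: make(map[string]chan struct{})}
}

// hostOf returns the host name of rawURL, or "" if it cannot be parsed.
func hostOf(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return u.Hostname()
}

// slotsFor returns the slot channel for a host, creating it on first use.
func (d *domainLimiter) slotsFor(host string) chan struct{} {
	d.mu.Lock()
	defer d.mu.Unlock()
	ch, ok := d.slots[host]
	if !ok {
		ch = make(chan struct{}, d.limit)
		d.slots[host] = ch
	}
	return ch
}

// tryAcquire reserves a slot for the URL's host without blocking.
// It reports false when the host already has `limit` requests in flight.
func (d *domainLimiter) tryAcquire(rawURL string) bool {
	select {
	case d.slotsFor(hostOf(rawURL)) <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees one slot for the URL's host; it is a no-op if none is held.
func (d *domainLimiter) release(rawURL string) {
	select {
	case <-d.slotsFor(hostOf(rawURL)):
	default:
	}
}

func main() {
	lim := newDomainLimiter(2)
	fmt.Println(lim.tryAcquire("http://go-colly.org/a"))   // true
	fmt.Println(lim.tryAcquire("http://go-colly.org/b"))   // true
	fmt.Println(lim.tryAcquire("http://go-colly.org/c"))   // false: host is at its limit
	fmt.Println(lim.tryAcquire("http://example.com/index")) // true: different host
}
```

A real crawler would usually block or queue instead of returning false, and would combine the slot check with a per-host delay between requests.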

Example

package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a[href]", func(e *colly.HTMLElement) {
		e.Request.Visit(e.Attr("href"))
	})

	// Log each request before it is sent
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})

	c.Visit("http://go-colly.org/")
}
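
The `href` values collected in the example above are often relative (for instance `/docs/`), so they have to be resolved against the URL of the page they were found on before they can be fetched. The sketch below shows that resolution step using only the standard library. `resolveLink` is a hypothetical helper for illustration and not part of Colly's API; the example above does not need it, since it passes the raw attribute straight to `e.Request.Visit`.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink turns an href found on a page into an absolute URL,
// using the page's own URL as the base.
// Hypothetical helper for illustration only; not part of Colly.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	// ResolveReference follows the RFC 3986 rules for relative references.
	return base.ResolveReference(ref).String(), nil
}

func main() {
	hrefs := []string{"/docs/", "articles/", "https://github.com/gocolly/colly"}
	for _, href := range hrefs {
		abs, err := resolveLink("http://go-colly.org/", href)
		if err != nil {
			fmt.Println("skipping", href, "-", err)
			continue
		}
		fmt.Println(abs)
	}
}
```

Links that are already absolute pass through unchanged, which is why a crawler that follows every link typically also needs some form of domain filtering to stay on the intended site.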

See the examples folder for more detailed examples.

Installation

Add colly to your go.mod file:

module github.com/x/y

go 1.14

require (
        github.com/gocolly/colly/v2 latest
)

Bugs

Bugs or suggestions? Visit the issue tracker or join #colly on freenode.

Other Projects Using Colly

Below is a list of public, open source projects that use Colly:

If you are using Colly in a project, please send a pull request to add it to the list.

Contributors

This project exists thanks to all the people who contribute. [Contribute].

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]
