victormartinez/ferret

Quick Start

The library is straightforward to use:

from ferret.main import Ferret

ferret = Ferret(url='http://g1.globo.com/politica/blog/cristiana-lobo/post/setor-de-propina-da-odebrecht-movimentou-us-33-bi-diz-delator.html')

ferret.get_article()

Ferret also accepts two optional arguments, html and lang, which can be passed separately or together:

from ferret.main import Ferret

ferret = Ferret(
    url='http://g1.globo.com/politica/blog/cristiana-lobo/post/setor-de-propina-da-odebrecht-movimentou-us-33-bi-diz-delator.html',
    html='<html><head></head><body><h1>Título da página de notícias</h1></body></html>',
    lang='pt',
)

ferret.get_article()
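
Since html and lang are optional, a small wrapper can forward them only when the caller actually has them. The sketch below is one possible way to do that, not part of Ferret itself: the helper names ferret_kwargs and extract_article are hypothetical, and it assumes that leaving out an optional argument behaves the same as not supplying it. Only the Ferret constructor arguments and get_article() come from the examples above.

```python
from typing import Optional


def ferret_kwargs(url: str, html: Optional[str] = None, lang: Optional[str] = None) -> dict:
    """Build keyword arguments for Ferret, omitting optional values that were not given.

    Hypothetical helper: not part of the ferret package.
    """
    kwargs = {"url": url}
    if html is not None:
        kwargs["html"] = html
    if lang is not None:
        kwargs["lang"] = lang
    return kwargs


def extract_article(url: str, html: Optional[str] = None, lang: Optional[str] = None):
    """Create a Ferret instance and return whatever get_article() returns.

    Hypothetical helper: the return type is not documented here, so it is passed through as-is.
    """
    # Imported lazily so this module can be loaded even where ferret is not installed.
    from ferret.main import Ferret

    return Ferret(**ferret_kwargs(url, html=html, lang=lang)).get_article()
```

With this in place, extract_article(url) and extract_article(url, lang='pt') both go through the same code path, and the page HTML can be passed in later if it has already been downloaded elsewhere.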

Contribute

License

Authored and maintained by Victor Martinez.

Ferret uses a small routine inherited from python-goose's code. See their license here.

Credits to Logo

The logo image is copyrighted by skaterjob at Vecteezy.

About

A modern pythonic lib to extract data from news pages