yasszu/finagle-web-crawler

Latest commit: 3a6cba6 · Aug 14, 2018 (46 commits)
Web Crawler with Finagle

Scrapes the blogs exposed by the endpoints below: `googleblog/developers`, `googleblog/developers_jp`, and `android`.

Getting Started

CREATE DATABASE IF NOT EXISTS `crawler` 
DEFAULT CHARACTER SET = utf8mb4
COLLATE utf8mb4_unicode_ci;

Run application

$ sbt 'run-main app.Server -db.host localhost'

Run with Docker

$ sbt docker:publishLocal
$ docker-compose build
$ docker-compose up

MySQL container shell

$ docker exec -it finagle-web-crawler_db_1 bash

Deploy fat JAR

  • Create a JAR file
$ sbt assembly

  • Run process
$ java -jar target/scala-2.12/finagle-web-crawler-assembly-1.0-SNAPSHOT.jar -db.host='localhost'

Feed

GET feed/googleblog/developers

  • Example
$ curl -X GET 'http://localhost:8080/feed/googleblog/developers'

GET feed/googleblog/developers_jp

  • Example
$ curl -X GET 'http://localhost:8080/feed/googleblog/developers_jp'

API

GET api/googleblog/developers

  • Example
$ curl -X GET 'http://localhost:8080/api/googleblog/developers?count=5&page=0'

GET api/googleblog/developers_jp

  • Example
$ curl -X GET 'http://localhost:8080/api/googleblog/developers_jp?count=5&page=0'

GET api/developers/android

  • Example
$ curl -X GET 'http://localhost:8080/api/developers/android?count=5&page=0'
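The API examples above all take `count` and `page` query parameters. A minimal sketch for walking several pages follows. It assumes `count` is the page size and `page` is a zero-based index (inferred from the `page=0` examples, not confirmed by this README). `BASE_URL`, `api_url`, and `fetch_pages` are hypothetical helper names, not part of the project.

```shell
# Base URL taken from the README examples; override BASE_URL for other hosts.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# Build a paginated API URL.
# Usage: api_url <blog path> <count> <page>
api_url() {
  printf '%s/api/%s?count=%s&page=%s\n' "$BASE_URL" "$1" "$2" "$3"
}

# Fetch pages 0..(pages-1) and print each response body.
# Assumption: page numbering starts at 0, as in the README examples.
# Usage: fetch_pages <blog path> <count> <pages>
fetch_pages() {
  _page=0
  while [ "$_page" -lt "$3" ]; do
    # -f makes curl fail on HTTP errors; -sS hides progress but keeps errors.
    curl -fsS "$(api_url "$1" "$2" "$_page")" || return 1
    echo
    _page=$((_page + 1))
  done
}
```

With the server running, `fetch_pages googleblog/developers 5 3` would request pages 0, 1, and 2 with five entries each.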

Run scraping manually

GET scrape/googleblog/developers

  • Example
$ curl -X GET 'http://localhost:8080/scrape/googleblog/developers'

GET scrape/googleblog/developers_jp

  • Example
$ curl -X GET 'http://localhost:8080/scrape/googleblog/developers_jp'

GET scrape/googleblog/android

  • Example
$ curl -X GET 'http://localhost:8080/scrape/googleblog/android'
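To trigger all three scrape endpoints in one go, the calls above can be wrapped in a small loop. This is a sketch only: the paths are the ones listed in this README, while `BASE_URL`, `SCRAPE_PATHS`, `scrape_url`, and `scrape_all` are hypothetical names, and the response body is discarded because its format is not documented here.

```shell
# Base URL taken from the README examples; override BASE_URL for other hosts.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# Blog paths listed under the scrape endpoints in this README.
SCRAPE_PATHS="googleblog/developers googleblog/developers_jp googleblog/android"

# Build a scrape URL.
# Usage: scrape_url <blog path>
scrape_url() {
  printf '%s/scrape/%s\n' "$BASE_URL" "$1"
}

# Call every scrape endpoint and report which ones succeeded.
# Returns non-zero if any request failed.
scrape_all() {
  _failed=0
  for _blog in $SCRAPE_PATHS; do
    if curl -fsS -o /dev/null "$(scrape_url "$_blog")"; then
      echo "ok      $_blog"
    else
      echo "failed  $_blog"
      _failed=1
    fi
  done
  return "$_failed"
}
```

Running `scrape_all` against a live server prints one status line per blog, which makes it easy to call from a cron job or a deploy script.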

About

Web Crawler of Google Developers Blog with Finagle
