Data Science in Apache Spark

Exploring the Global Terrorism Database Dataset

Level: Moderate

Language: Scala

Requirements:

Author: Ian Brooks

Follow on LinkedIn: Ian Brooks, PhD

Context

Information on more than 150,000 Terrorist Attacks

The Global Terrorism Database (GTD) is an open-source database including information on terrorist attacks around the world from 1970 through 2015 (with annual updates planned for the future). The GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 150,000 cases. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland. More Information

Instructions

  1. Using the provided link, download the Global Terrorism Database CSV file from Kaggle. Note: You will need a Kaggle account.

  2. Using the provided link, download the Zeppelin Note.

  3. Upload the GTD CSV file.

  4. In Zeppelin, import the Zeppelin Note JSON file. For assistance, please use the following tutorial
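
Once the note is in Zeppelin, a quick sanity check that the CSV from step 3 is readable can save time before running the full analysis. The following is a minimal sketch, not the note's actual code: the path `/tmp/globalterrorismdb.csv` and the app name are hypothetical placeholders, and it assumes a Spark 2.x or later interpreter (Zeppelin or `spark-shell`) where a Spark session is already configured.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Zeppelin's Spark interpreter and spark-shell normally predefine a session;
// getOrCreate() reuses that session instead of starting a second one.
val spark = SparkSession.builder().appName("gtd-exploration").getOrCreate()

// Hypothetical location: replace with wherever the CSV was uploaded in step 3.
val gtdPath = "/tmp/globalterrorismdb.csv"

// Read the file with its header row as column names and let Spark infer column types.
val gtd: DataFrame = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(gtdPath)

// Inspect the inferred columns and confirm that rows were loaded.
gtd.printSchema()
println(s"Rows loaded: ${gtd.count()}")
```

If the schema prints with the expected column names and the row count is non-zero, the upload worked and the note's paragraphs should be able to find the data, provided they point at the same path.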

License

Unlike other Apache projects, which use the Apache License, this project uses an advanced and modern license named The Star And Thank Author License (SATA). Please see the LICENSE file for more information.