Learning-BigData-Hadoop

######################################

######################################

Go to https://www.virtualbox.org and download the version for your OS. (I have tried this with Ubuntu.)
After the download, run the installer (for example, the .exe file on Windows) and install VirtualBox on your machine.

Go to https://www.cloudera.com/downloads.html and click on the Download Hortonworks Sandbox link.
Click on the 'Download Now' button for Hortonworks HDP.
Choose "Virtual Box" as the installation type in the Get Started Now section and click on the 'Let's Go' button.
Fill in the details and click on 'Continue' and then the 'Submit' button.
Download the 2.5.0 version of the HDP Sandbox from the VirtualBox downloads.
Open the downloaded .ova file in VirtualBox.
Click on 'Import'.
Click on 'Start' to start the machine.
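
If you prefer the command line to the VirtualBox GUI, the import and start steps can also be scripted. The following is only a minimal sketch, assuming the `VBoxManage` tool that ships with VirtualBox is on your PATH. The file name `HDP_2.5_virtualbox.ova` and the VM name `hdp-sandbox` are hypothetical placeholders, so replace them with the name of the file you downloaded and a VM name of your choice.

```python
import shutil
import subprocess
from pathlib import Path


def build_vbox_commands(ova_path, vm_name):
    """Return the VBoxManage commands that import an appliance and start it."""
    return [
        # Import the .ova file and give the first virtual system a known name.
        ["VBoxManage", "import", str(ova_path), "--vsys", "0", "--vmname", vm_name],
        # Start the imported machine without opening a GUI window.
        ["VBoxManage", "startvm", vm_name, "--type", "headless"],
    ]


if __name__ == "__main__":
    # Hypothetical names: adjust both to match your own download.
    ova_file = Path("HDP_2.5_virtualbox.ova")
    vm_name = "hdp-sandbox"

    if shutil.which("VBoxManage") is None or not ova_file.exists():
        print("VBoxManage or the .ova file was not found; nothing was run.")
    else:
        for command in build_vbox_commands(ova_file, vm_name):
            subprocess.run(command, check=True)
```

Use `--type gui` instead of `--type headless` if you want the usual VirtualBox window to open.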

You can get the data directly from here: https://github.com/Kavita-Yadav/Learning-Hadoop-and-bigData/tree/master/MovieLensData.
It includes a detailed description of the data from GroupLens.
```
 OR
```
Go to https://grouplens.org/ (direct link for the 100k dataset: https://grouplens.org/datasets/movielens/100k/).
Download the MovieLens 100k dataset by downloading the 'ml-100k.zip' data file.
Unzip 'ml-100k.zip'.
You can also try this with the 1M, 10M and 20M datasets from https://grouplens.org/datasets/movielens/.
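
To check that the data unzipped correctly, you can take a quick look at the ratings before loading them into the cluster. The following is a minimal sketch, assuming the archive was extracted to a folder named `ml-100k` in the current directory and that `u.data` holds tab-separated lines of user id, movie id, rating and timestamp, as the dataset's README describes. The function names are illustrative and not part of this repository.

```python
import csv
from collections import Counter
from pathlib import Path


def parse_ratings(lines):
    """Yield (user_id, movie_id, rating, timestamp) tuples from u.data lines."""
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        user_id, movie_id, rating, timestamp = (int(value) for value in row)
        yield user_id, movie_id, rating, timestamp


def rating_histogram(ratings):
    """Count how many times each star rating occurs."""
    return Counter(rating for _, _, rating, _ in ratings)


if __name__ == "__main__":
    # Assumed location of the unzipped dataset; change it if yours differs.
    data_file = Path("ml-100k") / "u.data"

    if data_file.exists():
        with data_file.open(newline="") as handle:
            histogram = rating_histogram(parse_ratings(handle))
        for stars in sorted(histogram):
            print(stars, histogram[stars])
    else:
        print("ml-100k/u.data was not found; unzip ml-100k.zip first.")
```

The 1M, 10M and 20M datasets use different file names and delimiters, so the parser would need adjusting for them.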

1. HDFS-and-MapReduce
2. Hadoop-with-Pig
3. Hadoop-with-Spark
4. RelationalDataStoresWithHadoop
5. NonRelationalDataStoresWIthHadoop
6. Query_Engines
7. ClusterManagement
8. Feeding-data-to-cluster
9. AnalyzingDataStreams
Images
MovieLensData
README.md
