Exploration-of-Youtube-Statistics-Data-using-Hadoop-Technologies
This repository includes:
1) Python code to extract data from YouTube using the YouTube API.
2) R code to clean and merge the datasets downloaded from Kaggle (https://www.kaggle.com/datasnaek/youtube-new).
3) Code to clean the data downloaded through the YouTube API.
4) Code to clean and merge the dataset created with the YouTube API and the Kaggle YouTube dataset.
5) YouTube data analysis using big data technologies such as Pig, Hive, and Spark.
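
As a rough illustration of the extraction step (item 1), the sketch below builds a request URL for the `videos` endpoint of the YouTube Data API v3 and flattens one response item into a row that could be written to CSV. It is not the repository's actual script: the function names `build_request_url` and `flatten_item`, the choice of the `mostPopular` chart, and the selected output columns are assumptions made for this example.

```python
from urllib.parse import urlencode

# Public endpoint of the YouTube Data API v3 "videos" resource.
API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_request_url(api_key, region_code="US", max_results=50):
    """Build a URL asking for the most popular videos in one region.

    Hypothetical helper: the repository may query the API differently.
    """
    params = {
        "part": "snippet,statistics",
        "chart": "mostPopular",
        "regionCode": region_code,
        "maxResults": max_results,
        "key": api_key,
    }
    return API_URL + "?" + urlencode(params)


def to_int(value):
    """The API returns counts as strings; a count may be absent if hidden."""
    return int(value) if value is not None else None


def flatten_item(item):
    """Turn one entry of the response's "items" list into a flat row.

    The column names here are illustrative, not the repository's schema.
    """
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    return {
        "video_id": item.get("id"),
        "title": snippet.get("title"),
        "channel_title": snippet.get("channelTitle"),
        "category_id": snippet.get("categoryId"),
        "publish_time": snippet.get("publishedAt"),
        "views": to_int(stats.get("viewCount")),
        "likes": to_int(stats.get("likeCount")),
        "comment_count": to_int(stats.get("commentCount")),
    }
```

To use it, fetch the URL with any HTTP client and a valid API key, parse the JSON body, and call `flatten_item` on each element of its `items` list. Keeping the flattening separate from the network call makes the cleaning logic easy to check against a saved response.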