• Stars
    1
  • Language
    Java
  • Created over 5 years ago
  • Updated over 5 years ago

Repository Details

Analyzed big data on crime in Washington, DC to answer the following business queries:
> Which hour has the highest crime count?
> Which shift has the highest crime count?
> Year-wise crime count
> Hour-wise crime count
> Crime count by offense
> Average shift-wise crime count

The data was initially stored in MySQL and then moved to HDFS using Sqoop, where four MapReduce jobs were run in Java from the Eclipse IDE. The outputs of these queries were then moved to HBase using Sqoop. Two more MapReduce operations were done using Pig, and their output was also moved to HBase using Sqoop. All outputs were then copied to the local system and visualized using RStudio and Tableau.

Tools used:
> MySQL, HDFS and HBase to store the data
> Sqoop to move the data from one data store to another
> Java (Eclipse IDE) and Pig to run the MapReduce queries
> RStudio for data pre-processing and visualization
> Tableau for visualization
> LaTeX for documentation
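
The counting queries described above all follow the same map-then-reduce shape: extract a key (hour, shift, year, offense) from each record, then sum the records per key. The sketch below illustrates that shape in plain Java, without Hadoop, so it can run on its own. It is not the repository's code. The record layout ("yyyy-MM-dd HH:mm:ss,SHIFT,OFFENSE") and the names CrimeCountSketch, hourOf, shiftOf, countBy and keyWithHighestCount are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

final class CrimeCountSketch {

    // "Map" step for the hour queries.
    // Assumed (hypothetical) record layout: "yyyy-MM-dd HH:mm:ss,SHIFT,OFFENSE".
    static int hourOf(String line) {
        String timestamp = line.split(",")[0];
        // Characters 11-12 of the timestamp hold the two-digit hour.
        return Integer.parseInt(timestamp.substring(11, 13));
    }

    // "Map" step for the shift queries: the second field of the assumed layout.
    static String shiftOf(String line) {
        return line.split(",")[1];
    }

    // "Reduce" step: group records by the extracted key and count each group.
    // A TreeMap keeps the keys sorted, which is convenient for reporting.
    static <K> Map<K, Long> countBy(List<String> lines, Function<String, K> keyOf) {
        return lines.stream()
                .collect(Collectors.groupingBy(keyOf, TreeMap::new, Collectors.counting()));
    }

    // Answers "which hour/shift has the highest crime count?" from the counts.
    static <K> K keyWithHighestCount(Map<K, Long> counts) {
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalArgumentException("no records"));
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from the actual data set.
        List<String> rows = Arrays.asList(
                "2016-03-14 22:15:00,EVENING,THEFT",
                "2016-03-14 22:40:00,EVENING,ROBBERY",
                "2016-03-15 03:05:00,MIDNIGHT,BURGLARY",
                "2017-07-01 14:30:00,DAY,THEFT");

        Map<Integer, Long> byHour = countBy(rows, CrimeCountSketch::hourOf);
        Map<String, Long> byShift = countBy(rows, CrimeCountSketch::shiftOf);

        System.out.println("Hour-wise counts:  " + byHour);
        System.out.println("Shift-wise counts: " + byShift);
        System.out.println("Busiest hour:  " + keyWithHighestCount(byHour));
        System.out.println("Busiest shift: " + keyWithHighestCount(byShift));
    }
}
```

In an actual Hadoop job the key extraction would presumably live in a Mapper and the summation in a Reducer; the sketch only mirrors that division of work in memory.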