Data Algorithms Book
- Author: Mahmoud Parsian ([email protected])
- Title: Data Algorithms: Recipes for Scaling up with Hadoop and Spark
- This GitHub repository hosts all source code and scripts for the Data Algorithms book.
- Publisher: O'Reilly Media
- Published date: July 2015
Git Repository
The book's codebase can also be downloaded from the git repository at:
git clone https://github.com/mahmoudparsian/data-algorithms-book.git
2nd Edition! Coming Out at the End of 2021
Upgraded to Spark-3.1.2
Production Version is Available NOW!
Java 8's LAMBDA Expressions to Spark...
Scala Spark Solutions
How To Build using Apache's Ant
How To Build using Apache's Maven
Machine Learning Algorithms using Spark
Spark for Cancer Outlier Profile Analysis
Webinars and Presentations on Data Algorithms
Introduction to MapReduce
Bonus Chapters
Author Book Signing
How To Run Spark/Hadoop Programs
Submit a Spark Job from Java Code
How To Run Python Programs
To run Python programs, call them with spark-submit,
followed by the arguments to the program.
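As a rough illustration of that calling convention, the sketch below assembles a spark-submit command line: the script path comes first, followed by the program's own arguments. The helper names (`build_spark_submit_command`, `run_spark_submit`), the script `word_count.py`, and the input/output paths are hypothetical and not part of this repository. `--master` is a standard spark-submit option, shown here only as an optional extra.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_spark_submit_command(
    script: str,
    program_args: Sequence[str] = (),
    master: Optional[str] = None,
) -> List[str]:
    """Return the argv list for running `script` through spark-submit."""
    command = ["spark-submit"]
    if master is not None:
        # Optional: choose where the job runs, e.g. "local[*]" or "yarn".
        command += ["--master", master]
    # The script path comes first, then the program's own arguments.
    command.append(script)
    command.extend(program_args)
    return command


def run_spark_submit(command: Sequence[str]) -> int:
    """Run the command and return its exit code.

    Assumes spark-submit is installed and available on PATH.
    """
    return subprocess.run(list(command), check=False).returncode


if __name__ == "__main__":
    # Hypothetical script name and paths, used only for illustration.
    cmd = build_spark_submit_command(
        "word_count.py", ["input.txt", "output_dir"], master="local[*]"
    )
    # Print a shell-quoted form of the command instead of executing it.
    print(shlex.join(cmd))
```

Typing the same command directly in a shell works just as well. The helper only makes the argument order explicit and is convenient when a job has to be launched from another Python script.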
My favorite quotes...
Questions/Comments
- View Mahmoud Parsian's profile on LinkedIn
- Please send me an email: [email protected]
- Twitter: @mahmoudparsian
Thank you!
Best regards,
Mahmoud Parsian