Data Science Roadmap
A self-learning Data Science curriculum.
About
This repository is intended to provide a complete Data Science learning path for anyone interested in learning Data Science. I gave preference to free resources; however, some valuable paid courses are also included.
Explanation
- 📺 Video content.
- 💵 Paid content.
- 📰 Online article.
- 🐙 GitHub repo.
Content
- Statistics & Probability
- Linear Algebra
- Python Programming
- Numpy
- Pandas
- Matplotlib
- To-Do
- FAQ
- Contribution guideline
Statistics & Probability
Descriptive Statistics
Probability
- 📺 Theoretical probability
- 📺 Sample spaces
- 📺 Set operations
- 📺 Addition rule
- 📺 Multiplication rule for independent events
- 📺 Multiplication rule for dependent events
- 📺 Conditional probability and independence
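If you want to check these rules by hand, here is a minimal sketch (not taken from the linked videos) that enumerates the sample space of two fair dice. The events `sum_is_8` and `first_is_even` are made up for illustration; the code only assumes equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))


def probability(event):
    """P(event) as favorable outcomes / total outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))


# Two illustrative events (hypothetical, chosen only for this example).
def sum_is_8(outcome):
    return outcome[0] + outcome[1] == 8


def first_is_even(outcome):
    return outcome[0] % 2 == 0


p_a = probability(sum_is_8)
p_b = probability(first_is_even)
p_a_and_b = probability(lambda o: sum_is_8(o) and first_is_even(o))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

# A and B are independent only if P(A and B) == P(A) * P(B).
independent = p_a_and_b == p_a * p_b

print(p_a, p_b, p_a_or_b, p_a_given_b, independent)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the fractions worked out on paper.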
Combinations and Permutations
- 📺 Counting principle and factorial
- 📺 Permutations
- 📺 Combinations
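As a quick self-check for these topics, the sketch below counts ordered and unordered selections with the standard library and compares the formulas against brute-force enumeration. It assumes Python 3.8 or later (for `math.perm` and `math.comb`); the five-letter list is just an illustrative example.

```python
import math
from itertools import combinations, permutations

items = ["A", "B", "C", "D", "E"]  # illustrative data
n, k = len(items), 2

# Factorial: number of ways to order all n items.
orderings = math.factorial(n)

# Permutations: ordered selections of k items, n! / (n - k)!
n_permutations = math.perm(n, k)

# Combinations: unordered selections of k items, n! / (k! * (n - k)!)
n_combinations = math.comb(n, k)

# Brute-force enumeration should agree with the formulas.
listed_permutations = list(permutations(items, k))
listed_combinations = list(combinations(items, k))

print(orderings, n_permutations, n_combinations)
```

Each combination of 2 items corresponds to 2! = 2 permutations, which is why the permutation count is exactly twice the combination count here.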
Distributions
- 📺 Normal distribution and the Empirical rule
- 📺 Introduction to Sampling Distributions
- 📺 Sampling distribution of a sample proportion
- 📺 Sampling distribution of a sample mean
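To see where the numbers in the Empirical rule come from, here is a small sketch using `statistics.NormalDist` from the standard library (Python 3.8+). The helper `mass_within` is a hypothetical name introduced only for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def mass_within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# The Empirical rule quotes roughly 68%, 95% and 99.7% for k = 1, 2, 3.
within = {k: mass_within(k) for k in (1, 2, 3)}

for k, mass in within.items():
    print(f"within {k} standard deviation(s): {mass:.4f}")
```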
Confidence Intervals
Hypothesis
- 📺 Hypothesis Testing
- 📺 Error probabilities and power
- 📺 Tests about a population proportion
- 📺 Tests about a population mean
Linear Algebra
Vectors and Spaces
- 📺 Vectors
- 📺 Linear Combinations and Spans
- 📺 Linear Dependence and Independence
- 📺 Subspaces and the basis for a subspace
Dot Product
Matrix Transformations
- 📺 Functions and Linear Transformations
- 📺 Transformations and Matrix Multiplications
- 📺 Inverse Functions and Transformations
- 📺 Inverses and Determinants
- 📺 Transpose of a Matrix
Eigenvalues and Eigenvectors
Integrals
- 📺 Approximation with Riemann Sums
- 📺 Definite Integrals with Riemann Sums
- 📺 The Fundamental Theorem of Calculus and Accumulation Functions
- 📺 Properties of Definite Integrals
- 📺 The Fundamental Theorem of Calculus and Definite Integrals
- 📺 Reverse Power Rule
- 📺 Indefinite Integrals of Common Functions
- 📺 Definite Integrals of Common Functions
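A short sketch that ties Riemann sums to the reverse power rule: it approximates the integral of x² over [0, 1] with a left Riemann sum and compares it with the exact value from the antiderivative x³/3. The function names and the choice of 1000 rectangles are assumptions made for illustration.

```python
def left_riemann_sum(f, a, b, n):
    """Approximate the integral of f over [a, b] with n left-endpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + i * width) for i in range(n)) * width


def square(x):
    return x * x


def square_antiderivative(x):
    # Reverse power rule: the antiderivative of x**2 is x**3 / 3.
    return x ** 3 / 3


a, b = 0.0, 1.0
approximation = left_riemann_sum(square, a, b, 1000)

# Fundamental Theorem of Calculus: integral = F(b) - F(a).
exact = square_antiderivative(b) - square_antiderivative(a)

print(approximation, exact, abs(approximation - exact))
```

Increasing the number of rectangles shrinks the gap between the approximation and the exact value.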
Python Programming
Basics
- 📰 Hello, World!
- 📰 Variables and Types
- 📰 Lists
- 📰 Basic Operators
- 📰 String Formatting
- 📰 Basic String Operations
- 📰 Conditions
- 📰 Loops
- 📰 Functions
- 📰 Classes and Objects
- 📰 Dictionaries
- 📰 Modules and Packages
Advanced
- 📰 Generators
- 📰 List Comprehensions
- 📰 Multiple Function Arguments
- 📰 Regular Expressions
- 📰 Exception Handling
- 📰 Sets
- 📰 Serialization
- 📰 Partial functions
- 📰 Code Introspection
- 📰 Closures
- 📰 Decorators
- 📰 Map, Filter, Reduce
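To get a feel for how several of these topics fit together, here is a compact sketch combining a generator, a decorator, a list comprehension, and `map` / `filter` / `reduce`. All names (`count_calls`, `evens`, `square`) are invented for this example and do not come from the linked articles.

```python
from functools import reduce, wraps


def count_calls(func):
    """Decorator: wrap func and record how many times it is called."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1          # counter stored as a function attribute
        return func(*args, **kwargs)  # func is captured from the enclosing scope
    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


def evens(limit):
    """Generator: lazily yield the even numbers below limit."""
    for n in range(0, limit, 2):
        yield n


# List comprehension over a generator.
squares_comprehension = [square(n) for n in evens(10)]

# The same result with map, then filter and reduce on top of it.
squares_map = list(map(square, evens(10)))
large_squares = list(filter(lambda value: value > 10, squares_map))
total = reduce(lambda acc, value: acc + value, squares_map, 0)

print(squares_comprehension, large_squares, total, square.calls)
```

The comprehension and the `map` call produce the same list; which one to use is mostly a matter of readability.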
More Resources
- 📰 Python 3 Tutorial
- 📺 Introduction to Python Video 1 or Video 2
Numpy
Basics
- 📰 An example
- 📰 Array Creation
- 📰 Printing Arrays
- 📰 Basic Operations
- 📰 Universal Functions
- 📰 Indexing, Slicing and Iterating
Shape Manipulation
- 📰 Changing the shape of an array
- 📰 Stacking together different arrays
- 📰 Splitting one array into several smaller ones
Copies and Views
- 📰 No Copy at All
- 📰 View or Shallow Copy
- 📰 Deep Copy
- 📰 Functions and Methods Overview
Less Basic
- 📰 Broadcasting rules
Advanced indexing and index tricks
- 📰 Indexing with Arrays of Indices
- 📰 Indexing with Boolean Arrays
- 📰 The ix_() function
- 📰 Indexing with strings
Linear Algebra
More Resources
- 📰 NumPy: the absolute basics for beginners
- 📰 NumPy Tutorial
- 📰 The Ultimate Beginner’s Guide to NumPy
- 📰 The Ultimate NumPy Tutorial for Data Science Beginners
- 📰 NumPy Tutorial: Your First Steps Into Data Science in Python
- 📰 101 NumPy Exercises for Data Analysis (Python)
- 📺 Complete Python NumPy Tutorial
- 🐙 Python Numpy Tutorial (with Jupyter and Colab)
- 🐙 100 numpy exercises
Pandas
- 📰 10 minutes to pandas
- 📰 Intro to data structures
- 📰 Essential basic functionality
- 📰 IO tools
- 📰 Indexing and selecting data
- 📰 MultiIndex / advanced indexing
- 📰 Merge, join, concatenate and compare
- 📰 Reshaping and pivot tables
- 📰 Working with text data
- 📰 Duplicate Labels
- 📰 Categorical data
- 📰 Nullable integer data type
- 📰 Nullable Boolean data type
- 📰 Visualization using pandas
- 📰 Computational tools
- 📰 Group by: split-apply-combine
- 📰 Windowing Operations
- 📰 Time series / date functionality
- 📰 Time deltas
- 📰 Styling
- 📰 Options and settings
- 📰 Cookbook
More Resources
- 📰 Learn Pandas Tutorials | Kaggle
- 📺 Python Pandas Tutorial
- 📺 Complete Python Pandas Data Science Tutorial
- 📰 101 Pandas Exercises for Data Analysis
- 🐙 pandas_exercises
Matplotlib
Matplotlib Official Tutorials
- 📰 Sample plots in Matplotlib
- 📰 Customizing Matplotlib with style sheets and rcParams
- 📰 Styling with cycler
- 📰 Legend guide
- 📰 Specifying Colors
- 📰 Annotations
Other Resources
- 📰 Introduction to Matplotlib - Data Visualization in Python
- 📰 Python Plotting With Matplotlib (Guide)
- 📰 Matplotlib Tutorial
- 📰 Python Graph Gallery
- 📺 Python Matplotlib Tutorial | Edureka
- 📺 Matplotlib tutorial | Simplilearn
To-Do
- Seaborn
- Exploratory Data Analysis (EDA)
- SQL
- Machine Learning Concepts
- Scikit-Learn
- Projects
- Translations into other languages
- Cheatsheets
FAQ
- Which programming languages should I use? Python and R. However, this roadmap currently includes materials only for Python.
- How to contribute? Check out the contribution guidelines.
Contribution guideline
You can open an issue and give your suggestions as to how I can improve this guide, or what I can do to improve the learning experience.
You can also fork this repo and send a pull request to fix any mistakes that you have found.
If you want to suggest a new resource, send a pull request adding that resource to the extras section. The extras section is a place where all of us can submit interesting additional articles, books, courses and specializations.