God-Level Data Science ML Full Stack

A collection of scientific methods, processes, algorithms, and systems to build stories & models. This roadmap contains 16 chapters and is meant for both freshers in the field and experienced professionals who want to transition into Data Science & AI.

The Roadmap is divided into 16 Sections

Duration: 256 Hours of Learning (8 Months) and many more hours for practice and project building.

  1. Python Programming and Logic Building
  2. Data Structure & Algorithms
  3. Pandas NumPy Matplotlib
  4. Statistics
  5. Machine Learning
  6. ML Operations
  7. Natural Language Processing
  8. Computer Vision
  9. Data Visualization with Tableau
  10. Structured Query Language (SQL)
  11. Data Engineering
  12. Data System Design
  13. Five Major Capstone Projects
  14. Interview Preparations
  15. Git & GitHub
  16. Personal Branding and Portfolio

Resources

Technology Stack

  • Python
  • Data Structures
  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn
  • Scikit-Learn
  • Statsmodels
  • Natural Language Toolkit (NLTK)
  • PyTorch
  • OpenCV
  • Tableau
  • Structured Query Language (SQL)
  • PySpark
  • Azure Fundamentals
  • Azure Data Factory
  • Databricks
  • 5 Major Projects
  • Git and GitHub

1 | Python Programming and Logic Building

I recommend the Python programming language. Python is a great language for starting your programming journey. Here is the Python roadmap for logic building.

  • Python basics, Variables, Operators, Conditional Statements
  • List and Strings
  • While Loop, Nested Loops, Loop Else
  • For Loop, Break, and Continue statements
  • Functions, Return Statement, Recursion
  • Dictionary, Tuple, Set
  • File Handling, Exception Handling
  • Object-Oriented Programming
  • Modules and Packages
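
As a small illustration of several topics from the list above (functions, recursion, dictionaries, and exception handling), here is a minimal sketch. The function names `factorial`, `word_counts`, and `safe_divide` are made up for this example and are not part of the roadmap.

```python
def factorial(n: int) -> int:
    """Recursive factorial: n! = n * (n - 1)!, with 0! = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1 if n == 0 else n * factorial(n - 1)


def word_counts(text: str) -> dict:
    """Count how often each word appears, using a dictionary."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def safe_divide(a: float, b: float):
    """Return a / b, or None when b is zero (exception handling)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


print(factorial(5))                       # 120
print(word_counts("to be or not to be"))
print(safe_divide(1, 0))                  # None
```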

In-Depth Roadmap of Python

2 | Data Structure & Algorithms

Data structures are among the most important things to learn, not only for data scientists but for everyone working in computer science. Understanding data structures gives you an internal view of how software works.

Understand these topics

  • Types of Algorithm Analysis
  • Asymptotic Notation, Big-O, Omega, Theta
  • Stacks
  • Queues
  • Linked List
  • Trees
  • Graphs
  • Sorting
  • Searching
  • Hashing
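
To make a few of these topics concrete, the sketch below shows a stack, a queue, and a binary search using only the Python standard library. The helper name `binary_search` and the sample values are assumptions chosen for illustration.

```python
from collections import deque


def binary_search(items, target):
    """Return the index of target in a sorted list, or -1. Runs in O(log n)."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


# Stack: last in, first out (a list's append/pop work at the end).
stack = []
stack.append(1)
stack.append(2)
top = stack.pop()          # 2

# Queue: first in, first out (deque pops from the left in O(1)).
queue = deque()
queue.append("a")
queue.append("b")
front = queue.popleft()    # "a"

print(top, front, binary_search([1, 3, 5, 7, 9], 7))  # 2 a 3
```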

3 | Pandas NumPy Matplotlib

Python supports n-dimensional arrays through NumPy. For two-dimensional (tabular) data, Pandas is the go-to library for analysis. You can use other tools, but drag-and-drop tools have limitations. Pandas can be customized as needed, because you write code tailored to the real-life problem.

NumPy

  • Vectors, Matrix
  • Operations on Matrix
  • Mean, Variance, and Standard Deviation
  • Reshaping Arrays
  • Transpose and Determinant of Matrix
  • Diagonal Operations, Trace
  • Add, Subtract, Multiply, Dot, and Cross Product.
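
The operations listed above map onto NumPy calls roughly as follows. This is a minimal sketch on made-up 2x2 matrices and 3-element vectors, assuming NumPy is installed.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])

print(a + b)                 # element-wise addition
print(a @ b)                 # matrix product
print(a.T)                   # transpose
print(np.linalg.det(a))      # determinant, approximately -2.0
print(np.trace(a))           # sum of the diagonal: 5.0
print(a.reshape(4))          # flatten the 2x2 matrix into 4 elements
print(a.mean(), a.var(), a.std())

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
print(np.dot(u, v))          # 0.0, the vectors are orthogonal
print(np.cross(u, v))        # [0. 0. 1.]
```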

Pandas

  • Series and DataFrames
  • Slicing, Rows, and Columns
  • Operations on DataFrame
  • Different ways to create DataFrame
  • Read, Write Operations with CSV files
  • Handling Missing values, replace values, and Regular Expression
  • GroupBy and Concatenation
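
The sketch below walks through several of the Pandas topics above on a tiny made-up dataset: reading CSV data, filling a missing value, slicing, GroupBy, and concatenation. The column names and values are hypothetical, and `io.StringIO` stands in for a CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; one sales value is missing on purpose.
csv_text = """city,product,sales
Delhi,A,100
Delhi,B,
Mumbai,A,250
Mumbai,B,150
"""

df = pd.read_csv(io.StringIO(csv_text))

# Handle the missing value by filling it with 0.
df["sales"] = df["sales"].fillna(0)

# Slice rows and columns: sales above 100, two columns only.
big = df.loc[df["sales"] > 100, ["city", "sales"]]

# GroupBy: total sales per city.
totals = df.groupby("city")["sales"].sum()

# Concatenation: append one more row held in a second DataFrame.
extra = pd.DataFrame({"city": ["Pune"], "product": ["A"], "sales": [75.0]})
combined = pd.concat([df, extra], ignore_index=True)

print(totals)
print(combined)
```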

Matplotlib

  • Graph Basics
  • Format Strings in Plots
  • Label Parameters, Legend
  • Bar Chart, Pie Chart, Histogram, Scatter Plot

4 | Statistics

Descriptive Statistics

  • Measure of Frequency and Central Tendency
  • Measure of Dispersion
  • Probability Distribution
  • Gaussian Normal Distribution
  • Skewness and Kurtosis
  • Regression Analysis
  • Continuous and Discrete Functions
  • Goodness of Fit
  • Normality Test
  • ANOVA
  • Homoscedasticity
  • Linear and Non-Linear Relationship with Regression
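
A few of the descriptive measures listed above can be computed with Python's standard `statistics` module. The sample values below are made up; the sketch uses the population versions of variance and standard deviation.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample values

mean = statistics.mean(data)            # central tendency
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # population standard deviation

print(mean, median, mode, variance, std_dev)
```

For a sample rather than a full population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.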

Inferential Statistics

  • Hypothesis Testing
  • Type I and Type II errors
  • z-Test
  • t-Test and its types
  • One way ANOVA
  • Two way ANOVA
  • Chi-Square Test
  • Implementation of continuous and categorical data
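
As one worked example of the tests above, the following sketch computes the statistic of a one-sample t-test by hand: t = (sample mean - mu0) / (s / sqrt(n)). The function name `one_sample_t` and the measurements are assumptions for illustration. Turning t into a p-value needs the t-distribution, which a library such as SciPy provides (for example `scipy.stats.ttest_1samp`).

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for H0: the population mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample standard deviation (n - 1)
    return (mean - mu0) / (s / math.sqrt(n))


sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2]   # made-up measurements
t = one_sample_t(sample, mu0=5.0)
print(round(t, 3), "with", len(sample) - 1, "degrees of freedom")
```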

5 | Machine Learning

The best way to master machine learning algorithms is to work with the Scikit-Learn framework. Scikit-Learn contains predefined algorithms, and you can work with them just by creating an object of the corresponding class. These are the algorithms you must know, covering both Supervised and Unsupervised Machine Learning:

  • Linear Regression
  • Logistic Regression
  • Decision Tree
  • Gradient Descent
  • Random Forest
  • Ridge and Lasso Regression
  • Naive Bayes
  • Support Vector Machine
  • KMeans Clustering
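
To see how two items from this list fit together, here is a minimal sketch of Linear Regression fitted with Gradient Descent in plain Python. It is written from scratch rather than with Scikit-Learn so the update rule is visible; the function name `fit_line`, the learning rate, and the data are assumptions for illustration.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

With Scikit-Learn, the same model would typically be fitted by creating a `LinearRegression` object and calling its `fit` method.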

Other Concepts and Topics for ML

  • Measuring Accuracy
  • Bias-Variance Trade-off
  • Applying Regularization
  • Elastic Net Regression
  • Predictive Analytics
  • Exploratory Data Analysis

6 | MLOps

You can master any one of the cloud service providers: AWS, GCP, or Azure. Once you understand one of them, switching to another is easy.

We will focus on AWS (Amazon Web Services) first.

  • Deploy ML models using Flask
  • Amazon Lex - Natural Language Understanding
  • Amazon Polly - Text to Speech
  • Amazon Transcribe - Speech to Text
  • Amazon Textract - Extract Text
  • Amazon Rekognition - Image Applications
  • Amazon SageMaker - Building and deploying models
  • Working with Deep Learning on AWS

7 | Natural Language Processing

If you are interested in working with text, you should try some of the work an NLP Engineer does and understand how language models work.

  • Sentiment analysis
  • POS Tagging, Parsing
  • Text preprocessing
  • Stemming and Lemmatization
  • Sentiment classification using Naive Bayes
  • TF-IDF, N-gram
  • Machine Translation, BLEU Score
  • Text Generation, Summarization, ROUGE Score
  • Language Modeling, Perplexity
  • Building a text classifier
  • Identifying the gender
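
TF-IDF from the list above can be sketched in a few lines of plain Python. This version uses the basic definitions (term frequency times the log of inverse document frequency) on a toy corpus; the function name `tf_idf` is an assumption, and libraries such as Scikit-Learn apply a smoothed, normalized variant by default, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = count of term in document / number of tokens in document
    idf = log(number of documents / number of documents containing term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the cat sat", "the dog sat", "the cat ran"]   # toy corpus
scores = tf_idf(docs)
print(scores[1])
```

A word that appears in every document ("the") gets weight 0, while a word unique to one document ("dog") gets the highest weight there.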

8 | Computer Vision

To work on image and video analytics, we need to master computer vision, and to work on computer vision we first have to understand images.

  • PyTorch Tensors
  • Understanding pretrained models like AlexNet and ResNet, and the ImageNet dataset they are trained on.
  • Neural Networks
  • Building a perceptron
  • Building a single layer neural network
  • Building a deep neural network
  • Recurrent neural network for sequential data analysis
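
The "Building a perceptron" item can be illustrated without any framework. The sketch below is plain Python rather than PyTorch, and the function names, learning rate, and epoch count are assumptions; it trains a single perceptron on the logical AND function using the classic perceptron update rule.

```python
def train_perceptron(samples, labels, lr=1.0, epochs=20):
    """Learn weights and a bias with the perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted          # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Logical AND is linearly separable, so the perceptron can learn it.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
weights, bias = train_perceptron(inputs, targets)
print([predict(weights, bias, x) for x in inputs])   # [0, 0, 0, 1]
```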

Convolutional Neural Networks

  • Understanding the ConvNet topology
  • Convolution layers
  • Pooling layers
  • Image Content Analysis
  • Operating on images using OpenCV-Python
  • Detecting edges
  • Histogram equalization
  • Detecting corners
  • Detecting SIFT feature points

9 | Data Visualization with Tableau

Visual Perception and How to Use It

  • What is it, How it works, Why Tableau
  • Connecting to Data
  • Building charts
  • Calculations
  • Dashboards
  • Sharing our work
  • Advanced Charts, Calculated Fields, Calculated Aggregations
  • Conditional Calculation, Parameterized Calculation

10 | Structured Query Language (SQL)

  • Fundamentals of SQL syntax and installation
  • Creating Tables, Modifiers
  • Inserting and Retrieving Data: SELECT, INSERT, UPDATE, DELETE
  • Aggregating Data using Functions, Filtering and RegEx
  • Subqueries, retrieving data based on conditions, grouping data
  • Practice Questions
  • JOINs
  • Advanced SQL concepts such as transactions, views, stored procedures, and functions.
  • Database Design principles, normalization, and ER diagrams.
  • Practice, Practice, Practice: Practice writing SQL queries on real-world datasets, and work on projects to apply your knowledge.
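
One low-friction way to practice these topics is SQLite, which ships with Python as the `sqlite3` module. The sketch below creates two tables, runs INSERT, UPDATE, and DELETE, and finishes with a JOIN plus aggregation. The schema, names, and amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
cur = conn.cursor()

# Creating tables (hypothetical schema for illustration).
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "amount REAL)"
)

# INSERT with placeholders instead of string formatting.
cur.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi"), (3, "Meera")])
cur.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                [(1, 120.0), (1, 80.0), (2, 50.0)])

# UPDATE and DELETE.
cur.execute("UPDATE orders SET amount = 60.0 WHERE customer_id = 2")
cur.execute("DELETE FROM customers WHERE name = 'Meera'")
conn.commit()

# JOIN + aggregation + grouping + filtering on the aggregate.
rows = cur.execute(
    "SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.amount) AS total "
    "FROM customers AS c "
    "JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.name "
    "HAVING SUM(o.amount) > 50 "
    "ORDER BY total DESC"
).fetchall()
print(rows)   # [('Asha', 2, 200.0), ('Ravi', 1, 60.0)]
conn.close()
```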

11 | Data Engineering

Big Data

  • What is Big Data?
  • How is Big Data applied within business?

PySpark

  • Resilient Distributed Datasets
  • Schema
  • Lambda Expressions
  • Transformations
  • Actions

Data Modeling

  • Duplicate Data
  • Descriptive Analysis on Data
  • Visualizations
  • MLlib
  • ML Packages
  • Pipelines

Streaming

  • Packaging Spark Applications

12 | Data System Design

What is system design?

  • IP and OSI Model
  • Domain Name System (DNS)
  • Load Balancing
  • Clustering
  • Caching
  • Availability, Scalability, Storage
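
Two of the building blocks above, caching and load balancing, can be sketched in a few lines. The example shows a least-recently-used (LRU) cache, which is one common eviction policy among several, and a round-robin assignment over hypothetical server names. The class name `LRUCache` is an assumption for illustration.

```python
from collections import OrderedDict
from itertools import cycle


class LRUCache:
    """Fixed-size cache that evicts the least recently used key."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None                       # cache miss
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the oldest entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" becomes the most recently used key
cache.put("c", 3)       # evicts "b"
print(cache.get("b"), cache.get("a"), cache.get("c"))   # None 1 3

# Round-robin load balancing over hypothetical server names.
servers = cycle(["server-1", "server-2", "server-3"])
assigned = [next(servers) for _ in range(4)]
print(assigned)         # ['server-1', 'server-2', 'server-3', 'server-1']
```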

Databases and DBMS

  • SQL databases
  • NoSQL databases
  • SQL vs NoSQL databases
  • Database Replication
  • Indexes
  • Normalization and Denormalization
  • CAP theorem

System Design Interview

  • URL Shortener
  • WhatsApp, Twitter, Netflix, Uber
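
For the URL Shortener question, a common starting point is to give each URL an integer ID and encode that ID in base 62. The sketch below is an in-memory toy under that assumption: the class `UrlShortener` and its methods are made up for illustration, and a list stands in for the database a real design would need.

```python
import string

# 62 symbols: digits, lowercase letters, uppercase letters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(number: int) -> str:
    """Turn a non-negative integer ID into a short base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    number = 0
    for char in code:
        number = number * 62 + ALPHABET.index(char)
    return number


class UrlShortener:
    """In-memory sketch: the list index stands in for a database ID."""

    def __init__(self):
        self._urls = []

    def shorten(self, url: str) -> str:
        self._urls.append(url)
        return encode_base62(len(self._urls) - 1)

    def resolve(self, code: str) -> str:
        return self._urls[decode_base62(code)]


shortener = UrlShortener()
code = shortener.shorten("https://github.com/hemansnation")
print(code, shortener.resolve(code))
print(encode_base62(125))   # "21" because 125 = 2 * 62 + 1
```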

13 | Five Major Capstone Projects

We follow project-based learning and we will work on all the projects in parallel.

14 | Interview Preparation

15 | Git & GitHub

Git & GitHub Course

  • Understanding Git
  • Commands and How to commit your first code?
  • How to use GitHub?
  • How to make your first open-source contribution?
  • How to work with a team? - Part 1
  • How to create your stunning GitHub profile?
  • How to build your own viral repository?
  • Building a personal landing page for your Portfolio for FREE
  • How to grow followers on GitHub?
  • How to work with a team? Part 2 - issues, milestone and projects

16 | Personal Profile & Portfolio

Resources

Datasets

1️⃣ Awesome Public Datasets: A list of high-quality, topic-centric public data sources.

2️⃣ NLP Datasets: Alphabetical list of free/public domain datasets with text data for use in NLP.

3️⃣ Awesome Dataset Tools: A curated list of awesome dataset tools.

4️⃣ Awesome time series database: A curated list of time series databases.

5️⃣ Awesome-Cybersecurity-Datasets: A curated list of amazingly awesome Cybersecurity datasets.

6️⃣ Awesome Robotics Datasets: Robotics Dataset Collections.

Research Starting Point

Machine Learning

  1. Introduction to Statistical Learning

Deep Learning

Reinforcement Learning

Projects

Here is the list of project ideas

Data Science ML Full Stack -> Notion Template

Join the WhatsApp Community Group

https://chat.whatsapp.com/BSUPbYhzzM1BcJplcTTIxb

Socials

Join Telegram for Data Science ML AI Resources:

https://t.me/+sREuRiFssMo4YWJl

Connect with me on these platforms:

LinkedIn: https://www.linkedin.com/in/hemansnation/

YouTube: https://www.youtube.com/@Himanshu-Ramchandani

Twitter: https://twitter.com/hemansnation

GitHub: https://github.com/hemansnation

Instagram: https://www.instagram.com/masterdexter.ai/

AI Jobs LinkedIn Group:

https://www.linkedin.com/groups/12540639/

Medium Blog:

https://medium.com/@hemansnation

Notes on Data, Product, and AI - Newsletter:

https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7014799989251956736

Any Query?

Email Me Here: [email protected]
