awesome-pandas
A collection of resources for pandas (Python) and related subjects. Pull requests are very welcome!
Contents: This is an unofficial collection of resources for learning pandas, an open source Python library for data analysis. Here you will find videos, cheat-sheets, tutorials and books / papers. The curated list is divided into four parts:
- pandas resources - A collection of videos, cheat-sheets, tutorials and books directly related to pandas.
- Data analysis with Python resources - Material related to adjacent Python libraries and software such as NumPy, scipy, matplotlib, seaborn, statsmodels and Jupyter.
- Miscellaneous related resources - Resources related to general data analysis, Python programming, algorithms, computer science, machine learning, statistics, etc.
- Packages - Python packages that help when working with pandas.
(1) 🐼 pandas resources
(1.1) 📺 Videos
The videos below were collected in July of 2018. They are all directly related to pandas, and the Level of a video is quantified roughly as follows:
- π : Beginner - requires little knowledge to jump into, elementary topics.
- π : Intermediate - some prior knowledge needed, more technical.
- π± : Advanced - very technical, or discusses advanced topics.
- β : Recommended video - high quality video and audio, great presentation.
Title | Speaker | Uploader | Time | Views | Year | Level |
---|---|---|---|---|---|---|
Pandas tutorial for Data Science | Bikram Kundu | - | > 01:20 | 2K+ | 2022 | π |
Python for Data Analysis using Pandas part 1 & part 2 [repo] | tommyod | na | 2:19 | 100 | 2019 | π |
Data Science Best Practices with pandas [repo] | Kevin Markham | PyCon | 3:23 | 1000 | 2019 | π |
Thinking like a Panda | Hannah Stepanek | PyCon | 0:36 | 700 | 2019 | π |
Analyzing Census Data with Pandas [repo] | Sergio Sánchez | PyCon | 3:15 | 600 | 2019 | π |
Pandas is for Everyone [repo] | Daniel Chen | PyCon | 3:18 | 600 | 2019 | π |
β Pandas From The Ground Up [repo] | Brandon Rhodes | PyCon 2015 | 2:24 | 91000 | 2015 | π |
Introduction Into Pandas [repo] | Daniel Chen | Python Tutorial | 1:28 | 46000 | 2017 | π |
Introduction To Data Analytics With Pandas [repo] | Quentin Caudron | Python Tutorial | 1:51 | 25000 | 2017 | π |
Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | 2017 | π |
Optimizing Pandas Code [repo] | Sofia Heisler | PyCon 2017 | 0:29 | 12000 | 2017 | π |
A Visual Guide To Pandas | Jason Wirth | Next Day Video | 0:26 | 49000 | 2015 | π |
Analyzing and Manipulating Data with Pandas [repo] | Jonathan Rocher | Enthought | 3:33 | 22000 | 2016 | π |
Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | 2017 | π |
Predicting sports winners with pandas | Robert Layton | PyCon Australia | 0:38 | 13000 | 2015 | π |
Pandas from the Inside [repo] [2016 talk] | Stephen Simmons | PyData | 1:17 | 3000 | 2017 | π± |
Pandas part 1 & part 2 [repo] | Joris Van den Bossche | EuroSciPy | 3:03 | 1000 | 2017 | π |
Pandas: .head() to .tail() [repo] | Tom Augspurger | PyData | 1:26 | 3000 | 2016 | π |
Performance Pandas (london) [repo] | Jeff Reback | PyData | 0:43 | 2000 | 2015 | π |
Performance Pandas (NYC) [repo] | Jeff Reback | PyData | 1:26 | 3000 | 2015 | π |
Python Data Science with pandas [repo] | Matt Harrison | JetBrainsTV | 1:09 | 2000 | 2018 | π |
What is the Future of Pandas [slides] | Jeff Reback | PyData | 0:31 | 4000 | 2017 | π |
Introduction to Python for Data Science [repo] | Skipper Seabold | PyData | 3:18 | 300 | 2018 | π |
Pandas for Better (and Worse) Data Science [repo] | Kevin Markham | PyCon 2018 | 3:21 | 3000 | 2018 | π |
Know of a recent, good video? Send a pull request!
(1.2) β Cheat-sheets
- Data Wrangling with pandas
- The pandas DataFrame Object
- Python For Data Science - pandas Basics
- Python For Data Science - pandas
(1.3) π Tutorials
- β 10 Minutes to pandas
- β pandas_exercises
- β pycon-pandas-tutorial [Video: Pandas From The Ground Up]
- β Learn-Pandas
- β pandas-cookbook
- β Modern pandas. Parts: 1, 2, 3, 4, 5, 6 and 7.
- pandas-tutorial [Video: Pandas & Advanced Pandas]
- pandas_tutorial [Video: Analyzing and Manipulating Data with Pandas]
- scipy-2017-tutorial-pandas [Video: Pandas for Data Analysis]
- Pandas-Tutorial
- sklearn_pandas_tutorial
- pandas_basics
- first-python-notebook
- Learn Pandas
- Pandas practice website
(1.4) π Books / papers
- [amazon] McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. 2nd edition. O'Reilly Media, 2017.
- [amazon] VanderPlas, Jake. Python Data Science Handbook: Essential Tools for Working with Data. 1st edition. O'Reilly Media, 2016.
- [manning] Lerner, Reuven. 50 exercises that will strengthen your pandas skills to a level of automatic fluency. 1st edition. Manning Publications, 2021.
- [manning] Paskhaver, Boris. This friendly and hands-on guide shows you how to start mastering Pandas with skills you already know from spreadsheet software. 1st edition. Manning Publications, 2021.
(2) Data analysis with Python resources
(2.1) 📺 Videos
Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
---|---|---|---|---|---|---|---|
NumPy Beginner [repo] | Alexandre Chabot LeClerc | Enthought | 2:47 | 56000 | NumPy | 2016 | π |
Machine Learning | Andreas Mueller & Sebastian Raschka | Enthought | 3:03 | 47000 | sklearn | 2016 | π |
The Python Visualization Landscape | Jake VanderPlas | PyCon 2017 | 0:33 | 21000 | python | 2017 | π |
JupyterLab: Building Blocks for Interactive Computing | Brian Granger | Enthought | 0:29 | 28000 | jupyter | 2016 | π |
Machine Learning with Scikit Learn [repo] | Andreas Mueller & Kyle Kastner | Enthought | 3:22 | 48000 | sklearn | 2015 | π |
Machine Learning for Time Series Data in Python | Brett Naul | Enthought | 0:24 | 24000 | cesium | 2016 | π |
Computational Statistics [repo] | Allen Downey | Enthought | 2:05 | 10000 | scipy | 2017 | π |
Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | pandas | 2017 | π |
Learning TensorFlow | Robert Layton | PyCon Australia | 0:40 | 18000 | tensorflow | 2016 | π |
JupyterHub: Deploying Jupyter Notebooks | Min Ragan Kelley & Thomas Kluyver | PyData | 1:36 | 17000 | jupyter | 2016 | π |
Applied Time Series Econometrics | Jeffrey Yau | PyData | 1:39 | 17000 | statsmodels | 2016 | π |
Machine Learning with scikit learn [repo] | Andreas Mueller & Alexandre Gram | Enthought | 3:10 | 8000 | sklearn | 2017 | π |
Introduction to Numerical Computing with NumPy | Dillon Niederhut | Enthought | 2:27 | 8000 | NumPy | 2017 | π |
Dask - A Pythonic Distributed Data Science Framework | Matthew Rocklin | PyCon 2017 | 0:46 | 7000 | dask | 2017 | π |
Introduction to Statistical Modeling with Python [repo] | Christopher Fonnesbeck | PyCon 2017 | 3:19 | 7000 | scipy | 2017 | π |
Fully Convolutional Networks for Image Segmentation | Daniil Pakhomov | Enthought | 0:20 | 7000 | scipy | 2017 | π |
Exploratory data analysis in python [repo] | Chloe Mawer & Jonathan Whitmore | PyCon 2017 | 2:54 | 7000 | scipy | 2017 | π |
Libraries for Deep Learning with Sequences | Alex Rubinsteyn | PyData | 0:44 | 23000 | scipy | 2015 | π |
Numba - Tell Those C++ Bullies to Get Lost [repo] | Gil Forsyth & Lorena Barba | Enthought | 2:25 | 5000 | numba | 2017 | π |
Deploying Interactive Jupyter Dashboards | Philipp Rudiger | Enthought | 0:18 | 5000 | jupyter | 2017 | π |
Data Science Using Functional Python | Joel Grus | PyData | 0:44 | 18000 | python | 2015 | π |
Anatomy of matplotlib [repo] | Benjamin Root & Joe Kington | Enthought | 3:18 | 18000 | matplotlib | 2015 | π |
Anatomy of matplotlib [repo] | Benjamin Root | Enthought | 3:02 | 4000 | matplotlib | 2017 | π |
Data Science is Software [repo] | Peter Bull & Isaac Slavitt | Enthought | 2:12 | 9000 | jupyter | 2016 | π |
Machine Learning with Scikit Learn [repo] | Jake VanderPlas | PyData | 1:34 | 16000 | sklearn | 2015 | π |
Using Jupyter notebooks [repo] | Ioanna Ioannou | PyCon Australia | 0:28 | 8000 | jupyter | 2016 | π |
Parallel Python: Analyzing Large Datasets [repo] | Matthew Rocklin | Enthought | 3:05 | 7000 | scipy | 2016 | π± |
Keynote: Project Jupyter | Brian Granger | Enthought | 0:48 | 7000 | jupyter | 2016 | π |
matplotlib beginner tutorial [repo] | Nicolas Rougier | Enthought | 2:59 | 6000 | matplotlib | 2016 | π |
Awesome Big Data Algorithms | Titus Brown | Next Day Video | 0:39 | 41000 | python | 2013 | π± |
All About Jupyter | Brian Granger | PyData | 0:39 | 11000 | jupyter | 2015 | π |
PyMC: Markov Chain Monte Carlo | Chris Fonnesbeck | Enthought | 0:20 | 9000 | pyMC | 2014 | π |
Jupyter Advanced Topics Tutorial [repo] | Jonathan Frederic & Matthias Bussonier | Enthought | 2:48 | 4000 | jupyter | 2015 | π± |
Using randomness to make code much faster | Rachel Thomas | SF Python | 0:54 | 1000 | scipy | 2017 | π |
Python Profiling & Performance | Mahmoud Hashemi | SF Python | 0:28 | 1000 | python | 2016 | π |
Using List Comprehensions and Generator Expressions | Trey Hunner | PyCon 2018 | 3:21 | 3000 | python | 2018 | π |
Foundations of Numerical Computing | Scott Sanderson | PyCon 2018 | 3:22 | 1000 | python | 2018 | π |
(2.2) β Cheat-sheets
(2.3) π Tutorials
(2.4) π Books / papers
- Varoquaux, Gael, Valentin Haenel, Emmanuelle Gouillart, Zbigniew Jędrzejewski-Szmek, Ralf Gommers, Fabian Pedregosa, Olav Vahtras, et al. Scipy Lecture Notes. Zenodo, September 28, 2015. https://doi.org/10.5281/zenodo.31521.
- [amazon] Nunez-Iglesias, Juan, Stéfan van der Walt, and Harriet Dashnow. Elegant SciPy: The Art of Scientific Python. 1st edition. O'Reilly Media, 2017.
- Rougier, Nicolas P. From Python to Numpy. Zenodo, December 31, 2016. https://doi.org/10.5281/zenodo.225783.
- [amazon] Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. 1st edition. O'Reilly Media, 2017.
(3) Miscellaneous related resources
(3.1) 📺 Videos
Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
---|---|---|---|---|---|---|---|
β So you want to be a Python expert? | James Powell | PyData | 1:54 | 28000 | python | 2017 | π± |
β Transforming Code into Beautiful, Idiomatic Python | Raymond Hettinger | Next Day Video | 0:48 | 340000 | python | 2013 | π |
β Builtin Superheroes | David Beazley | David Beazley | 0:44 | 12000 | python | 2016 | π |
How to become a Data Scientist in 6 months | Tetiana Ivanova | PyData | 0:56 | 148000 | misc | 2016 | π |
Modern Dictionaries | Raymond Hettinger | SF Python | 1:07 | 44000 | python | 2016 | π |
Keynote on Concurrency | Raymond Hettinger | SF Python | 1:13 | 15000 | python | 2017 | π |
The Fun of Reinvention | David Beazley | David Beazley | 0:52 | 11000 | python | 2017 | π± |
Being a Core Developer in Python | Raymond Hettinger | SF Python | 1:02 | 19000 | python | 2016 | π |
Visualizing Geographic Data | Christopher Roach | PyData | 0:31 | 14000 | python | 2016 | π |
Python's Class Development Toolkit | Raymond Hettinger | Next Day Video | 0:45 | 80000 | python | 2013 | π |
The Other Async (Threads + Async = ❤️) | David Beazley | David Beazley | 0:47 | 5000 | python | 2017 | π± |
Functional Programming with Python | Mike Müller | Next Day Video | 0:27 | 44000 | python | 2013 | Novice |
Building a Recommendation Engine using Python | Anusua Trivedi | PyData | 0:37 | 11000 | python | 2015 | Novice |
Iterations of Evolution | David Beazley | David Beazley | 0:34 | 2000 | python | 2017 | Novice |
"Good Enough" IS Good Enough! | Alex Martelli | SF Python | 0:53 | 4000 | python | 2016 | Novice |
Automating Code Quality | Kyle Knapp | PyCon 2018 | 0:30 | 3000 | python | 2018 | π |
(3.2) β Cheat-sheets
(3.3) π Tutorials
(3.4) π Books / papers
- [amazon] Slatkin, Brett. Effective Python: 59 Specific Ways to Write Better Python. 1 edition. Addison-Wesley Professional, 2015.
- [amazon] Ramalho, Luciano. Fluent Python. 1st edition. O'Reilly, 2015.
- [pdf] Rougier, Nicolas P., Michael Droettboom, and Philip Bourne. "Ten Simple Rules for Better Figures." PLoS Computational Biology 10 (September 1, 2014): e1003833. https://doi.org/10.1371/journal.pcbi.1003833.
- [pdf] Wickham, Hadley. "Tidy Data." Journal of Statistical Software 59, no. 10 (2014). https://doi.org/10.18637/jss.v059.i10.
- [amazon] [online] Chacon, Scott, and Ben Straub. Pro Git. 2nd edition. New York, NY: Apress, 2014.
The books below are perhaps of an even more general nature.
- [amazon] Dasgupta, Sanjoy, Christos H. Papadimitriou, and Umesh Virkumar Vazirani. Algorithms. Boston, Mass: McGraw Hill, 2008.
- [amazon] Lloyd N. Trefethen. Numerical Linear Algebra. Society for Industrial and Applied Mathematics, 1997.
- [amazon] Gene H. Golub. Matrix Computations. 4th ed. Johns Hopkins Studies in the Mathematical Sciences. Baltimore: Johns Hopkins University Press, 2013.
Every video is below.
Title | Speaker | Uploader | Time | Views | Keyword | Year | Level |
---|---|---|---|---|---|---|---|
How to become a Data Scientist in 6 months | Tetiana Ivanova | PyData | 0:56 | 148000 | misc | 2016 | π |
Introduction Into Pandas | Daniel Chen | Python Tutorial | 1:28 | 46000 | pandas | 2017 | π |
So you want to be a Python expert? | James Powell | PyData | 1:54 | 28000 | python | 2017 | πππ |
NumPy Beginner [repo] | Alexandre Chabot LeClerc | Enthought | 2:47 | 56000 | NumPy | 2016 | π π |
Introduction To Data Analytics With Pandas | Quentin Caudron | Python Tutorial | 1:51 | 25000 | pandas | 2017 | π |
Transforming Code into Beautiful, Idiomatic Python | Raymond Hettinger | Next Day Video | 0:48 | 340000 | python | 2013 | π |
Machine Learning | Andreas Mueller & Sebastian Raschka | Enthought | 3:03 | 47000 | sklearn | 2016 | π π |
Pandas From The Ground Up [repo] | Brandon Rhodes | PyCon 2015 | 2:24 | 91000 | pandas | 2015 | π π |
Modern Dictionaries | Raymond Hettinger | SF Python | 1:07 | 44000 | python | 2016 | π π |
The Python Visualization Landscape | Jake VanderPlas | PyCon 2017 | 0:33 | 21000 | python | 2017 | π |
Keynote on Concurrency | Raymond Hettinger | SF Python | 1:13 | 15000 | python | 2017 | ππ |
Pandas for Data Analysis [repo] | Daniel Chen | Enthought | 3:45 | 13000 | pandas | 2017 | ππ |
JupyterLab: Building Blocks for Interactive Computing | Brian Granger | Enthought | 0:29 | 28000 | jupyter | 2016 | π |
Optimizing Pandas Code for Speed and Efficiency | Sofia Heisler | PyCon 2017 | 0:29 | 12000 | pandas | 2017 | π π |
A Visual Guide To Pandas | Jason Wirth | Next Day Video | 0:26 | 49000 | pandas | 2015 | π |
Machine Learning with Scikit Learn [repo] | Andreas Mueller & Kyle Kastner | Enthought | 3:22 | 48000 | sklearn | 2015 | π π |
Machine Learning for Time Series Data in Python | Brett Naul | Enthought | 0:24 | 24000 | cesium | 2016 | π |
The Fun of Reinvention | David Beazley | David Beazley | 0:52 | 11000 | python | 2017 | πππ |
Analyzing and Manipulating Data with Pandas [repo] | Jonathan Rocher | Enthought | 3:33 | 22000 | pandas | 2016 | π |
Computational Statistics [repo] | Allen Downey | Enthought | 2:05 | 10000 | scipy | 2017 | π π |
Being a Core Developer in Python | Raymond Hettinger | SF Python | 1:02 | 19000 | python | 2016 | π |
Time Series Analysis [repo] | Aileen Nielsen | PyCon 2017 | 3:11 | 9000 | pandas | 2017 | π π |
Learning TensorFlow | Robert Layton | PyCon Australia | 0:40 | 18000 | tensorflow | 2016 | π π |
JupyterHub: Deploying Jupyter Notebooks | Min Ragan Kelley & Thomas Kluyver | PyData | 1:36 | 17000 | jupyter | 2016 | π |
Applied Time Series Econometrics | Jeffrey Yau | PyData | 1:39 | 17000 | statsmodels | 2016 | π π |
Machine Learning with scikit learn [repo] | Andreas Mueller & Alexandre Gram | Enthought | 3:10 | 8000 | sklearn | 2017 | π π |
Introduction to Numerical Computing with NumPy | Dillon Niederhut | Enthought | 2:27 | 8000 | NumPy | 2017 | π |
Dask - A Pythonic Distributed Data Science Framework | Matthew Rocklin | PyCon 2017 | 0:46 | 7000 | dask | 2017 | π π |
Introduction to Statistical Modeling with Python [repo] | Christopher Fonnesbeck | PyCon 2017 | 3:19 | 7000 | scipy | 2017 | π π |
Fully Convolutional Networks for Image Segmentation | Daniil Pakhomov | Enthought | 0:20 | 7000 | scipy | 2017 | π |
Exploratory data analysis in python [repo] | Chloe Mawer & Jonathan Whitmore | PyCon 2017 | 2:54 | 7000 | scipy | 2017 | π |
Visualizing Geographic Data | Christopher Roach | PyData | 0:31 | 14000 | python | 2016 | π |
Builtin Superheroes | David Beazley | David Beazley | 0:44 | 12000 | python | 2016 | π π |
Python's Class Development Toolkit | Raymond Hettinger | Next Day Video | 0:45 | 80000 | python | 2013 | π π |
Libraries for Deep Learning with Sequences | Alex Rubinsteyn | PyData | 0:44 | 23000 | scipy | 2015 | π π |
The Other Async (Threads + Async = ❤️) | David Beazley | David Beazley | 0:47 | 5000 | python | 2017 | π π π |
Numba - Tell Those C++ Bullies to Get Lost [repo] | Gil Forsyth & Lorena Barba | Enthought | 2:25 | 5000 | numba | 2017 | π π |
Deploying Interactive Jupyter Dashboards | Philipp Rudiger | Enthought | 0:18 | 5000 | jupyter | 2017 | π π |
Practical Optimisations for Pandas | Eyal Trabelsi | EuroPython | 0:45 | 5000 | jupyter | 2020 | π π |
Data Science Using Functional Python | Joel Grus | PyData | 0:44 | 18000 | python | 2015 | π π |
Pandas from the Inside | Stephen Simmons | PyData | 1:20 | 9000 | pandas | 2016 | π π π |
Anatomy of matplotlib [repo] | Benjamin Root & Joe Kington | Enthought | 3:18 | 18000 | matplotlib | 2015 | π π |
Anatomy of matplotlib [repo] | Benjamin Root | Enthought | 3:02 | 4000 | matplotlib | 2017 | π π |
Data Science is Software [repo] | Peter Bull & Isaac Slavitt | Enthought | 2:12 | 9000 | jupyter | 2016 | π |
Machine Learning with Scikit Learn [repo] | Jake VanderPlas | PyData | 1:34 | 16000 | sklearn | 2015 | Novice |
Using Jupyter notebooks | Ioanna Ioannou | PyCon Australia | 0:28 | 8000 | jupyter | 2016 | Novice |
Parallel Python: Analyzing Large Datasets [repo] | Matthew Rocklin | Enthought | 3:05 | 7000 | scipy | 2016 | Novice |
Functional Programming with Python | Mike Müller | Next Day Video | 0:27 | 44000 | python | 2013 | Novice |
Predicting sports winners with pandas and scikit-learn | Robert Layton | PyCon Australia | 0:38 | 13000 | pandas | 2015 | Novice |
Keynote: Project Jupyter | Brian Granger | Enthought | 0:48 | 7000 | jupyter | 2016 | Novice |
matplotlib beginner tutorial [repo] | Nicolas Rougier | Enthought | 2:59 | 6000 | matplotlib | 2016 | Novice |
Awesome Big Data Algorithms | Titus Brown | Next Day Video | 0:39 | 41000 | python | 2013 | Novice |
Pandas from the Inside | Stephen Simmons | PyData | 1:17 | 3000 | pandas | 2017 | Novice |
All About Jupyter | Brian Granger | PyData | 0:39 | 11000 | jupyter | 2015 | Novice |
Building a Recommendation Engine using Python | Anusua Trivedi | PyData | 0:37 | 11000 | python | 2015 | Novice |
Iterations of Evolution | David Beazley | David Beazley | 0:34 | 2000 | python | 2017 | Novice |
"Good Enough" IS Good Enough! | Alex Martelli | SF Python | 0:53 | 4000 | python | 2016 | Novice |
PyMC: Markov Chain Monte Carlo | Chris Fonnesbeck | Enthought | 0:20 | 9000 | pyMC | 2014 | Novice |
Jupyter Advanced Topics Tutorial [repo] | Jonathan Frederic & Matthias Bussonier | Enthought | 2:48 | 4000 | jupyter | 2015 | Novice |
Using randomness to make code much faster | Rachel Thomas | SF Python | 0:54 | 1000 | scipy | 2017 | Novice |
Python Profiling & Performance | Mahmoud Hashemi | SF Python | 0:28 | 1000 | python | 2016 | Novice |
(4) Packages
- datatest - Tools for test driven data-wrangling and data validation (DataFrame, Series, Index, MultiIndex).
- pandera - A light-weight, flexible, and expressive data validation library for dataframes.
- pandas-vet - A plugin for Flake8 that checks pandas code.