IPython Cookbook, Second Edition (2018)
IPython Interactive Computing and Visualization Cookbook, Second Edition (2018), by Cyrille Rossant, contains over 100 hands-on recipes on high-performance numerical computing and data science in the Jupyter Notebook.
This repository contains the sources of the book, written in Markdown and released under the CC-BY-NC-ND license.
Contents
Chapter 1 : A Tour of Interactive Computing with Jupyter and IPython
- 1.1. Introducing IPython and the Jupyter Notebook
- 1.2. Getting started with exploratory data analysis in the Jupyter Notebook
- 1.3. Introducing the multidimensional array in NumPy for fast array computations
- 1.4. Creating an IPython extension with custom magic commands
- 1.5. Mastering IPython's configuration system
- 1.6. Creating a simple kernel for Jupyter
Chapter 2 : Best Practices in Interactive Computing
- 2.1. Learning the basics of the Unix shell
- 2.2. Using the latest features of Python 3
- 2.3. Learning the basics of the distributed version control system Git
- 2.4. A typical workflow with Git branching
- 2.5. Efficient interactive computing workflows with IPython
- 2.6. Ten tips for conducting reproducible interactive computing experiments
- 2.7. Writing high-quality Python code
- 2.8. Writing unit tests with py.test
- 2.9. Debugging code with IPython *
Chapter 3 : Mastering the Jupyter Notebook
- 3.1. Teaching programming in the Notebook with IPython blocks
- 3.2. Converting a Jupyter notebook to other formats with nbconvert
- 3.3. Mastering widgets in the Jupyter Notebook
- 3.4. Creating custom Jupyter Notebook widgets in Python, HTML, and JavaScript
- 3.5. Configuring the Jupyter Notebook *
- 3.6. Introducing JupyterLab
Chapter 4 : Profiling and Optimization
- 4.1. Evaluating the time taken by a command in IPython *
- 4.2. Profiling your code easily with cProfile and IPython
- 4.3. Profiling your code line-by-line with line_profiler
- 4.4. Profiling the memory usage of your code with memory_profiler
- 4.5. Understanding the internals of NumPy to avoid unnecessary array copying
- 4.6. Using stride tricks with NumPy
- 4.7. Implementing an efficient rolling average algorithm with stride tricks
- 4.8. Processing large NumPy arrays with memory mapping
- 4.9. Manipulating large arrays with HDF5 *
Chapter 5 : High-Performance Computing
- 5.1. Knowing Python to write faster code
- 5.2. Accelerating pure Python code with Numba and just-in-time compilation
- 5.3. Accelerating array computations with Numexpr
- 5.4. Wrapping a C library in Python with ctypes
- 5.5. Accelerating Python code with Cython
- 5.6. Optimizing Cython code by writing less Python and more C
- 5.7. Releasing the GIL to take advantage of multi-core processors with Cython and OpenMP
- 5.8. Writing massively parallel code for NVIDIA graphics cards (GPUs) with CUDA
- 5.9. Distributing Python code across multiple cores with IPython
- 5.10. Interacting with asynchronous parallel tasks in IPython
- 5.11. Performing out-of-core computations on large arrays with Dask
- 5.12. Trying the Julia programming language in the Jupyter Notebook *
Chapter 6 : Data Visualization
- 6.1. Using matplotlib styles
- 6.2. Creating statistical plots easily with seaborn
- 6.3. Creating interactive Web visualizations with Bokeh and HoloViews
- 6.4. Visualizing a NetworkX graph in the Notebook with D3.js
- 6.5. Discovering interactive visualization libraries in the Notebook *
- 6.6. Creating plots with Altair and the Vega-Lite specification
Chapter 7 : Statistical Data Analysis
- 7.1. Exploring a dataset with pandas and matplotlib
- 7.2. Getting started with statistical hypothesis testing — a simple z-test
- 7.3. Getting started with Bayesian methods
- 7.4. Estimating the correlation between two variables with a contingency table and a chi-squared test
- 7.5. Fitting a probability distribution to data with the maximum likelihood method
- 7.6. Estimating a probability distribution nonparametrically with a kernel density estimation
- 7.7. Fitting a Bayesian model by sampling from a posterior distribution with a Markov Chain Monte Carlo method
- 7.8. Analyzing data with the R programming language in the Jupyter Notebook *
Chapter 8 : Machine Learning
- 8.1. Getting started with scikit-learn
- 8.2. Predicting who will survive on the Titanic with logistic regression *
- 8.3. Learning to recognize handwritten digits with a K-nearest neighbors classifier
- 8.4. Learning from text — Naive Bayes for Natural Language Processing
- 8.5. Using support vector machines for classification tasks
- 8.6. Using a random forest to select important features for regression
- 8.7. Reducing the dimensionality of a dataset with a principal component analysis *
- 8.8. Detecting hidden structures in a dataset with clustering
Chapter 9 : Numerical Optimization
- 9.1. Finding the root of a mathematical function *
- 9.2. Minimizing a mathematical function
- 9.3. Fitting a function to data with nonlinear least squares
- 9.4. Finding the equilibrium state of a physical system by minimizing its potential energy
Chapter 10 : Signal Processing
- 10.1. Analyzing the frequency components of a signal with a Fast Fourier Transform
- 10.2. Applying a linear filter to a digital signal
- 10.3. Computing the autocorrelation of a time series
Chapter 11 : Image and Audio Processing
- 11.1. Manipulating the exposure of an image
- 11.2. Applying filters on an image
- 11.3. Segmenting an image
- 11.4. Finding points of interest in an image
- 11.5. Detecting faces in an image with OpenCV *
- 11.6. Applying digital filters to speech sounds
- 11.7. Creating a sound synthesizer in the Notebook
Chapter 12 : Deterministic Dynamical Systems
- 12.1. Plotting the bifurcation diagram of a chaotic dynamical system
- 12.2. Simulating an elementary cellular automaton
- 12.3. Simulating an ordinary differential equation with SciPy
- 12.4. Simulating a partial differential equation — reaction-diffusion systems and Turing patterns
Chapter 13 : Stochastic Dynamical Systems
- 13.1. Simulating a discrete-time Markov chain
- 13.2. Simulating a Poisson process *
- 13.3. Simulating a Brownian motion
- 13.4. Simulating a stochastic differential equation
Chapter 14 : Graphs, Geometry, and Geographic Information Systems
- 14.1. Manipulating and visualizing graphs with NetworkX *
- 14.2. Drawing flight routes with NetworkX
- 14.3. Resolving dependencies in a directed acyclic graph with a topological sort
- 14.4. Computing connected components in an image
- 14.5. Computing the Voronoi diagram of a set of points
- 14.6. Manipulating geospatial data with Cartopy
- 14.7. Creating a route planner for a road network
Chapter 15 : Symbolic and Numerical Mathematics
- 15.1. Diving into symbolic computing with SymPy
- 15.2. Solving equations and inequalities
- 15.3. Analyzing real-valued functions
- 15.4. Computing exact probabilities and manipulating random variables
- 15.5. A bit of number theory with SymPy
- 15.6. Finding a Boolean propositional formula from a truth table
- 15.7. Analyzing a nonlinear differential system — Lotka-Volterra (predator-prey) equations
- 15.8. Getting started with Sage *
Recipes marked with an asterisk * are only available in the book.
Contributing
For any comment, question, or error, please open an issue or propose a pull request.
Presentation
Python is one of the leading open source platforms for data science and numerical computing. IPython and the associated Jupyter Notebook offer efficient interfaces to Python for data analysis and interactive visualization, and they constitute an ideal gateway to the platform.
IPython Interactive Computing and Visualization Cookbook, Second Edition contains many ready-to-use, focused recipes for high-performance scientific computing and data analysis, from the latest IPython/Jupyter features to the most advanced tricks, to help you write better and faster code. You will apply these state-of-the-art methods to various real-world examples, illustrating topics in applied mathematics, scientific modeling, and machine learning.
The first part of the book covers programming techniques: code quality and reproducibility, code optimization, high-performance computing through just-in-time compilation, parallel computing, and graphics card programming. The second part tackles data science, statistics, machine learning, signal and image processing, dynamical systems, and pure and applied mathematics.