• Stars
    star
    261
  • Rank 156,630 (Top 4 %)
  • Language
    Jupyter Notebook
  • License
    Apache License 2.0
  • Created over 7 years ago
  • Updated over 2 years ago

Repository Details

Repeatable analysis plugin for Jupyter notebook

Nodebook

Nodebook is a plugin for Jupyter Notebook designed to enforce an ordered flow of cell execution. Conceptually, a Nodebook notebook operates like a script in which each cell depends only on the cells above it. This prevents the messy, difficult-to-maintain out-of-order execution that frequently occurs in vanilla Jupyter notebooks, where every cell modifies the global state. For more information, see this post.
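
To make the "each cell depends on the cells above it" idea concrete, here is a toy model in plain Python. It is only a sketch of the concept and not Nodebook's implementation: the run_cells helper and the list of cell sources are hypothetical, and it simply re-runs the preceding cells rather than caching their outputs.

```python
import copy


def run_cells(cells, upto):
    """Run cells[0..upto] in order, each on a copy of the state left by the cell above."""
    state = {}
    for source in cells[: upto + 1]:
        # Copy the state so a cell never mutates what an earlier cell produced.
        state = copy.deepcopy(state)
        exec(source, {}, state)
    return state


# Three hypothetical notebook cells, top to bottom.
cells = ["x = 1", "y = x + 1", "x = 10"]

# The result of the second cell depends only on the cells above it,
# so the third cell's reassignment of x cannot affect it.
after_second = run_cells(cells, 1)
after_third = run_cells(cells, 2)
```

In this model, after_second always holds x = 1 and y = 2 no matter how often the third cell has been run, which is the behavior a script would have.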

Installation

Nodebook is available on PyPI and can be installed with pip. Additionally, the Jupyter extension must be registered:

pip install nodebook
jupyter nbextension install --py nodebook

Usage

To use Nodebook, add the following lines to a cell in your Jupyter notebook:

#pragma nodebook off
%load_ext nodebook.ipython
%nodebook {mode} {name}

Where {mode} is one of memory or disk, and {name} is a unique identifier for your notebook.

Mode determines whether variables are stored in memory or on disk.

For additional example usage, see nodebook_demo.ipynb. Also see below for a quick demo showing the basic difference in behavior between Nodebook and standard Jupyter:

[Image: demo]

FAQ

Q: Should I use Python 2 or Python 3 with Nodebook?

There has been an increasing consensus toward sunsetting support for Python 2, including in Project Jupyter. Nodebook currently supports both Python 2 and Python 3, but Python 3 is preferred.

Q: Why am I seeing "ERROR:root:Cell magic %%execute_cell not found."?

Nodebook loads a JavaScript plugin to modify Jupyter's behavior. If you encounter this error, it means that the JavaScript plugin is loaded but the IPython plugin is not. This can happen when the JavaScript is already loaded but you have restarted the kernel and haven't run %nodebook {mode} {name}. The solution is either to run the %nodebook magic (if you want to use Nodebook) or to delete the cell with the %nodebook magic and refresh your browser to unload the JavaScript (if you want to turn Nodebook off).

Q: What are the tradeoffs between memory and disk mode?

Nodebook serializes all cell outputs to maintain consistent state between cells. In memory mode, objects are serialized to an in-memory dictionary; in disk mode, objects are serialized to a directory within your notebook's working directory. Speed can be a factor when choosing between them, but on a modern SSD, serialization time generally dominates, so memory and disk mode have similar performance. The main consideration is that disk mode persists your environment when the Python kernel is restarted, at the cost of leaving behind a directory on your local filesystem that you may want to clean up manually later (this can add up, especially if you are working with large objects in your notebook).
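
The tradeoff can be illustrated with a minimal sketch of the two storage strategies. The MemoryStore and DiskStore classes below are hypothetical and use the standard pickle module for illustration; Nodebook's own storage classes and serializer may differ.

```python
import os
import pickle
import tempfile


class MemoryStore:
    """Keeps pickled objects in a dict; contents vanish with the process."""

    def __init__(self):
        self._data = {}

    def save(self, key, obj):
        self._data[key] = pickle.dumps(obj)

    def load(self, key):
        return pickle.loads(self._data[key])


class DiskStore:
    """Keeps pickled objects as files in a directory; contents outlive the process."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.directory, key + ".pkl")

    def save(self, key, obj):
        with open(self._path(key), "wb") as f:
            pickle.dump(obj, f)

    def load(self, key):
        with open(self._path(key), "rb") as f:
            return pickle.load(f)


workdir = tempfile.mkdtemp()  # stand-in for the notebook's working directory
memory = MemoryStore()
disk = DiskStore(workdir)
for store in (memory, disk):
    store.save("cell_1", {"x": [1, 2, 3]})

# A new DiskStore on the same directory still finds the data, which mimics
# a kernel restart. A new MemoryStore would start out empty.
restarted = DiskStore(workdir)
restored = restarted.load("cell_1")
```

Both stores pay the same pickling cost on every save and load, which is why the two modes tend to perform similarly. The files left in workdir are the kind of leftover that disk mode asks you to clean up yourself.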

Q: What are the limitations of Nodebook?

While Nodebook supports most Python operations, it has a few limitations related to the use of serialization. First, not all objects are currently serializable, most notably generators. Second, serialization adds some extra time. This is imperceptible for small objects, but is noticeable for objects larger than a few hundred MB. Instead of working directly with very large objects in Nodebook, I recommend using it to prototype your analysis on a subset of data.
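
The generator limitation is easy to reproduce outside Nodebook. The sketch below assumes CPython and the standard pickle module (Nodebook's serializer may behave differently), and the squares function is a made-up example. One possible workaround is to materialize the generator into a list before the end of the cell.

```python
import pickle


def squares(n):
    """Hypothetical generator used only for this illustration."""
    for i in range(n):
        yield i * i


# In CPython, generator objects cannot be pickled.
try:
    pickle.dumps(squares(4))
    generator_pickled = True
except TypeError:
    generator_pickled = False

# Materializing the values into a list gives an object that round-trips fine.
materialized = list(squares(4))
roundtrip = pickle.loads(pickle.dumps(materialized))
```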

More Repositories

1

pyxley

Python helpers for building dashboards using Flask and React
JavaScript
2,270
star
2

hamilton

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
Python
863
star
3

stitches

Create a Microservice in Rails with minimal ceremony
Ruby
552
star
4

fauxtograph

Tools for using a variational auto-encoder for latent image encoding and generation.
Jupyter Notebook
226
star
5

d3-jupyter-tutorial

JavaScript
197
star
6

flotilla-os

Open source Flotilla
Go
192
star
7

immutable-struct

Create struct-like classes that don't have setters, but have an awesome constructor.
Ruby
171
star
8

algorithms-tour

How data science is woven into the fabric of Stitch Fix
HTML
169
star
9

pyxleyJS

Collection of React components for dashboards
JavaScript
155
star
10

Algorithms-Notebooks

Algorithm's team Jupyter Notebooks
Jupyter Notebook
113
star
11

diamond

Python solver for mixed-effects models
Python
98
star
12

colornamer

Given a color, return a hierarchy of names.
Python
89
star
13

resque-brain

NOT MAINTAINED [ Better resque-web that can monitor multiple Resque's in one place ]
Ruby
57
star
14

hello-scrollytelling

A bare-bones version of the scrollytelling framework used in the Algorithms Tour
HTML
52
star
15

pwwka

Interact with RabbitMQ to transmit and receive messages in an easy low-configuration way.
Ruby
51
star
16

mab

Library for multi-armed bandit selection strategies, including efficient deterministic implementations of Thompson sampling and epsilon-greedy.
Go
49
star
17

NTFLib

Sparse Beta-Divergence Tensor Factorization Library
Python
47
star
18

splits

A Python library for dealing with splittable files
Python
42
star
19

context2vec

Using Word2Vec on lists and sets
Python
34
star
20

seetd

Seating optimization
Python
23
star
21

extra_extra

Manage in-app release notes for your Rails application using Markdown
Ruby
20
star
22

tech_radar

Rails engine to manage your team's own Technology Radar
Ruby
16
star
23

MomentMixedModels

A Spark/Scala package for Moment-Based Estimation For Hierarchical Models
Scala
15
star
24

stitchfix.github.io

CSS
15
star
25

merch_calendar

Calculations around the National Retail Federation's 4-5-4 calendar
Ruby
14
star
26

s3drive

S3 backed ContentsManager for jupyter notebooks
Python
13
star
27

resqutils

Handy methods, classes, and test support for applications that use Resque
Ruby
5
star
28

go-postgres-testdb

Library for managing ephemeral test databases in Postgres
Go
3
star
29

redis_ui_rails

A Rails engine for inspecting your Redis instances
Ruby
3
star
30

fittings

Fork of mc-settings, which is a “convenient way to manage ruby application settings/configuration across multiple environments”
Ruby
3
star
31

librato

Librato client library for go
Go
1
star
32

arboreal

Tree based modeling for humans
Python
1
star