  • Stars
    262
  • Rank 156,136 (Top 4 %)
  • Language
    Python
  • License
    BSD 3-Clause "New...
  • Created about 10 years ago
  • Updated about 1 year ago

Repository Details

Flexible HDF5 saving/loading and other data science tools from the University of Chicago

deepdish

Flexible HDF5 saving/loading and other data science tools from the University of Chicago. This repository also hosts a Deep Learning blog.

Installation

pip install deepdish

Alternatively (if you have conda with the conda-forge channel):

conda install -c conda-forge deepdish

Main feature

The primary feature of deepdish is its ability to save and load all kinds of data as HDF5. It can save any Python data structure, offering the same ease of use as pickling or numpy.save. However, it improves on both by also offering:

  • Interoperability between languages (HDF5 is a popular standard)
  • Easy inspection of the content from the command line (using h5ls or our specialized tool ddls)
  • Highly compressed storage (thanks to a PyTables backend)
  • Native support for scipy sparse matrices and pandas DataFrame, Series and Panel
  • Ability to partially read files, even slices of arrays

An example:

import numpy as np
import deepdish as dd

d = {
    'foo': np.ones((10, 20)),
    'sub': {
        'bar': 'a string',
        'baz': 1.23,
    },
}
dd.io.save('test.h5', d)

This can be reconstructed using dd.io.load('test.h5'), or inspected through the command line using either a standard tool:

$ h5ls test.h5
foo                      Dataset {10, 20}
sub                      Group

Or, better yet, our custom tool ddls (or python -m deepdish.io.ls):

$ ddls test.h5
/foo                       array (10, 20) [float64]
/sub                       dict
/sub/bar                   'a string' (8) [unicode]
/sub/baz                   1.23 [float64]
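
The ddls listing above shows each nested dictionary key as a slash-separated path. As a rough illustration of that mapping only, the following sketch walks a nested dictionary and produces the same paths. The helper name flatten_paths is hypothetical and is not part of deepdish; it uses plain Python lists in place of the NumPy array and does not touch HDF5 at all.

```python
def flatten_paths(tree, prefix=""):
    """Yield (path, value) pairs for every entry of a nested dict.

    Hypothetical helper for illustration; not part of deepdish.
    """
    for key, value in tree.items():
        path = f"{prefix}/{key}"
        yield path, value
        if isinstance(value, dict):
            # A sub-dictionary plays the role of a group: recurse into it
            yield from flatten_paths(value, path)


d = {
    "foo": [[1.0] * 20 for _ in range(10)],  # stand-in for np.ones((10, 20))
    "sub": {"bar": "a string", "baz": 1.23},
}

paths = [path for path, _ in flatten_paths(d)]
print(paths)  # ['/foo', '/sub', '/sub/bar', '/sub/baz']
```

These are the same paths that appear in the first column of the ddls output.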

Read more at Saving and loading data.

Documentation
