


python-mnist

Simple MNIST and EMNIST data parser written in pure Python.

MNIST is a database of handwritten digits available at http://yann.lecun.com/exdb/mnist/. EMNIST is an extended MNIST database available at https://www.nist.gov/itl/iad/image-group/emnist-dataset.

Requirements

  • Python 2 or Python 3

Usage

  • git clone https://github.com/sorki/python-mnist

  • cd python-mnist

  • Get MNIST data:

    ./bin/mnist_get_data.sh
    
  • Check preview with:

    PYTHONPATH=. ./bin/mnist_preview
    

Installation

Get the package from PyPI:

pip install python-mnist

or install with setup.py:

python setup.py install

Code sample:

from mnist import MNIST
mndata = MNIST('./dir_with_mnist_data_files')
images, labels = mndata.load_training()

To enable loading of gzipped files use:

mndata.gz = True

The library tries to load files named t10k-images-idx3-ubyte, train-labels-idx1-ubyte, train-images-idx3-ubyte and t10k-labels-idx1-ubyte. If loading throws an exception, check that your file names match these.
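A quick way to check the names before loading is sketched below. The helper missing_mnist_files is hypothetical and not part of python-mnist; the '.gz' suffix used when gz=True is an assumption about how gzipped files are named.

```python
import os

# File names the text above says the library looks for.
EXPECTED_FILES = (
    "train-images-idx3-ubyte",
    "train-labels-idx1-ubyte",
    "t10k-images-idx3-ubyte",
    "t10k-labels-idx1-ubyte",
)


def missing_mnist_files(data_dir, gz=False):
    """Return the expected file names that are absent from data_dir.

    Hypothetical helper. Assumption: gzipped files carry a '.gz' suffix.
    """
    suffix = ".gz" if gz else ""
    return [
        name + suffix
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(data_dir, name + suffix))
    ]
```

An empty list means all four files are present under the expected names.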

EMNIST

  • Get EMNIST data:

    ./bin/emnist_get_data.sh
    
  • Check preview with:

    PYTHONPATH=. ./bin/emnist_preview
    

To use EMNIST datasets you need to call:

mndata.select_emnist('digits')

Here digits is one of the available EMNIST datasets. You can choose from:

  • balanced
  • byclass
  • bymerge
  • digits
  • letters
  • mnist
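If the dataset name comes from user input, it may help to validate it against the list above before calling select_emnist. This is a minimal sketch; select_emnist_checked is a hypothetical wrapper, not part of the library, and it only assumes that the loader object has the select_emnist method shown above.

```python
# Dataset names copied from the list above.
EMNIST_DATASETS = ("balanced", "byclass", "bymerge", "digits", "letters", "mnist")


def select_emnist_checked(mndata, name):
    """Validate name, then call mndata.select_emnist(name).

    Hypothetical wrapper: mndata is any object with a select_emnist method.
    """
    if name not in EMNIST_DATASETS:
        raise ValueError(
            "unknown EMNIST dataset %r, expected one of: %s"
            % (name, ", ".join(EMNIST_DATASETS))
        )
    mndata.select_emnist(name)
    return mndata
```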

The EMNIST loader uses gzipped files by default; this can be disabled by setting:

mndata.gz = False

If you disable gzip, you also need to unpack the EMNIST files yourself, as the bin/emnist_get_data.sh script won't do it for you. The EMNIST loader also needs to mirror and rotate images, so it is a bit slower. If this is an issue for you, repack the data to avoid mirroring and rotation on each load.
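To illustrate what mirroring and rotating a flattened image involves, here is a minimal sketch in pure Python. The function names are hypothetical, and the mirror axis (left to right) and rotation direction (90 degrees counter-clockwise) are assumptions rather than a description of the loader's actual code. Under those assumptions the two steps compose to a plain transpose, which is what repacked data would need to have applied once up front.

```python
def mirror_horizontal(pixels, rows, cols):
    """Flip a row-major flattened image left to right."""
    return [pixels[r * cols + (cols - 1 - c)]
            for r in range(rows) for c in range(cols)]


def rotate_ccw(pixels, rows, cols):
    """Rotate a row-major flattened image 90 degrees counter-clockwise.

    The result has `cols` rows and `rows` columns.
    """
    return [pixels[c * cols + (cols - 1 - r)]
            for r in range(cols) for c in range(rows)]


def transpose_flat(pixels, rows, cols):
    """Transpose a row-major flattened image in a single pass."""
    return [pixels[c * cols + r]
            for r in range(cols) for c in range(rows)]
```

For square 28x28 images the output shape is unchanged, so a single transpose_flat pass could replace the two separate steps.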

Notes

This package doesn't use numpy, by design: when I tried to find a working implementation, all of them were based on some archaic version of numpy and none of them worked. This package loads the data files with struct.unpack instead.
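The following is a rough sketch of how IDX data can be read with struct.unpack alone. It is not the package's actual code: parse_labels and parse_images are hypothetical names, and the sketch assumes the standard IDX layout of big-endian 32-bit header fields (magic number 2049 for labels, 2051 for images) followed by unsigned bytes.

```python
import struct

LABEL_MAGIC = 2049  # magic number of idx1-ubyte label files
IMAGE_MAGIC = 2051  # magic number of idx3-ubyte image files


def parse_labels(data):
    """Parse the raw bytes of a label file into a list of ints."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != LABEL_MAGIC:
        raise ValueError("bad label magic number: %d" % magic)
    return list(struct.unpack(">%dB" % count, data[8:8 + count]))


def parse_images(data):
    """Parse the raw bytes of an image file.

    Returns (images, rows, cols); each image is a flat list of pixel values.
    """
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError("bad image magic number: %d" % magic)
    size = rows * cols
    images = []
    for i in range(count):
        start = 16 + i * size
        images.append(list(struct.unpack(">%dB" % size, data[start:start + size])))
    return images, rows, cols
```

Both functions take the full file contents as bytes, for example from open(path, 'rb').read().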

Example

$ PYTHONPATH=. ./bin/mnist_preview
Showing num: 3

............................
............................
............................
............................
............................
............................
.............@@@@@..........
..........@@@@@@@@@@........
.......@@@@@@......@@.......
.......@@@........@@@.......
.................@@.........
................@@@.........
...............@@@@@........
.............@@@............
.............@.......@......
.....................@......
.....................@@.....
....................@@......
...................@@@......
.................@@@@.......
................@@@@........
....@........@@@@@..........
....@@@@@@@@@@@@............
......@@@@@@................
............................
............................
............................
............................
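A preview like the one above can be produced from a flat list of pixel values with a few lines of Python. This is only a sketch: render_ascii is a hypothetical name, and the threshold of 200 is an assumed cut-off, not necessarily the value the preview script uses.

```python
def render_ascii(pixels, width=28, threshold=200):
    """Render a flat list of 0-255 pixel values as rows of '.' and '@'.

    Pixels brighter than `threshold` (an assumed value) are drawn as '@'.
    """
    lines = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        lines.append("".join("@" if p > threshold else "." for p in row))
    return "\n".join(lines)
```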
