Caption This!

This repository contains source code corresponding to our article "Caption this, with TensorFlow!"

Git Basics

  1. Go to your home directory by opening your terminal and entering cd ~

  2. Clone the repository by entering

    git clone https://github.com/mlberkeley/oreilly-captions.git
    

Docker (highly recommended)

Install Docker by following the platform-specific installation instructions in the Docker documentation. Our IPython (Jupyter) notebooks are compatible with TensorFlow 1.0.

Option A: Use our pre-built Docker image from Docker Hub

  1. After installing Docker, pull a prebuilt image from our Docker Hub by entering:

    docker pull mlatberkeley/showandtell
    

    You will need a Docker Hub account in order to pull the image. If this is your first time pulling an image from Docker Hub, log in to your Docker Hub account from your terminal with docker login and follow the username and password prompts.

  2. To run the pulled image (after cloning the repository), enter

    docker run -it -p 8888:8888 -v <path to repo>:/root mlatberkeley/showandtell
    

    where <path to repo> should be the absolute path to your cloned repository. If you followed our Git Basics section the path should be <path to your home directory>/oreilly-captions.

  3. Once the container has started and you are attached to it, run the provided Jupyter notebooks by entering

    jupyter notebook --ip 0.0.0.0
    

    and navigate to http://0.0.0.0:8888 in your browser (or http://localhost:8888 if your browser does not accept 0.0.0.0).
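
The absolute path required in step 2 can be worked out programmatically. The following is a minimal sketch, assuming the repository was cloned to ~/oreilly-captions as in the Git Basics section; the helper name docker_run_command is hypothetical and not part of this repository. It only builds and prints the command, so you can inspect it before running it yourself.

```python
import os
import shlex


def docker_run_command(repo_dir, image="mlatberkeley/showandtell"):
    """Build the `docker run` argument list from step 2.

    repo_dir may be relative or start with `~`; it is expanded to the
    absolute path that the -v flag expects.
    """
    repo_path = os.path.abspath(os.path.expanduser(repo_dir))
    return [
        "docker", "run", "-it",
        "-p", "8888:8888",            # expose Jupyter's port on the host
        "-v", f"{repo_path}:/root",   # mount the cloned repo at /root
        image,
    ]


# Assumption: the repo lives in your home directory, as in Git Basics.
cmd = docker_run_command("~/oreilly-captions")
print(" ".join(shlex.quote(part) for part in cmd))
```

Copy the printed line into your terminal, or pass cmd to subprocess.run if you prefer to launch the container from Python.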

Option B: Download and build your own Docker image from our GitHub repo

If you want to build a GPU- or CPU-based Docker image of your own, you can use the Dockerfiles provided in the /dockerfiles/ subdirectory of our GitHub repo.

  1. After cloning the repo to your machine, enter

    docker build -t showandtell_<image_type> -f ./dockerfiles/Dockerfile.<image_type> ./dockerfiles/
    

    where <image_type> is either gpu or cpu. (Note that, in order to run these files on your GPU, you'll need to have a compatible GPU, with drivers installed and configured properly as described in TensorFlow's documentation.)

  2. Run the Docker image by entering

    docker run -it -p 8888:8888 -v <path to repo>:/root showandtell_<image_type>
    

    where <image_type> is either gpu or cpu, depending on the image you built in the last step.

  3. After building, starting, and attaching to the appropriate Docker container, run the provided Jupyter notebooks by entering

    jupyter notebook --ip 0.0.0.0
    

    and navigate to http://0.0.0.0:8888 in your browser (or http://localhost:8888 if your browser does not accept 0.0.0.0).

Note: If you are using Docker Toolbox rather than native Docker, you will have to navigate to the daemon IP address (instead of 0.0.0.0) that is displayed right after you start the Docker Quickstart Terminal (for us this was 192.168.99.100) in order to use Jupyter.

Debugging Docker

If you receive an error of the form:

WARNING: Error loading config file:/home/rp/.docker/config.json - stat /home/rp/.docker/config.json: permission denied
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.26/images/json: dial unix /var/run/docker.sock: connect: permission denied

it is most likely because you installed Docker with sudo permissions using a package manager such as brew or apt-get. To resolve the permission error, run Docker with sudo (i.e., run sudo docker <command and options> instead of just docker <command and options>).

The Notebooks

There are three notebooks:

  • 1. O'Reilly Training.ipynb - Contains code to train a TensorFlow caption generator from VGG-16 image embeddings as described in our article. Note: you must run this notebook's train method before running any of the other notebooks, because it generates the mapping between integers and our vocabulary's words that the other notebooks reuse.
  • 2. O'Reilly Generate.ipynb - Contains the same code as 1. O'Reilly Training.ipynb except it introduces functionality to generate captions from an image embedding (as opposed to just being able to train on captions). Functions as a sanity check for the quality of captions we are generating.
  • 3. O'Reilly Generate from image.ipynb - Builds on the previous notebook, except instead of feeding an image embedding to our caption generation model, it first feeds an image to the VGG-16 Convolutional Neural Network to generate an image feature embedding. This gives us an end-to-end pipeline for going from an image to a caption.
  • In order to run the test notebook, edit the image path in the .ipynb file (more details in the notebook itself).
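
The mapping between integers and vocabulary words mentioned for the training notebook can be pictured with a small sketch. This is an illustration under stated assumptions, not the notebook's actual code: the names build_vocab, encode, decode, END_TOKEN, and the min_count threshold are all hypothetical.

```python
from collections import Counter

END_TOKEN = "<end>"  # hypothetical end-of-caption marker, not taken from the notebooks


def build_vocab(captions, min_count=2):
    """Return (word_to_ix, ix_to_word) for words seen at least min_count times."""
    counts = Counter(word for caption in captions for word in caption.lower().split())
    # Deterministic order: most frequent first, ties broken alphabetically.
    kept = sorted(
        (w for w, n in counts.items() if n >= min_count),
        key=lambda w: (-counts[w], w),
    )
    ix_to_word = [END_TOKEN] + kept  # index 0 is reserved for the end marker
    word_to_ix = {word: ix for ix, word in enumerate(ix_to_word)}
    return word_to_ix, ix_to_word


def encode(caption, word_to_ix):
    """Turn a caption into integer ids, dropping out-of-vocabulary words."""
    ids = [word_to_ix[w] for w in caption.lower().split() if w in word_to_ix]
    return ids + [word_to_ix[END_TOKEN]]


def decode(ids, ix_to_word):
    """Turn integer ids back into text, stopping at the end marker."""
    words = []
    for ix in ids:
        if ix_to_word[ix] == END_TOKEN:
            break
        words.append(ix_to_word[ix])
    return " ".join(words)


captions = ["a dog runs", "a dog sleeps", "a cat sleeps"]
word_to_ix, ix_to_word = build_vocab(captions, min_count=2)
ids = encode("a dog runs", word_to_ix)
print(ids, "->", decode(ids, ix_to_word))
```

Because the generation notebooks must turn predicted integers back into words, they need exactly the same mapping that was used during training, which is why the training step has to run first.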

Additional Downloads:

In order to run the first two notebooks, you will need VGG-16 image embeddings for the Flickr-30K dataset. These image embeddings are available from our Google Drive.

Additionally, you will need the corresponding captions for these images (results_20130124.token), which can also be downloaded from our Google Drive.
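
As a rough guide to working with the captions file, the sketch below groups captions by image. It assumes each line has the form `<image file name>#<caption index>`, then a tab, then the caption text; verify this against your downloaded copy of results_20130124.token before relying on it. The function name parse_caption_lines and the sample lines are made up for illustration.

```python
from collections import defaultdict


def parse_caption_lines(lines):
    """Group captions by image name.

    Assumed line format: "<image>#<n>\t<caption>".
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, caption = line.partition("\t")
        image_name = key.split("#")[0]  # drop the "#<n>" caption index
        captions[image_name].append(caption.strip())
    return dict(captions)


# Made-up sample lines in the assumed format.
sample = [
    "img1.jpg#0\tA dog runs .\n",
    "img1.jpg#1\tA brown dog .\n",
    "img2.jpg#0\tA cat sleeps .\n",
]
parsed = parse_caption_lines(sample)
print(parsed["img1.jpg"])
```

With the real file, the same function can be given an open file handle, for example the result of opening ./data/results_20130124.token in text mode.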

In order to run the 3. O'Reilly Generate from image.ipynb notebook you will need to download a pretrained TensorFlow model for VGG-16 generated from the original Caffe model from the VGG-16 paper.

Place all of these downloads in the ./data/ directory.

Pretrained Weights:

We've trained the caption generator for 500 epochs (without training VGG-16 end to end) and placed the resulting checkpoint files in ./models/tensorflow. With these weights you should see an average reconstruction loss of roughly 1.75-1.85.
