☁️ 👀 💬 Visual Chatbot

Demo for the paper (now upgraded to PyTorch; for the Lua-Torch version, please see commit).

Visual Dialog (CVPR 2017 Spotlight)
Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra
Arxiv Link: arxiv.org/abs/1611.08669
Live demo: http://visualchatbot.cloudcv.org

Introduction

Visual Dialog requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Given an image, dialog history, and a follow-up question about the image, the AI agent has to answer the question. Putting it all together, we demonstrate the first ‘visual chatbot’!
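
To make the task inputs concrete, here is a minimal sketch of how one conversation about one image could be represented. The class name `DialogState`, its fields, and the flattened text layout are hypothetical illustrations, not the data structures used by this repository.

```python
class DialogState:
    """Hypothetical container for one conversation about one image."""

    def __init__(self, image_path, caption):
        self.image_path = image_path
        self.caption = caption
        self.history = []  # list of (question, answer) pairs, oldest first

    def add_round(self, question, answer):
        """Record one completed question/answer round."""
        self.history.append((question, answer))

    def as_prompt(self, question):
        """Flatten caption, past rounds and the new question into one string."""
        lines = ["caption: " + self.caption]
        for past_question, past_answer in self.history:
            lines.append("Q: " + past_question + " A: " + past_answer)
        lines.append("Q: " + question)
        return "\n".join(lines)
```

The answering model would then receive the image features together with this history and the follow-up question; how the demo actually encodes them is defined by the VisDial model code, not by this sketch.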

What has changed since the last version?

The model-building code has been moved entirely to PyTorch. We have added a much improved Bottom-Up Top-Down (BUTD) captioning model from Pythia and a Mask-RCNN feature extractor from maskrcnn-benchmark. The VisDial model is borrowed from the visdial-challenge-starter code.

Please follow the instructions below to get the demo running on your local machine. For the previous version of this repository, which supports Torch-Lua based models, see commit.

Setup and Dependencies

Start by installing the build essentials, Redis Server and RabbitMQ Server.

sudo apt-get update

# download and install build essentials
sudo apt-get install -y git python-pip python-dev
sudo apt-get install -y autoconf automake libtool 
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler

# download and install redis-server and rabbitmq-server
sudo apt-get install -y redis-server rabbitmq-server
sudo rabbitmq-plugins enable rabbitmq_management
sudo service rabbitmq-server restart 
sudo service redis-server restart
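
Once both services have been restarted, a quick way to confirm they are listening is to try a TCP connection to each one. The sketch below assumes the stock default ports (6379 for Redis, 5672 for RabbitMQ AMQP, 15672 for the RabbitMQ management UI) and a local install; the helper names are illustrative and not part of this repository.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Assumed default ports; adjust if your configuration differs.
DEFAULT_SERVICES = {
    "redis": 6379,
    "rabbitmq": 5672,
    "rabbitmq-management": 15672,
}


def check_services(host="127.0.0.1", services=None):
    """Map each service name to True/False depending on port reachability."""
    if services is None:
        services = DEFAULT_SERVICES
    return {name: is_port_open(host, port) for name, port in services.items()}

# Example: print(check_services())
```

An open port only shows that something is listening there; it does not prove the service is healthy.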

Environment Setup

You can use Anaconda or Miniconda to set up this code base. Download and install an Anaconda or Miniconda distribution based on Python 3 from their downloads page, then proceed below.

# clone the repository and download its submodules
git clone https://github.com/Cloud-CV/visual-chatbot.git
cd visual-chatbot/
git submodule update --init --recursive

# create and activate new environment
conda create -n vischat python=3.6.8
conda activate vischat

# install the requirements of chatbot and visdial-starter code
pip install -r requirements.txt

Downloads

Download the BUTD, Mask-RCNN and VisDial model checkpoints and their configuration files.

sh viscap/download_models.sh

Install Submodules

Install Pythia to use the BUTD captioning model, and maskrcnn-benchmark for feature extraction.

# install fastText (dependency of pythia)
cd viscap/captioning/fastText
pip install -e .

# install pythia for using butd model
cd ../pythia/
sed -i '/torch/d' requirements.txt
pip install -e .

# install maskrcnn-benchmark for feature extraction
cd ../vqa-maskrcnn-benchmark/
python setup.py build
python setup.py develop
cd ../../../
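
After the editable installs finish, it can help to confirm that the packages are visible to the active environment. The sketch below only checks whether top-level modules can be located. The module names in the usage comment (`fastText`, `pythia`, `maskrcnn_benchmark`) are assumptions about what these submodules install and may differ for your checkout.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located.

    Only top-level names are supported; dotted names are not handled here.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]

# Example (module names are assumptions, adjust as needed):
# print(missing_modules(["fastText", "pythia", "maskrcnn_benchmark"]))
```

An empty list means every listed module was found; it does not guarantee that compiled extensions inside them load correctly.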

CUDA Installation

Note: CUDA and cuDNN are only required if you are going to use a GPU. Download and install CUDA and cuDNN from the NVIDIA website.
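
To see whether PyTorch can actually use the GPU after installation, a small check like the following may help. It is a sketch that falls back to the CPU when PyTorch is missing or reports no CUDA device; the helper names are illustrative, and the demo's own device selection may work differently.

```python
def cuda_available():
    """Return True only if PyTorch imports and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def pick_device():
    """Return the device string to use: 'cuda' when available, else 'cpu'."""
    return "cuda" if cuda_available() else "cpu"

# Example: print(pick_device())
```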

NLTK

We use PunktSentenceTokenizer from nltk; download it if you haven't already.

python -c "import nltk; nltk.download('punkt')"

Let's run this now!

Setup the database

# create the database
python manage.py makemigrations chat
python manage.py migrate

Run server and worker

Launch two separate terminals and run the worker in one and the server in the other.

# run rabbitmq worker on first terminal
# warning: on the first run, a GloVe file (~860 MB) is downloaded; this is a one-time download
python worker_viscap.py

# run development server on second terminal
python manage.py runserver

You are all set now. Visit http://127.0.0.1:8000 and you will have your demo running successfully.
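
If the page does not load, it can be useful to check from a script whether the development server is responding at all. The sketch below uses only the standard library and treats any HTTP response, including an error status, as "up". The function name is illustrative, and proxies are deliberately bypassed on the assumption that the server is local.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://127.0.0.1:8000", timeout=2.0):
    """Return True if an HTTP request to url receives any HTTP response."""
    # Empty ProxyHandler: ignore proxy environment variables for local checks.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False

# Example: print(server_is_up())
```

This only confirms that the web server is reachable; the worker in the other terminal still has to be running for the chatbot to answer.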

Issues

If you run into incompatibility issues, please take a look here and here.

Model Checkpoint and Features Used

Performance on v1.0 test-std (trained on v1.0 train + val):

Model                        R@1     R@5     R@10    MeanR    MRR     NDCG
lf-gen-mask-rcnn-x101-demo   0.3930  0.5757  0.6404  18.4950  0.4863  0.5967
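
For readers unfamiliar with the retrieval columns, the sketch below computes R@1, R@5, R@10, mean rank and MRR from the 1-based rank that a model assigns to the ground-truth answer in each dialog round. It follows the standard definitions and is not the evaluation code used to produce the numbers above. NDCG is omitted because it needs per-candidate relevance scores rather than a single rank.

```python
def retrieval_metrics(gt_ranks):
    """Compute R@1, R@5, R@10, mean rank and MRR from 1-based ranks."""
    if not gt_ranks:
        raise ValueError("gt_ranks must be non-empty")
    n = len(gt_ranks)
    # R@k: fraction of rounds where the ground truth is ranked in the top k.
    metrics = {"r@%d" % k: sum(r <= k for r in gt_ranks) / n for k in (1, 5, 10)}
    # Mean rank: lower is better.
    metrics["mean"] = sum(gt_ranks) / n
    # Mean reciprocal rank: higher is better.
    metrics["mrr"] = sum(1.0 / r for r in gt_ranks) / n
    return metrics

# Example: retrieval_metrics([1, 3, 12, 2])
```

With ranks `[1, 3, 12, 2]`, one of four rounds has the answer at rank 1 (R@1 = 0.25), three of four within the top 5 and top 10 (0.75 each), and the mean rank is 4.5.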

Extracted features from VisDial v1.0 used to train the above model are here:

Note: In the above features, the key image_id (used in earlier versions) has been renamed to image_ids.
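
Code that has to read both older and newer feature files can look up the new key first and fall back to the old one. This is a minimal sketch for any mapping-like container that supports `in` and indexing; the helper name is hypothetical, and the actual file format of the features is not assumed here.

```python
def get_image_ids(features):
    """Return image ids using the new key, falling back to the old one."""
    for key in ("image_ids", "image_id"):
        if key in features:
            return features[key]
    raise KeyError("neither 'image_ids' nor 'image_id' found")

# Example: get_image_ids({"image_ids": [101, 102]})
```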

Cite this work

If you find this code useful, consider citing our work:

@inproceedings{visdial,
  title={{V}isual {D}ialog},
  author={Abhishek Das and Satwik Kottur and Khushi Gupta and Avi Singh
    and Deshraj Yadav and Jos\'e M.F. Moura and Devi Parikh and Dhruv Batra},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2017}
}

Contributors

License

BSD

Credits and Acknowledgements
