DIVE Backend
The Data Integration and Visualization Engine (DIVE) is a platform for semi-automatically generating web-based, interactive visualizations of structured data sets. Data visualization is a useful method for understanding complex phenomena, communicating information, and informing inquiry. However, available tools for data visualization are difficult to learn and use, and they require a priori knowledge of what visualizations to create. See dive.media.mit.edu for more information.
Development setup involves the following steps:
- Installing system dependencies
- Setting up postgres
- Setting up rabbitMQ
- Starting and entering virtual environment
- Installing python dependencies
- Migrating database
- Starting celery worker
- Starting server
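Before working through these steps, it can help to confirm that the command-line tools they rely on are reachable. The following is a minimal sketch, not part of the DIVE codebase: the helper names `find_on_path` and `missing_tools` are hypothetical, and the tool list is taken from the commands used later in this README.

```python
import os

# Executables invoked by the setup steps in this README.
REQUIRED_TOOLS = ["createuser", "createdb", "rabbitmq-server", "rabbitmqctl", "pip", "celery"]


def find_on_path(name, path=None):
    """Return the full path of executable `name`, or None if it is not found.

    `path` is a PATH-style string; it defaults to the PATH environment variable.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(tools, path=None):
    """Return the entries of `tools` that cannot be found on the search path."""
    return [tool for tool in tools if find_on_path(tool, path) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

Note that celery only appears on the path once the virtual environment is active and the Python dependencies are installed, so a "missing" report for it early in the setup is expected.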
Install System Dependencies (Linux / apt)
$ sudo apt-get update && sudo apt-get install -y postgresql git python2.7 python-pip build-essential python-dev libpq-dev libssl-dev libffi-dev liblapack-dev gfortran libxml2-dev libxslt1-dev rabbitmq-server
Install System Dependencies (Mac / brew)
Install Homebrew if you don't already have it. Then, run the following commands:
$ brew install postgres
$ brew install rabbitmq
OR Install postgres.app
Install postgres.app by following the instructions at http://postgresapp.com/.
Download and open the app to start postgres.
Setup postgres
Make sure that you have a postgres server instance running:
$ postgres -D /usr/local/pgsql/data >logfile 2>&1 &
On Linux, open a login shell as the postgres system user before creating the database:
$ sudo -u postgres -i
Create the dive database by running:
$ createuser admin -P
$ createdb dive -O admin
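Once the database exists, the application needs a connection URI that points at it. The sketch below shows one way such a URI could be assembled for the admin user and dive database created above. The function name `postgres_uri` is hypothetical, the host and port defaults (localhost, 5432) are assumptions about a local install, and where DIVE actually reads its database URI from is not covered here.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2.7


def postgres_uri(user, password, database, host="localhost", port=5432):
    """Build a postgresql:// URI, percent-encoding the credentials.

    Encoding matters when the password contains characters such as '@' or '/',
    which would otherwise be read as URI delimiters.
    """
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, port, database
    )


if __name__ == "__main__":
    # "secret" is a placeholder; use the password entered at the createuser prompt.
    print(postgres_uri("admin", "secret", "dive"))
```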
Start RabbitMQ AMQP Server
- Add the rabbitmq-server executable to your path by adding PATH=$PATH:/usr/local/sbin to ~/.bash_profile or ~/.profile.
- Run the server as a background process: sudo rabbitmq-server -detached
- Create a RabbitMQ user and virtual host:
$ sudo rabbitmqctl add_user admin password
$ sudo rabbitmqctl add_vhost dive
$ sudo rabbitmqctl set_permissions -p dive admin ".*" ".*" ".*"
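Because the server runs detached, a failed start can go unnoticed until the Celery worker cannot connect. The following is a small sketch of a reachability check, assuming RabbitMQ listens on its default AMQP port, 5672, on localhost. The helper name `port_is_open` is hypothetical and not part of DIVE.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    connection.close()
    return True


if __name__ == "__main__":
    # 5672 is RabbitMQ's default AMQP port; adjust if your install differs.
    print("RabbitMQ reachable: {}".format(port_is_open("localhost", 5672)))
```

The same check works for postgres by passing its default port, 5432. It only confirms that something is accepting connections on the port, not that the admin user or dive virtual host are configured correctly.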
Install and Enter Virtual Python Environment
- Installation: See this fine tutorial.
- Starting virtual env: source venv/bin/activate
Install Python Dependencies
Within a virtual environment, install the dependencies listed in requirements.txt. Because of a dependency issue in numexpr, numpy must be installed first.
$ pip install -U numpy && pip install -r requirements.txt
Start Celery Worker
- Start celery worker:
./run_worker.sh
- Start celery monitor (flower):
celery -A base.core flower
Database Migrations
Follow the docs. The first time, run the migration script.
python migrate.py db init
Then, review and edit the migration script. Finally, each time models are changed, run the following:
$ python migrate.py db migrate
$ python migrate.py db upgrade
Run API
- To run the development Flask server: python run_server.py
- To run the production Gunicorn server: ./run_server.sh
Deployment
- Set the environment variables before running any command:
$ source production_env
Building Docker Images
conda env export > environment.yml
conda env create -f environment.yml
conda list -e > conda-requirements.txt
conda create --name dive --file conda-requirements.txt