  • Stars: 358
  • Rank 118,120 (Top 3 %)
  • Language: Python
  • Created over 11 years ago
  • Updated 2 months ago

Repository Details

The Django source code for the GovTrack.us website.

GovTrack website frontend

This repo contains the source code of the front-end for www.GovTrack.us. The data-gathering scripts are elsewhere.

Local Development

Development using Vagrant

GovTrack.us is based on Python 3 and Django 2.1 and runs on Ubuntu 18.04 or OS X. To simplify local development, this directory includes a Vagrantfile. To get started, install Vagrant and run:

# Get this repo (you must clone with `--recursive`)
git clone --recursive https://github.com/govtrack/govtrack.us-web.git

# Change to this repo's directory.
cd govtrack.us-web

# Start Vagrant.
vagrant up

# Create your initial user.
vagrant ssh -- -t ./manage.py createsuperuser

# Start debug server.
vagrant ssh -- -t ./manage.py runserver 0.0.0.0:8000

# Visit the website in your browser at http://localhost:8000!

# Stop the virtual machine when you are done.
vagrant suspend

# Destroy the virtual machine when you are done working on GovTrack for good (or when you want your disk space back).
vagrant destroy

Even though the site is running in the virtual machine, it is using the source files on your host computer. So you can open up the files that you got from this repository in your favorite text editor like normal and the virtual machine will see your changes. When you edit .py files, runserver will automatically restart to re-load the code. The site's database and search indexes are also stored on the host machine so they will be saved even when you destroy your vagrant box.

See the Configuration section below for optional settings.

Development without Vagrant

To set up GovTrack development without a virtual machine, get the source code in this repository (use --recursive, as mentioned above). Then follow the provisioning steps in our Vagrantfile, running the same commands on your own command line.

At the end:

# Create your initial user.
./manage.py createsuperuser

# Start the debug server.
./manage.py runserver

Getting test data

The Vagrantfile automatically loads current legislator information from the live site. The site draws on about a dozen different data sources.

Bills & votes

To get bill and vote data, you'll need to run the "congress" project scrapers.

If you used Vagrant, use vagrant ssh to go into the virtual machine. Otherwise, perform these steps in this project's main directory:

sudo apt install python-dev libxml2-dev libxslt1-dev libz-dev python-pip # see congress project deps
git clone https://github.com/unitedstates/congress congress-project
cd congress-project/
pip2 install -r requirements.txt 
python2 run govinfo --bulkdata=BILLSTATUS --congress=116
python2 run govinfo --collections=BILLS --extract=pdf --years=2020 
python2 run bills --log=debug --govtrack
python2 run votes --log=debug --govtrack
cd ..
mkdir -p local
echo "CONGRESS_DATA_PATH=congress-project/data" >> local/settings.env
mkdir -p data/historical-committee-membership
echo "<stub/>" > data/historical-committee-membership/116.xml
./parse.py bill
./parse.py vote

Configuration

Some features of the site require additional configuration. To set configuration variables, create a file named local/settings.env and set any of the following optional variables (defaults are shown where applicable):

# Database server.
# See https://github.com/kennethreitz/dj-database-url
DATABASE_URL=sqlite:///local/database.sqlite...

# Memcached server.
# See https://github.com/ghickman/django-cache-url#supported-caches
CACHE_URL=locmem://opendataiscool

# Search server.
# See https://github.com/simpleenergy/dj-haystack-url#url-schema
#
# For local development you may want to use the (default) Xapian search engine, e.g.:
# xapian:/home/username/govtrack.us-web/xapian_index_person
# You'll need to `apt-get install python3-xapian` and `pip install xapian-haystack`
# or see https://github.com/notanumber/xapian-haystack.
#
# For a production deployment you may want to use Solr instead, e.g.:
# solr:http://localhost:8983/solr/person
#
# You can also specify 'simple' to have a dummy search backend that
# does not actually index or search anything.
HAYSTACK_PERSON_CONNECTION=xapian:local/xapian_index_person
HAYSTACK_BILL_CONNECTION=xapian:local/xapian_index_bill

# Django uses a secret key to provide cryptographic signing. It should be random
# and kept secure. You can generate a key with `./manage.py generate_secret_key`
SECRET_KEY=(randomly generated on each run if not specified)

See settings.env.template for details, especially for values used in production.
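To make the "optional variables with defaults" behavior concrete, here is a minimal sketch of how a KEY=VALUE file like local/settings.env could be read and merged over defaults. This is an illustration only: the function name `parse_settings_env` and the simple parsing rules (one variable per line, `#` comments) are assumptions, and the site's actual settings loader may behave differently. The default values are copied from the listing above.

```python
def parse_settings_env(text, defaults=None):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments.

    Hypothetical helper for illustration; not GovTrack's own loader.
    """
    settings = dict(defaults or {})
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line; ignore it
        settings[key.strip()] = value.strip()
    return settings


# Defaults taken from the configuration listing above.
DEFAULTS = {
    "CACHE_URL": "locmem://opendataiscool",
    "HAYSTACK_PERSON_CONNECTION": "xapian:local/xapian_index_person",
    "HAYSTACK_BILL_CONNECTION": "xapian:local/xapian_index_bill",
}

# Example file contents (assumed for this sketch).
example = """
# Bill and vote data scraped by the congress project.
CONGRESS_DATA_PATH=congress-project/data

# Use the dummy search backend for people.
HAYSTACK_PERSON_CONNECTION=simple
"""

settings = parse_settings_env(example, DEFAULTS)
```

Variables present in the file override the defaults, and anything left out keeps its default value.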

Additionally, some data files are stored in separate repositories and must be obtained and the path configured in settings.env:

  • congress project bill status data (etc.)
  • congress-legislators data
  • legislator photos (static/legislator-photos is symlinked to ../data/legislators-photos/photos, so this must go in data for now)
  • GovTrack's misconduct and name pronunciation repositories

Credits

Emoji icons by http://emojione.com/developers/.

Production Deployment Notes

Additional package installation notes are in the Vagrantfile.

You'll need a data directory that contains:

  • analysis (the output of our data analyses)
  • congress (a symbolic link to the congress project's data directory, holding bill and legislator data, some of which can't be reproduced because the source data is gone; also set CONGRESS_DATA_PATH=data/congress in local/settings.env)
  • congress-bill-text-legacy (a final copy of HTML bill text scraped from the old THOMAS.gov, for bills before XML bill text started)
  • historical-committee-membership (past committee membership, snapshots of earlier data)
  • legislator-photos (manually collected photos of legislators; there's a symbolic link from static/legislator-photos to legislator-photos/photos)
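
Before starting the site, it can help to confirm that the data directory has the entries listed above. The following is a small sketch of such a check, not part of the GovTrack codebase; the function name `missing_data_entries` is hypothetical, and the entry names are taken from the list above.

```python
from pathlib import Path

# Entry names copied from the list above.
EXPECTED_ENTRIES = (
    "analysis",
    "congress",
    "congress-bill-text-legacy",
    "historical-committee-membership",
    "legislator-photos",
)


def missing_data_entries(data_dir):
    """Return the expected entries that are not directories under data_dir.

    Path.is_dir() follows symbolic links, so a `congress` symlink that
    points at a real directory counts as present.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).is_dir()]
```

Calling `missing_data_entries("data")` from the project directory returns an empty list when everything is in place.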

You'll need several other data repositories that you can put in the data directory if you don't expose the whole directory over HTTP, but they can also be placed anywhere because the paths are in settings:

At this point you should be able to run ./manage.py runserver and test that the site works.

And conf/uwsgi_start test 1 should start the uWSGI application daemon.

Install nginx, supervisord (which keeps the uWSGI process running), and certbot and set up their configuration files:

apt install nginx supervisor certbot python3-certbot-nginx
rm /etc/nginx/sites-enabled/default
ln -s /home/govtrack/web/conf/nginx.conf /etc/nginx/sites-enabled/www.govtrack.us.conf
ln -s /home/govtrack/web/conf/supervisor.conf /etc/supervisor/conf.d/govtrack.conf
# install a TLS certificate at /etc/ssl/local/ssl_certificate.{key,crt} (e.g. https://gist.github.com/JoshData/49eff618f84ce4890697d65bcb740137)
mkdir /var/cache/nginx/www.govtrack.us
service nginx restart
service supervisor restart
certbot # and follow prompts, but without the HTTP redirect because we already have it

To scrape and load new data, you'll need to set up the congress project.

  • Clone the congress project repo anywhere and install it into the website's Python virtual environment (pip install ../path/to/congress).
  • Make a new working directory that the scripts will be run in. Set that directory as CONGRESS_PROJECT_PATH in GovTrack's local/settings.env.
  • Symlink the data/congress data directory as the data directory inside the congress project working directory.
  • Clone the congress-legislators project as a subdirectory of the congress project working directory and follow its installation steps to create a separate Python 3 virtualenv for its scripts in its scripts/.env directory.
  • Try launching the scrapers from the GovTrack directory: ./run_scrapers.py people, ./run_scrapers.py committees, etc.
  • Enable the crontab.
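
The symlink step in the list above can be sketched as follows. This is a hedged illustration rather than a script from the repository: the function name `link_congress_data` and both of its parameters are hypothetical, and the directory layout (data/congress under the GovTrack directory, a `data` link inside the working directory) is taken from the steps above.

```python
from pathlib import Path


def link_congress_data(govtrack_dir, working_dir):
    """Create working_dir/data as a symlink to govtrack_dir/data/congress.

    Safe to call again: an existing link to the same place is left alone,
    and anything else already at that path raises FileExistsError.
    """
    target = (Path(govtrack_dir) / "data" / "congress").resolve()
    link = Path(working_dir) / "data"

    if link.is_symlink():
        if link.resolve() == target:
            return link  # already set up
        raise FileExistsError(f"{link} points somewhere else")
    if link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")

    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(target, target_is_directory=True)
    return link
```

Refusing to overwrite an existing path is a deliberate choice here, since the data directory holds material that the notes above say cannot be reproduced.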

The crontab sends the outputs of the commands to Josh, so the server needs a sendmail-like command. The easiest to set up is msmtp, like so:

apt install msmtp-mta
cat > /etc/msmtprc <<EOF;
account default
auth on
tls on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
host *******
port 587
from #######@govtrack.us
user *******
password *******
EOF

More Repositories

  • misconduct (Python, 48 stars): A database of misconduct and alleged misconduct by Members of the United States Congress.
  • django-registration-pv (Python, 24 stars): A reusable Django app for new user registration with support for OpenID/OAuth login.
  • boundaries_us (Python, 22 stars): ARCHIVE. A full Django deployment for represent-boundaries and represent-maps with definitions for U.S.-specific data files.
  • django-html-emailer (Python, 10 stars): A utility app for sending HTML emails in Django 1.7+.
  • advocacy-organization-scorecards (Python, 8 stars): Congressional Scorecards Data.
  • represent-maps (Python, 8 stars): ARCHIVED. Colorful map tile layers based on represent-boundaries.
  • phonecongress.com (Python, 6 stars): The website at phonecongress.com.
  • legacy-scrapers (Perl, 6 stars): ARCHIVAL - Old Perl scrapers for collecting U.S. legislative information.
  • impeachment.guide (HTML, 6 stars)
  • legislator-proxies (Python, 5 stars)
  • pronunciation (Python, 4 stars): Pronunciation Guide for Names of Members of Congress.
  • american-memory (Python, 3 stars): Metadata for the Library of Congress's American Memory site, curated and corrected for better sustainability and accessibility.
  • legacy-conversion (Python, 2 stars): ARCHIVE. Scripts for converting GovTrack to use data from @unitedstates.
  • django-lorien-common (Python, 2 stars): A fork of https://bitbucket.org/lorien/django-common by @lorien. Django-common provides useful shortcuts for developing django projects.
  • govtrack-pitch-deck (HTML, 2 stars)
  • django-simplegetapi (Python, 2 stars): The world needs an even simpler Django app for making a read-only API.
  • twostream (Python, 1 star): Django 1.7+ middleware that makes it easy to mark certain pages as being cachable at the HTTP server level while still being able to fetch user-specific content through AJAX.
  • civicimpulse.com (HTML, 1 star): The website at civicimpulse.com.
  • govtrack-insider (HTML, 1 star): The homepage for GovTrack Insider.
  • govtrack-website-2010ish (C#, 1 star): The GovTrack.us website from around 2008-2011 (.NET/Mono/XSLT framework and Perl map tile backend).
  • civic-impulse-llc (1 star): Documents related to Civic Impulse, LLC, the company behind GovTrack.us.
  • django-trackevents (Python, 1 star): An event subscription framework for Django used by @GovTrack.