  • Stars 2
  • Language Makefile
  • Created almost 5 years ago
  • Updated almost 5 years ago

Repository Details

collecting crime report data from cities that have it in a granular format

More Repositories

1

watson-word-watcher

A proof of concept using IBM's Speech-to-Text API to do quick-and-dirty transcriptions
Python
309
star
2

journalism-syllabi

Computer-Assisted Reporting and Data Journalism Syllabuses, compiled by Dan Nguyen
Python
165
star
3

abbyy-finereader-ocr-senate

Evaluating the performance and accuracy of ABBYY FineReader's OCR on Senate Financial Disclosure scanned forms
CSS
125
star
4

github-for-portfolios

A layperson's step-by-step guide to building webpages with Github
CSS
73
star
5

python-notebooks-data-wrangling

Python 3.x notebooks about real-world data cleaning and visualization
Jupyter Notebook
68
star
6

facebook-trending-rss-fetcher

Python code to scrape and collect data from the RSS feeds Facebook uses to augment its Trending Section
Python
56
star
7

smalldata_journalism

An online reference for data journalism
Ruby
25
star
8

learn-data-csv-cli

A work-in-progress guide showing how and why you should learn command-line tools (xsv, csvkit) to work with data
Python
19
star
9

bashfoo

My personally curated list of bash/command-line commands and snippets that are very useful yet I keep on forgetting
Python
18
star
10

datajournalism-primer

a general list of resources and articles for people interested in getting into data journalism
HTML
16
star
11

congress-colleges

What fancy schools do U.S. legislators go to?
HTML
15
star
12

gis-geospatial-fun-python3x

Tracking my progress in doing GIS/Geospatial work in Python 3.x
Jupyter Notebook
12
star
13

nicar-2019-pdfplumbing

NICAR 2019 workshop on using Python and PDFplumber to extract text from PDFs
Jupyter Notebook
12
star
14

Congressmiles

A tutorial on using Face.com's and NYT Congress's API + Sunlight data
Ruby
10
star
15

dannguyen.github.io

I'm making a Github Pages repo!
HTML
9
star
16

scrape-senate-financial-disclosures

looking at U.S. Senators' disclosures, including how to parse and track them
HTML
9
star
17

local-news-data

how hard is it to get a list of all local news sites in the United States (LOL)
Python
8
star
18

python-at-stanford

Python Courses at Stanford
8
star
19

NICAR-Google-Refine

The lesson and source files for Dan Nguyen's NICAR 2012 lesson on Google Refine
6
star
20

pdftotablestable

Comparing the programs that extract tabular data from PDFs, e.g. ABBYY FineReader, Tabula, CometDocs
6
star
21

house-financial-disclosures

Scraping House representative financial disclosures
Python
6
star
22

clinton-hillary-email-fbi-investigation-docs

OCR copy of the 2015-2016 FBI Investigation into Hillary Clinton's emails
6
star
23

pydataproject-template

dan's personal reference for properly creating an empty/fresh python-based data wrangling project
Python
5
star
24

padjo-2017-sql-exam

PADJO 2017 SQL Exam - Now with extra election and disbursement data!
Shell
5
star
25

aws-textract-pdf-to-csv-demo

Testing the new AWS Textract when it comes to extracting data tables from PDFs (pdf-to-csv) and whether it can deliver us from our endless torments
5
star
26

nhtsa-complaint-data

Some scripts/data description for NHTSA complaint data
5
star
27

quickdataproject-template

a template I use for quick data project examples where collection, wrangling, and exploration can be done by standalone shell/python scripts
Python
5
star
28

screencappy

A command-line tool for making it easier to create and save screenshots as a blogger
Python
4
star
29

dmv-vanity-plate-rejections

A repo of collected data and records from U.S. state DMVs regarding rejected vanity license plates
HTML
4
star
30

csvkitcat

csvkitcat has been archived (Oct. 2020), and is being carted over to csvmedkit
Python
4
star
31

frozen.analytics.usa.gov

A "frozen" version of https://analytics.usa.gov to practice network traffic inspection and web scraping
CSS
4
star
32

writhub

A simple Python-based static post generator, because I just need to post, not make an entire website
Python
4
star
33

journaling-on-github

My personal repo for doing quick journaling on Github with Markdown, plus some helper TOC scripts
Python
4
star
34

acp-2017-finding-stories-in-data

"How to Find Stories in Data" for the Associated Collegiate Press 2017 San Francisco Midwinter Convention
4
star
35

kfc-scrape

chicken
3
star
36

til

A simple static Jekyll blog of things I've learned, day-to-day, particularly in programming and data journalism
Ruby
3
star
37

altair-dataviz

Visualization in Python with the Altair library. Done in Jupyter Notebooks.
Jupyter Notebook
3
star
38

mechanical-unmurk-ocr

For the OCRing of scanned, murky documents where privacy, speed, accuracy, and cost are all priorities
3
star
39

seeing-is-beliebing

Instagram util for finding photos taken shortly before and after, and near where, another photo was taken
JavaScript
3
star
40

simplestuff-sqlite

A data/lesson repo teaching SQL syntax and concepts with a very simple SQLite database
Shell
3
star
41

smalldata

A list of small datasets for examples of exploration in spreadsheets
Python
3
star
42

cms_medicare_fee_data

Data notebook for CMS Medicare fee data
3
star
43

marktoc

A Python library for generating a table of contents and anchor markup for a Markdown file
Python
3
star
44

sf-shelter-waitlist-daily-snapshots

A compilation of daily snapshots of San Francisco's emergency shelter reservation wait-list during the COVID-19 pandemic
Python
3
star
45

seshkit

seshkit is a command-line tool for creating transcripts from audio files
Python
3
star
46

excsv

goofin around with a command-line utility for quickly inspecting CSV files
Python
3
star
47

merle

A command-line tool for getting meta information from a URL
Python
2
star
48

DepGal

Build out a gal using RMagick
JavaScript
2
star
49

csvviz

please i would like someday a tool that is like csvkit but for making charts from the command line
Python
2
star
50

supcli

supcli: my personal guide to the modern CLI, including third-party replacements for classic *nix tools
2
star
51

xkcd-on-reactjs

Just playing around with React.js to make a searchable xkcd archive
Ruby
2
star
52

yearbook

Ruby
2
star
53

ny-gis-cartodb-fun

Examples of GIS with New York data and CartoDB
2
star
54

sf-ethics-lobbyist-sql

A repo of San Francisco lobbyist data compiled into SQLite form, including data-handling scripts
Shell
2
star
55

emojicsv

Machine-readable emotions in machine-readable CSV
HTML
2
star
56

command-line-basics-mz2022

command line lessons for 2022 quickie repo
2
star
57

SCOTUS-Transcript-Viewer

A Backbone.js viewer of SCOTUS transcripts
JavaScript
2
star
58

Shakyspeare

Analyzing the Bard's work with Ruby!
2
star
59

death-data

2
star
60

bts-transstats-t100-domestic-demo

Demo of data processing for BTS transtats
2
star
61

middleman-meta-tags

Meta and SEO tag helpers for Middleman
Ruby
2
star
62

bashappy_helpers

A bunch of helper functions I wrote to use for my own macOS terminal convenience
Shell
2
star
63

air_skift

Air rails
Ruby
2
star
64

secdataexploring

fetching and exploring SEC structured data for fun
Python
2
star
65

dod-leso-1033-data

A repo for collecting data/records regarding the Defense Logistics
Python
2
star
66

matplotlib-styling-tutorial

A quick iPython notebook showing how to create and style Matplotlib charts with roughly the same flexibility as ggplot2
Jupyter Notebook
2
star
67

texas-state-salaries

playing around with texas state salary data courtesy of the Texas Tribune
Python
2
star
68

healthcare.gov

A copy of healthcare.gov when it was built on Jekyll, before they removed the source code
JavaScript
2
star
69

jekyll-datasite-template

Trying to make a template that scaffolds a basic jekyll site with bootstrap and vendor d3v5
JavaScript
2
star
70

pgark

pgark (page archiver): Python library and CLI for archiving URLs on popular services like Wayback Machine [alpha, just spitballing]
Python
2
star
71

nature-inspired-algorithms-in-python

Going through Jason Brownlee's "Clever Algorithms: Nature-Inspired Programming Recipes" http://cleveralgorithms.com/nature-inspired/stochastic/random_search.html
Python
1
star
72

lookups-of-note

Lookup tables and data references
1
star
73

censusscout

making my own lightweight version of Census Explorer because y not
JavaScript
1
star
74

motherfuckingwebdesignguide

just do it
1
star
75

foodscrape

A demonstration of scraping health inspection websites and doing statistical analysis
1
star
76

nicar-2019-github-intro

Intro to git and github for journalists
Makefile
1
star
77

Sinatra-Fun

Testing out sinatra
Ruby
1
star
78

jekyll-bootstrap-starter

a basic jekyll theme that sits atop Bootstrap 4.x. For my convenience only
HTML
1
star
79

data-wrangling-fakebook

The Little Data Wrangling Fakebook
Python
1
star
80

foiastories

a curated list of interesting foia/foil requests
1
star
81

astronautdata

A repo of astronaut data
HTML
1
star
82

danssphinx-template

This is a bunch of examples of things I forget how to do in Sphinx and reST
Python
1
star
83

sql2md

A bash script for converting a SQLite query into Markdown-ready, pastable results
Shell
1
star
84

poynter-census-data-2019

Poynter Census Data Workshop 2019, using Sphinx-hieroglyph slidemaker
Python
1
star
85

stanford-public-affairs-data-journalism

1
star
86

sf-evictions

just collecting san francisco evictions data
Python
1
star
87

d3choro-template

yaddaydaydayda
CSS
1
star
88

merde

Shit
1
star
89

digital-jo-2017

Quickie repo for digital journalism notes for stanford journalism 2017
1
star
90

twitkit

yet another attempt at making a personal twitter data exploration command-line tool
Python
1
star
91

wire-glossary

the fuck did I do
Ruby
1
star
92

high-charty

JavaScript
1
star
93

wikipedia-trends

1
star
94

revelecture

A command-line tool to turn Markdown files into Reveal.js powered slideshows
JavaScript
1
star
95

hello-svelte

need to practice this javascript thing
HTML
1
star
96

ok-earthquakes-RNotebook

Using R's ggplot2 and rgdal to examine earthquake activity in Oklahoma
R
1
star
97

fatal-encounters-and-census-sql

SQLite database exercises for analyzing Fatal Encounters (police officer involved homicides) and Census data
Shell
1
star
98

python-audio-playtime

experimenting with Python audio visualizers and extraction libraries
1
star
99

scrapespeare

A collection of The Bard's text for basic programming exercises and data mining.
XSLT
1
star
100

twitch-stream-exploring-ppp-with-cli

Just some notes and data and files for a twitch stream on how to data wrangle the PPP loan data
1
star