• This repository has been archived on 12/Jul/2021
  • Stars: 231
  • Rank 173,434 (Top 4 %)
  • Language: Python
  • License: MIT License
  • Created over 14 years ago
  • Updated about 11 years ago

Repository Details

A Python version (almost a port) of ProPublica's TableFu

Python TableFu is a tool for manipulating spreadsheet-like tables in Python. It began as a Python implementation of ProPublica's TableFu, though new methods have been added. TableFu supports filtering, faceting and manipulating data. Going forward, the project aims to create something akin to an ORM for spreadsheets.

Usage:

>>> from table_fu import TableFu
>>> table = TableFu.from_file('tests/test.csv')
>>> table.columns
['Author', 'Best Book', 'Number of Pages', 'Style']

# get all authors
>>> table.values('Author')
['Samuel Beckett', 'James Joyce', 'Nicholson Baker', 'Vladimir Sorokin']

# total a column
>>> table.total('Number of Pages')
1177.0

# filtering a table returns a new instance
>>> t2 = table.filter(Style='Modernism')
>>> list(t2)
[<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
 <Row: James Joyce, Ulysses, 644, Modernism>]


# each TableFu instance acts like a list of rows
>>> table[0]
<Row: Samuel Beckett, Malone Muert, 120, Modernism>

>>> list(table.rows)
[<Row: Samuel Beckett, Malone Muert, 120, Modernism>,
 <Row: James Joyce, Ulysses, 644, Modernism>,
 <Row: Nicholson Baker, Mezannine, 150, Minimalism>,
 <Row: Vladimir Sorokin, The Queue, 263, Satire>]

# rows, in turn, act like dictionaries
>>> row = table[1]
>>> print(row['Author'])
James Joyce

# transpose a table
>>> t2 = table.transpose()
>>> list(t2)
[<Row: Best Book, Malone Muert, Ulysses, Mezannine, The Queue>,
 <Row: Number of Pages, 120, 644, 150, 263>,
 <Row: Style, Modernism, Modernism, Minimalism, Satire>]

>>> t2.columns
['Author',
 'Samuel Beckett',
 'James Joyce',
 'Nicholson Baker',
 'Vladimir Sorokin']

# sort rows
>>> table.sort('Author')
>>> table.rows
[<Row: James Joyce, Ulysses, 644, Modernism>,
 <Row: Nicholson Baker, Mezannine, 150, Minimalism>,
 <Row: Samuel Beckett, Malone Muert, 120, Modernism>,
 <Row: Vladimir Sorokin, The Queue, 263, Satire>]

# sorting is stored
>>> table.options['sorted_by']
{'Author': {'reverse': False}}

# which is handy because...

# tables can also be faceted (and options are copied to new tables)
>>> for t in table.facet_by('Style'):
...     print(t.faceted_on)
...     t.table
Minimalism
[['Nicholson Baker', 'Mezannine', '150', 'Minimalism']]
Modernism
[['Samuel Beckett', 'Malone Muert', '120', 'Modernism'],
 ['James Joyce', 'Ulysses', '644', 'Modernism']]
Satire
[['Vladimir Sorokin', 'The Queue', '263', 'Satire']]
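
To make the faceting and transposing steps above concrete, here is a minimal sketch in plain Python that reproduces the same results on the sample data without using table_fu. The helper names `facet_by` and `transpose` are hypothetical stand-ins written for this illustration; they are not the library's implementation, and sorting the facets by value is an assumption based on the order of the output shown above.

```python
# Sample data copied from the usage example above.
columns = ['Author', 'Best Book', 'Number of Pages', 'Style']
rows = [
    ['Samuel Beckett', 'Malone Muert', '120', 'Modernism'],
    ['James Joyce', 'Ulysses', '644', 'Modernism'],
    ['Nicholson Baker', 'Mezannine', '150', 'Minimalism'],
    ['Vladimir Sorokin', 'The Queue', '263', 'Satire'],
]


def facet_by(rows, columns, column):
    """Group rows by the value in `column` (hypothetical helper).

    Returns a list of (value, rows) pairs sorted by value.
    """
    index = columns.index(column)
    groups = {}
    for row in rows:
        groups.setdefault(row[index], []).append(row)
    return sorted(groups.items())


def transpose(rows, columns):
    """Swap rows and columns (hypothetical helper).

    The header row is treated as the first line of the grid, so the
    first original column becomes the new header.
    """
    grid = [columns] + rows
    flipped = [list(line) for line in zip(*grid)]
    return flipped[0], flipped[1:]


facets = facet_by(rows, columns, 'Style')
new_columns, new_rows = transpose(rows, columns)

for value, group in facets:
    print(value, len(group))
print(new_columns)
```

Each facet here is a plain list of rows; in TableFu each facet is a new table that also carries over options such as the stored sort order.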

Here's an advanced example that uses faceting and filtering to produce aggregates from this spreadsheet (extracted from the New York Times Congress API).

Formatting

Filters are just functions that take a value and some number of positional arguments. New filters can be registered with the included Formatter class.

>>> from table_fu.formatting import Formatter
>>> format = Formatter()
>>> def capitalize(value, *args):
...     return str(value).capitalize()
>>> format.register(capitalize)
>>> print(format('foo', 'capitalize'))
Foo
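
The idea that a filter is a function of a value plus some positional arguments can be sketched without the library. The `FilterRegistry` class and the `truncate` filter below are hypothetical and exist only for illustration; they mimic the register-then-call pattern shown above and make no claim about how `Formatter` is actually implemented.

```python
class FilterRegistry:
    """Hypothetical stand-in that maps filter names to plain functions."""

    def __init__(self):
        self._filters = {}

    def register(self, func, name=None):
        # Default to the function's own name, as in the example above.
        self._filters[name or func.__name__] = func
        return func

    def __call__(self, value, name, *args):
        # Look up the filter by name and pass along any positional arguments.
        return self._filters[name](value, *args)


def capitalize(value, *args):
    return str(value).capitalize()


def truncate(value, length, *args):
    """Cut the value down to `length` characters (illustrative filter)."""
    return str(value)[:int(length)]


fmt = FilterRegistry()
fmt.register(capitalize)
fmt.register(truncate)

print(fmt('foo', 'capitalize'))
print(fmt('spreadsheet', 'truncate', 6))
```

Accepting `*args` in every filter keeps the calling convention uniform, so a filter that needs no arguments and one that needs several can be invoked the same way.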

Cells can be formatted according to rules of the table (which carry over if the table is faceted):

>>> table = TableFu(open('tests/sites.csv'))
>>> table.columns
['Name', 'URL', 'About']
>>> table.formatting = {
... 'Name': {'filter': 'link', 'args': ['URL']}
... }
>>> print(table[0]['Name'])
<a href="http://www.chrisamico.com" title="ChrisAmico.com">ChrisAmico.com</a>
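
One way to read the formatting rule above is that each name in `args` refers to another column in the same row, and that column's value is passed to the filter. The sketch below works under that assumption, which is inferred from the output shown and not confirmed against the library. The `link` function and `format_cell` helper are hypothetical.

```python
from html import escape


def link(title, url):
    """Hypothetical filter: wrap `title` in an anchor pointing at `url`."""
    return '<a href="{0}" title="{1}">{1}</a>'.format(escape(url), escape(title))


# Name-to-function lookup, standing in for a registry of filters.
FILTERS = {'link': link}


def format_cell(row, column, formatting):
    """Apply the formatting rule for `column`, if there is one."""
    rule = formatting.get(column)
    if rule is None:
        return row[column]
    # Assumption: each entry in 'args' names a column in the same row.
    args = [row[name] for name in rule.get('args', [])]
    return FILTERS[rule['filter']](row[column], *args)


row = {
    'Name': 'ChrisAmico.com',
    'URL': 'http://www.chrisamico.com',
    'About': 'My personal site and blog',
}
formatting = {'Name': {'filter': 'link', 'args': ['URL']}}

print(format_cell(row, 'Name', formatting))
print(format_cell(row, 'About', formatting))
```

Columns without a rule pass through unchanged, which is why only `Name` comes back as a link.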

HTML Output

TableFu can output an HTML table, using formatting you specify:

>>> table = TableFu(open('tests/sites.csv'))
>>> table.columns
['Name', 'URL', 'About']
>>> table.formatting = {'Name': {'filter': 'link', 'args': ['URL']}}
>>> table.columns = 'Name', 'About'
>>> print(table.html())
<table>
<thead>
<tr><th>Name</th><th>About</th></tr>
</thead>
<tbody>
<tr id="row0" class="row even"><td class="datum"><a href="http://www.chrisamico.com" title="ChrisAmico.com">ChrisAmico.com</a></td><td class="datum">My personal site and blog</td></tr>
<tr id="row1" class="row odd"><td class="datum"><a href="http://www.propublica.org" title="ProPublica">ProPublica</a></td><td class="datum">Builders of the Ruby version of this library</td></tr>
<tr id="row2" class="row even"><td class="datum"><a href="http://www.pbs.org/newshour" title="PBS NewsHour">PBS NewsHour</a></td><td class="datum">Where I spend my days</td></tr>
</tbody>
</table>
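
The markup above follows a regular pattern: one header row, then one body row per record with a sequential `id` and alternating `even`/`odd` classes. A rough sketch of that pattern in plain Python follows. The `render_html` function is hypothetical and handles plain-text cells only (it escapes every value, so it would not pass through pre-built links); it is not the library's `html()` method.

```python
from html import escape


def render_html(columns, rows):
    """Render dict rows as an HTML table (hypothetical helper)."""
    lines = ['<table>', '<thead>']
    header = ''.join('<th>{}</th>'.format(escape(c)) for c in columns)
    lines.append('<tr>{}</tr>'.format(header))
    lines += ['</thead>', '<tbody>']
    for i, row in enumerate(rows):
        # Rows are numbered from zero, so the first row is "even".
        parity = 'even' if i % 2 == 0 else 'odd'
        cells = ''.join(
            '<td class="datum">{}</td>'.format(escape(str(row[c])))
            for c in columns
        )
        lines.append('<tr id="row{}" class="row {}">{}</tr>'.format(i, parity, cells))
    lines += ['</tbody>', '</table>']
    return '\n'.join(lines)


rows = [
    {'Name': 'ChrisAmico.com', 'About': 'My personal site and blog'},
    {'Name': 'ProPublica', 'About': 'Builders of the Ruby version of this library'},
    {'Name': 'PBS NewsHour', 'About': 'Where I spend my days'},
]

html = render_html(['Name', 'About'], rows)
print(html)
```

Passing only the columns to display mirrors the `table.columns = 'Name', 'About'` step above, where the `URL` column is dropped from the output.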
