• Stars
    21
  • Rank 1,084,038 (Top 22%)
  • Language
    Python
  • License
    MIT License
  • Created over 9 years ago
  • Updated almost 5 years ago

Repository Details

Project on the history of genre.

More Repositories

1

DataMunging

Scripts that clean up OCR and munge Hathi metadata.
Python
69
star
2

fictional-time-with-GPT4

An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4.
Jupyter Notebook
30
star
3

noveltmmeta

Code and data supporting "NovelTM Data Sets for English-Language Fiction."
Jupyter Notebook
22
star
4

paceofchange

Code and data to support the article, "How quickly do literary standards change?"
Python
21
star
5

LIS590DSH

Jupyter Notebook
19
star
6

BrowseLDA

R scripts that browse the results of LDA
R
19
star
7

character

Data and code for analyzing language associated with fictional characters.
Jupyter Notebook
15
star
8

genredistance

Exploring textual and social measures of distance between genres.
Jupyter Notebook
14
star
9

LDA

A Java package that does basic LDA, without hyperparameter optimization. Folder settings are local. YMMV.
Java
13
star
10

plot

Initial exploratory research on patterns of change across narrative time.
Jupyter Notebook
10
star
11

genre

Code for "Understanding Genre in a Collection of a Million Volumes."
HTML
10
star
12

horizon

Data and code to support Distant Horizons (University of Chicago Press, 2019).
Jupyter Notebook
10
star
13

nehuncertainty

Code used in "Broadening Access to Text Analysis by Describing Uncertainty."
Jupyter Notebook
7
star
14

period-cohort

Code and data for an experiment on the relation between individual change and cohort succession in literary history.
HTML
6
star
15

bayes-bestsellers

Code and data to support "Bestsellers and Critical Favorites 1850-1949," a paper at CA2017.
Jupyter Notebook
6
star
16

reviews

Parsing periodical indexes and finding book reviews, 1800-2007.
Python
5
star
17

is417

IS 417, Data Science in the Humanities.
Jupyter Notebook
5
star
18

changepoint

Measuring the scale and significance of changes *in the pace of change* in an auto-correlated multivariate time series.
Jupyter Notebook
5
star
19

ocreval

Python modules that evaluate OCR quality.
Python
5
star
20

badpublicity

A presentation at MLA 2020 in Seattle, "No Such Thing as Bad Publicity: Toward a Distant Reading of Reception."
Python
5
star
21

hathimetadata

Metadata for English-language fiction and poetry beyond 1923 in HathiTrust Digital Library.
Python
4
star
22

Java-OCR-spellchecking.

Java
4
star
23

Tokenizer

Python scripts for tokenizing text files
Python
4
star
24

measureperspective

Code and data to support "Machine Learning and Human Perspective."
Jupyter Notebook
4
star
25

Parallel-LDA

Java package that partitions a corpus and runs LDA in parallel on it
Java
3
star
26

riseandfall

Code and data supporting "The Rise and Fall of Genre Differentiation in English-Language Fiction."
Python
3
star
27

moments

Data and code to support "Why Is Literary Time Measured in Minutes?"
Jupyter Notebook
3
star
28

meta2018

A temporary workspace for NovelTM metadata reviewed and analyzed in summer 2018.
Jupyter Notebook
3
star
29

GenreProject

Code and documentation associated with "Understanding Genre in a Collection of a Million Volumes"
Python
3
star
30

JDH-scripts

R
2
star
31

collator

Python scripts for collating HathiTrust page files.
Python
2
star
32

pmla-scripts

Data for the 1924-2006 PMLA model, plus scripts to turn it into a Gephi network.
R
2
star
33

noise

Data and code for measuring consequences of noise in digital libraries.
Python
2
star
34

asymmetry

Research on information-theoretic asymmetries in literary history.
Jupyter Notebook
2
star
35

avant

Was the avant-garde really ahead of its time?
Jupyter Notebook
2
star
36

oralarg

Code and results related to oral argument in the Supreme Court. Work in progress: Tonja Jacobi, Matthew Sag, and Ted Underwood.
Jupyter Notebook
1
star
37

overlappingcategories

Python 3 code for training models in a multilabel environment where classes overlap. Based on code in the fiction repo, but with bug fixes and improvements.
Python
1
star
38

Tokenize

Folder storing current rulesets, scripts, and metadata for tokenizing and collection building.
Python
1
star
39

pages

Java code for mapping genres at the page level in a large collection. Originally based on pagelevelHMM.
Java
1
star
40

20cgenres

Code and data used for page-level mapping of literary genres beyond 1923.
Python
1
star
41

roles

Code for a topic modeling variant that allows for character-level 'roles' as well as book-level 'themes.'
Python
1
star
42

time

Further research on narrative pace.
Jupyter Notebook
1
star
43

metadatapredictor

Java code that uses existing metadata to train classifiers that then make predictions for cases where metadata is missing / suspected.
Java
1
star