text-matcher
A simple text reuse detection CLI tool.
chapterize
A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books for computational text analysis.
course-computational-literary-analysis
Course materials for Introduction to Computational Literary Analysis, taught at UC Berkeley in Summer 2018, 2019, and 2020, at Columbia University in Fall 2020, and again at UC Berkeley in Summer 2021 and 2022.
workshop-text-analysis-spacy
Materials for the workshop Advanced Text Analysis with SpaCy and Scikit-Learn, given at NYU during NYCDH Week 2017, at PyData NYC in Nov. 2017, and at Columbia University in 2018 and 2019.
corpus-db
A textual corpus database for the digital humanities.
dotfiles
My personal dotfiles, using Nix Flakes to configure my system(s).
macro-etym
A tool for analyzing the word histories of a text.
gitenberg-experiments
Scripts for scraping metadata from Project Gutenberg books, via GITenberg.
corpus-list
A structured list of text corpora, created for use with a corpus downloader.
late-style-PCA
An attempt to experimentally test Edward Said's claims about late style using computational text analysis and principal component analysis.
allusion-detection
Computational intertextuality detection in Python. Fuzzy string matching, approximate string matching.
cenlab
A corpus of English-language novels combining the ~250 novels of the Corpus of English Novels with the Txtlab corpus of English novels.
md2mla
A script and accompanying templates to make an MLA-style paper from a markdown file. Requires Pandoc and LaTeX (xetex).
milton-analysis
Text analysis of Paradise Lost and other poems by John Milton.
workshop-word-embeddings
Materials for a workshop in word embeddings, for NYC-DH Week, February 2019.
workshop-dataviz-2017
An Introduction to Text Analysis and Visualization, Art of Data Visualization Week, April 2017, Columbia University.
dissertation
A dissertation in computational literary analysis, called "The Eye of Modernism: Visual Imaginations of British literature, 1880-1930".
template-research-paper
A template for a research paper, which compiles to many file formats.
template-dissertation
A template for a modern, best-practices dissertation.
jonreeve.com
My personal website, jonreeve.com, written in Haskell, using Ema.
course-cic-compling
Course materials for the course Computing in Context section in Computational Linguistics. Dept. of Computer Science, Columbia University, Fall 2021. Work-in-progress.
course-computational-literary-analysis-readings
Syllabus and course readings for Introduction to Computational Literary Analysis, a course taught at UC-Berkeley in Summer 2018, 2019, and 2020, and at Columbia University in Fall 2020.
shakespeare-dialog-extractor
An application to extract dialog from Shakespeare plays, as encoded into TEI by the Folger Library.
book-computational-literary-analysis
A textbook for the course, Introduction to Computational Literary Analysis. WIP.
docmap
A project for creating new themes and customization functionality for the Omeka content management system.
conference-joyce-digital
Website and materials for the conference Joyce in the Digital Age, held at Columbia University on October 1st, 2017.
free-indirect-discourse-model
Modeling free indirect discourse in literature, using AI.
plato-analysis
Analyses of Platonic dialogues, including a Socratic dialogue generator.
course-word-embeddings
Course materials for "Meaningful Text Analysis with Word Embeddings," taught at the Digital Humanities Summer Institute, June 2021.
text-to-time-series
Experiments in text analysis, generating time series from texts.
occupations-experiment
Experiments in quantifying occupations as they're represented in fiction.
template-course-website
A website for a university course. Semantic by default.
sops
Research materials (literature review, bibliography) for the project A Safer Online Public Square.
dissertation-prospectus
My ever-protean dissertation prospectus.
character-attribution
Probabilistic attribution of character voices in fiction.
htrc-experiments
Text analysis experiments with Hathi Trust Research Center literary datasets.
corpus-SHC
A fork of Martin Mueller's Shakespeare His Contemporaries corpus, originally located at https://github.com/martinmueller39/SHC, divided into submodules as an experiment.
html2tei
A tool to extract structured data from novels (starting with Project Gutenberg HTML files).
sent2tree
Alternative visualizations for SpaCy-parsed sentences, using ETE3.
sentence-trees
Experiments with sentences as trees.
pg-srp
Stable Random Projections (SRP) of Project Gutenberg texts, for similarity tests.
course-data-ethics
Draft syllabus for a course in data science ethics. WIP.
course-nyu-pit
Course materials for the New York University Institute in Public Interest Technology (NYU-PIT).
org-autolinks-mode
An emacs minor mode for automatically linking to org files, after typing the name of the file.
hs-tei-transform
Experiments in transforming TEI XML, using Haskell.
workshop-intro-haskell
An introduction to functional programming in Haskell. A workshop given in October 2020 at Columbia University.
david-copperfield
An annotated edition of David Copperfield.
dataviz-workshop
Materials for a workshop in text analysis and visualization, originally given at Columbia University in April 2016.
chaucer-macro-etym
Macro-etymological analyses of the Canterbury Tales.
corpus-mansfield-garden-party-TEI
A TEI edition of Katherine Mansfield's short story "The Garden Party."
persistent-homology
Experiments with NLP and persistent homology.
course-university-writing
Draft materials for the course "University Writing with Readings in the Data Sciences," taught at Columbia University in the fall of 2017. Students, please refer to CourseWorks instead of this repository.
course-multilingual-technologies
Course website for Multilingual Technologies and Language Diversity, taught at Columbia University by Prof. Smaranda Muresan and Dr. Isabelle Zaugg.