Linguistic Inquiry and Word Count (LIWC) analyzer

liwc

This repository is a Python package implementing two basic functions:

  1. Loading (parsing) a Linguistic Inquiry and Word Count (LIWC) dictionary from the .dic file format.
  2. Using that dictionary to count category matches on provided texts.

This is not an official LIWC product nor is it in any way affiliated with the LIWC development team or Receptiviti.

Obtaining LIWC

The LIWC lexicon is proprietary, so it is not included in this repository.

The lexicon data can be acquired (purchased) from liwc.net.

  • If you are a researcher at an academic institution, please contact Dr. James W. Pennebaker directly.
  • For commercial use, contact Receptiviti, the company that holds the exclusive commercial license.

Finally, please do not open an issue in this repository with the intent of subverting encryption implemented by the LIWC developers. If the version of LIWC that you purchased (or otherwise legitimately obtained as a researcher at an academic institution) does not provide a machine-readable *.dic file, please contact the distributor directly.

Setup

Install from PyPI:

pip install liwc

Example

This example reads the LIWC dictionary from a file named LIWC2007_English100131.dic, which looks like this:

%
1   funct
2   pronoun
[...]
%
a   1   10
abdomen*    146 147
about   1   16  17
[...]
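
To make the two-section layout concrete, here is a minimal sketch of how a file in this shape could be read. It is not the code this package uses: `read_dic`, `match`, and the toy data are hypothetical names and values introduced for illustration, and treating a trailing `*` as a prefix wildcard is an assumption based on the `abdomen*` entry above.

```python
def read_dic(lines):
    """Read a '%'-delimited category table followed by word entries."""
    categories = {}  # category id (string) -> category name
    lexicon = {}     # term, possibly ending in '*' -> list of category names
    section = 0      # 0: before the first '%', 1: category table, 2: word entries
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == '%':
            section += 1
            continue
        fields = line.split()
        if section == 1:
            categories[fields[0]] = fields[1]
        elif section == 2:
            lexicon[fields[0]] = [categories[cid] for cid in fields[1:]]
    return categories, lexicon


def match(lexicon, token):
    """Return category names for an exact hit or a 'prefix*' hit (assumed semantics)."""
    found = list(lexicon.get(token, []))
    for term, names in lexicon.items():
        if term.endswith('*') and token.startswith(term[:-1]):
            found.extend(names)
    return found


# Toy data in the same layout as the excerpt above; ids and names are made up.
toy_dic = """%
1   funct
2   pronoun
3   body
%
a   1
abdomen*    3
we  1   2
""".splitlines()

categories, lexicon = read_dic(toy_dic)
print(match(lexicon, 'we'))
print(match(lexicon, 'abdomens'))
```

The linear scan over wildcard terms keeps the sketch short; a real lexicon with thousands of entries would probably want a prefix tree or similar structure instead.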

Loading the lexicon

import liwc
parse, category_names = liwc.load_token_parser('LIWC2007_English100131.dic')
  • parse is a function from a token of text (a string) to a list of matching LIWC categories (a list of strings)
  • category_names contains the names of all LIWC categories in the lexicon (a list of strings)

Analyzing text

import re

def tokenize(text):
    # you may want to use a smarter tokenizer
    for match in re.finditer(r'\w+', text, re.UNICODE):
        yield match.group(0)

gettysburg = '''Four score and seven years ago our fathers brought forth on
  this continent a new nation, conceived in liberty, and dedicated to the
  proposition that all men are created equal. Now we are engaged in a great
  civil war, testing whether that nation, or any nation so conceived and so
  dedicated, can long endure. We are met on a great battlefield of that war.
  We have come to dedicate a portion of that field, as a final resting place
  for those who here gave their lives that that nation might live. It is
  altogether fitting and proper that we should do this.'''.lower()
gettysburg_tokens = tokenize(gettysburg)

Now, count all the categories in all of the tokens, and print the results:

from collections import Counter
gettysburg_counts = Counter(category for token in gettysburg_tokens for category in parse(token))
print(gettysburg_counts)
#=> Counter({'funct': 58, 'pronoun': 18, 'cogmech': 17, ...})
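
Because `tokenize` is a generator function, `gettysburg_tokens` can be consumed only once: after the `Counter` is built, a second pass over it yields nothing. The sketch below materializes the tokens in a list so they can be reused, for example to turn raw counts into per-token proportions. `toy_parse` is a hypothetical stand-in for the `parse` function returned by `liwc.load_token_parser`, and its tiny lexicon is made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Same simple word tokenizer as in the example above.
    for match in re.finditer(r'\w+', text):
        yield match.group(0)


# Hypothetical stand-in lexicon; real category assignments come from the .dic file.
TOY_LEXICON = {'we': ['funct', 'pronoun'], 'are': ['funct'], 'that': ['funct']}


def toy_parse(token):
    # Mimics the shape of parse(token): token in, category names out.
    return TOY_LEXICON.get(token, [])


text = 'we are met on a great battlefield of that war'
tokens = list(tokenize(text))  # a list can be iterated more than once
counts = Counter(category for token in tokens for category in toy_parse(token))
proportions = {category: n / len(tokens) for category, n in counts.items()}
print(counts)
print(proportions)
```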

N.B.:

  • The LIWC lexicon only matches lowercase strings, so you will most likely want to lowercase your input text before passing it to parse(...). In the example above, I call .lower() on the entire string, but you could alternatively incorporate that into your tokenization process (e.g., by using spaCy's token.lower_).
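
As a minimal sketch of folding the lowercasing into tokenization, `tokenize_lower` below is a hypothetical variant of the `tokenize` function above that lowercases each token as it is yielded, so the input string does not need to be lowercased beforehand.

```python
import re


def tokenize_lower(text):
    # Yield each word-like token in lowercase, ready to pass to parse(...).
    for match in re.finditer(r'\w+', text):
        yield match.group(0).lower()


print(list(tokenize_lower('Four score AND seven')))
```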

License

Copyright (c) 2012-2020 Christopher Brown. MIT Licensed.
