  • Stars: 111
  • Rank: 314,510 (Top 7 %)
  • Language: Jupyter Notebook
  • License: Apache License 2.0
  • Created about 8 years ago
  • Updated 7 months ago

Repository Details

Python tutorials as Jupyter Notebooks for NLP, ML, AI

Python Tutorials for NLP, ML, AI

(C) 2016-2024 by Damir Cavar

NLP-Lab at Indiana University.

Notebooks

NLTK Notebooks

spaCy Notebooks

See the licensing details on the individual documents and in the LICENSE file in the code folder.

Introduction

The files in this folder are Jupyter-based tutorials for NLP, ML, AI in Python for classes I teach in Computational Linguistics, Natural Language Processing (NLP), Machine Learning (ML), and Artificial Intelligence (AI) at Indiana University.

If you find this material useful, please cite the author and source (that is, Damir Cavar and all the sources cited in the relevant notebooks). Please let me know if you have suggestions on how to correct the notebooks, improve them, or add material and explanations.

The instructions below are somewhat outdated. I now use only JupyterLab. Follow the instructions here to set it up on different machine types and operating systems.

To run this material in Jupyter you need to have Python 3.x and Jupyter installed. You can save yourself some trouble by using the Anaconda Python 3.x distribution.
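Before going further, a quick check of these prerequisites can be sketched as follows. This is a minimal sketch, not part of the original tutorial: the function name check_environment is hypothetical, and the module names jupyter_core and notebook are assumed to be the importable packages behind a typical Jupyter installation.

```python
import importlib.util
import sys


def check_environment(required=("jupyter_core", "notebook")):
    """Return (python_ok, missing), where missing lists modules that cannot be found.

    The default module names are assumptions about a typical Jupyter install.
    """
    python_ok = sys.version_info >= (3,)
    missing = [name for name in required
               if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    ok, missing = check_environment()
    print("Python 3.x:", "yes" if ok else "no")
    print("Missing modules:", ", ".join(missing) or "none")
```

If any module is reported as missing, installing Jupyter as described in the sections that follow (or using the Anaconda distribution) should resolve it.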

Clone the project folder using:

git clone https://github.com/dcavar/python-tutorial-for-ipython.git

Some of the notebooks may contain code that requires specific versions of various Python modules. Some of these installations might be complicated and problematic. I am working on a more detailed description of installation procedures and dependencies for each notebook. Stay tuned, this is coming soon.
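Until that description is available, one way to see which versions are currently installed is sketched below. The helper installed_versions is hypothetical, and the package list (nltk, spacy, pandas) is only an illustration based on modules named in this document, not a statement of what any particular notebook requires.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Illustrative package list; adjust it to the notebook you want to run.
    for name, found in installed_versions(["nltk", "spacy", "pandas"]).items():
        print(f"{name}: {found or 'not installed'}")
```

Comparing this output with the imports at the top of a notebook gives a first hint about what still needs to be installed or pinned.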

Installing Jupyter

Jupyter is a great tool for computational publications, tutorials, and exercises. I set up my favorite components for Jupyter on Linux (for example Ubuntu) this way:

Assuming that some development tools such as gcc and make are already installed, I install the packages python3-pip and python3-dev:

sudo apt install python3-pip python3-dev

After that I update the global system version of pip to the newest version:

sudo -H pip3 install -U pip

Then I install the newest Jupyter and JupyterLab modules globally, updating any previously installed version:

sudo -H pip3 install -U jupyter jupyterlab

The module that we should not forget is plotly:

sudo -H pip3 install -U plotly

Scala, Clojure, and Groovy are interesting languages as well, and I love working with Apache Spark, so I also install BeakerX. BeakerX requires two other Python modules, py4j and pandas, and presupposes that a Java JDK (version 8 or newer) is already installed on the system. I install all the BeakerX-related packages:

sudo -H pip3 install -U py4j
sudo -H pip3 install -U pandas
sudo -H pip3 install -U beakerx

To configure and install all BeakerX components I run:

sudo -H beakerx install

Some of the components I like to use require Node.js. On Ubuntu I usually add the newest Node.js as a PPA rather than via Ubuntu Snap. Some instructions on how to achieve that can be found here. To install Node.js on Ubuntu simply run:

sudo apt install nodejs
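The two external prerequisites mentioned so far, a Java JDK for BeakerX and Node.js for the JupyterLab extensions, can be checked from Python with a small sketch like the one below. The helper find_tools is hypothetical, and the executable names java and node are assumptions; on some systems the Node.js binary may be installed under a different name.

```python
import shutil


def find_tools(names=("java", "node")):
    """Map each executable name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

This only confirms that the executables can be found; it does not verify their versions.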

The following commands will add plugins and extensions to Jupyter globally:

sudo -H jupyter labextension install @jupyter-widgets/jupyterlab-manager
sudo -H jupyter labextension install @jupyterlab/plotly-extension
sudo -H jupyter labextension install beakerx-jupyterlab

Another useful package is Voilà, which allows you to turn Jupyter notebooks into standalone web applications. I install it using:

sudo -H pip3 install voila

Now the initial version of the platform is ready to go.

To start the Jupyter notebook viewer/editor on your local machine, change into the notebooks folder within the cloned project folder and run the following command:

jupyter notebook

A browser window should open up that allows you full access to the notebooks.

Alternatively, check out the instructions on how to launch JupyterLab, BeakerX, etc.

Enjoy!

Damir

More Repositories

1. nlp-lab.github.io (HTML, 20 stars): NLP Lab Website
2. spaCy-JSON-NLP (Python, 12 stars): spaCy wrapper for JSON-NLP.
3. Flair-JSON-NLP (Python, 11 stars): Flair wrapper for JSON-NLP.
4. ELAN2split (C++, 10 stars): Split ELAN Annotation Files and corresponding speech files into a corpus format for common ASR and Forced Aligners
5. speechsignal (CSS, 10 stars): Speech signal processing, supra-segmental phonology, prosody
6. q (Jupyter Notebook, 8 stars): Quantum Computing and Algorithms
7. Py-JSON-NLP (Python, 8 stars): Python module for JSON-NLP
8. fomaMWT (C++, 7 stars): Foma-based multi-word tagger and morphological analyzer
9. PyFST (Python, 7 stars): Python 3 Finite State Weighted Transducer Library
10. schemeNLP (Scheme, 7 stars): Scheme code for computational linguistics, natural language processing, corpus analysis taught at ESSLLI long time ago
11. SNLTK (Scheme, 6 stars): Scheme Natural Language Toolkit (www.snltk.org) files with examples and teaching material.
12. fomaJNI (C++, 6 stars): A Java JNI interface for Foma (a Finite State Transducer compiler for NLP)
13. NLTK-JSON-NLP (Python, 5 stars): NLTK wrapper to JSON-NLP.
14. AntisemitismDatathon2020 (5 stars): This is project material for the Antisemitism Datathon and Hackathon 2020 at Indiana University
15. ESSLLI24_LLM_KG.github.io (Jupyter Notebook, 5 stars): ESSLLI 2024 Course on Large Language Models, Knowledge, and Reasoning - Generative AI and Symbolic Knowledge Representations
16. sociallenseonline.github.io (4 stars): social-lense.online website.
17. Polyglot-JSON-NLP (Python, 4 stars): Polyglot wrapper for JSON-NLP.
18. juliaFoma (Julia, 4 stars): Julia NLP with Foma: Finite State Transducer for Morphological Analysis
19. Xrenner-JSON-NLP (Python, 4 stars): Xrenner wrapper for JSON-NLP.
20. fle (C++, 4 stars): Free Linguistic Environment
21. Mueller_Report_NLP_Analysis (4 stars): This repo contains the data and results of our NLP-based Mueller Report Analysis, Knowledge Graphs, entities, relations, and so on.
22. Py3L (Graphviz (DOT), 4 stars): Python 3 for Linguists. Class material, example code, scripts and example data.
23. GeoLing (JavaScript, 4 stars): GeoLing: GIS app for mailing list announcements via LINGUIST List
24. TreebankParser (C, 3 stars): Parser for treebanks based on Penn Treebank type of encoding that generates Probabilistic Context Free Grammars
25. PG2TEI (3 stars): Project Gutenberg books to TEI XML conversion.
26. dcavar.github.io (HTML, 3 stars): Cavar's homepage
27. fomaTestCPP (C++, 2 stars): Foma-based morphological analysis using a simple C++ wrapper
28. alexa_https (Go, 2 stars): Alexa HTTPS server interface in Go
29. tieml (Jupyter Notebook, 2 stars): Temporal Information and Event Markup Language
30. J-JSON-NLP (Java, 2 stars): Java JSON-NLP Maven module and validator
31. LLMap (2 stars): LL-MAP: Language and Location - Map Accessibility Project
32. FomaExamples (Python, 2 stars): Morphologies implemented using Foma
33. go_tutorial_jupyter (Jupyter Notebook, 2 stars): Go tutorial jupyter notebooks.
34. PyEdu (2 stars): Python example code, scripts, tools for "python for linguists" classes, including some slides and tutorials.
35. maildir2mbox.py (Python, 2 stars): Python implementation of a script for the conversion of most recent Evolution maildir folders to Thunderbird mbox files.
36. TreeProcessor (Java, 2 stars): Converter for bracketed annotation syntax trees, generating a PCFG, dominance relations, scope, c-command
37. JavaEdu (1 star): Java code examples and tools for "Java for linguists", including some slides, tutorials and other docu.
38. larg (Jupyter Notebook, 1 star): Linear Algebra Reading Group
39. elinguistics.github.io (HTML, 1 star)
40. spacyxmlrpc (Python, 1 star): spaCy XML-RPC example
41. MultiTree (1 star): MultiTree: A Digital Library of Language Relationships
42. KnowledgeGraphAI (1 star)
43. cavar.github.io (HTML, 1 star): Cavar's homepage
44. mongoDBJava (Java, 1 star): Example implementation for connecting MongoDB with Java
45. LID (1 star): Language Identification in Python, Java and C(++)
46. prolog-tutorial (1 star): This is a basic Prolog tutorial repo for the discussion group at IU
47. thec_eng (Jupyter Notebook, 1 star): The Hoosier Ellipsis Corpus (THEC) - English Sub-corpus (thec_eng)
48. elinguist.github.io (HTML, 1 star)
49. julia_NLP_Notebooks (Jupyter Notebook, 1 star): Julia NLP Notebooks
50. TorchCPP1 (CMake, 1 star): LibTorch first example based on the online tutorial with functioning CMake configuration
51. quantum-ai-nlp (Jupyter Notebook, 1 star): Site and information collection related to Quantum Computation related to AI and Natural Language Processing
52. rust-tutorial-notebooks (Jupyter Notebook, 1 star): Rust tutorials for NLP and AI
53. hoosierellipsiscorpus (1 star): The Hoosier Ellipsis Corpus