  • Stars: 176
  • Rank: 210,123 (Top 5 %)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created over 4 years ago
  • Updated almost 3 years ago

Repository Details

A tutorial on how to implement models for part-of-speech tagging using PyTorch and TorchText.

PyTorch PoS Tagging

Note: This repo only works with torchtext 0.9 or above, which requires PyTorch 1.8 or above. If you are using torchtext 0.8, please use this branch.

This repo contains tutorials covering how to perform part-of-speech (PoS) tagging using PyTorch 1.8, torchtext 0.9, and spaCy 3.0, using Python 3.8.
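
The version requirements above can be checked before opening the notebooks. The following is a minimal sketch, not part of the original tutorials: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the minimums are taken from the note above (PyTorch 1.8, torchtext 0.9). It relies only on `importlib.metadata`, which is in the standard library from Python 3.8.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the note above (an assumption about what to enforce).
MINIMUMS = {"torch": "1.8", "torchtext": "0.9"}


def parse_version(text):
    """Turn a string such as '1.8.0+cu111' into a tuple of ints, e.g. (1, 8, 0)."""
    public = text.split("+")[0]  # drop a local build suffix such as '+cu111'
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)  # keep only the leading digits of each piece
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))  # pad so that '1.8' compares equal to '1.8.0'
    need += (0,) * (width - len(need))
    return have >= need


def check_environment(minimums=MINIMUMS):
    """Map each package name to True/False; a missing package counts as False."""
    report = {}
    for package, minimum in minimums.items():
        try:
            report[package] = meets_minimum(version(package), minimum)
        except PackageNotFoundError:
            report[package] = False
    return report


print(check_environment())
```

If either entry is reported as False, upgrading the package or switching to the torchtext 0.8 branch mentioned in the note is the likely fix.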

These tutorials will cover getting started with the most common approach to PoS tagging: recurrent neural networks (RNNs). The first notebook introduces a bi-directional LSTM (BiLSTM) network. The second covers how to fine-tune a pretrained Transformer model.

If you find any mistakes or disagree with any of the explanations, please do not hesitate to submit an issue. I welcome any feedback, positive or negative!

Getting Started

To install PyTorch, see installation instructions on the PyTorch website.

To install TorchText:

pip install torchtext

To install the transformers library:

pip install transformers

We'll also make use of spaCy to tokenize our data. To install spaCy, follow the instructions here, making sure to install the English model:

python -m spacy download en_core_web_sm

Tutorials

  • 1 - BiLSTM for PoS Tagging

    This tutorial covers the workflow of a PoS tagging project with PyTorch and TorchText. We'll introduce the basic TorchText concepts: defining how data is processed, using TorchText's datasets, and using pre-trained embeddings. Using PyTorch, we build a strong baseline model: a multi-layer bi-directional LSTM. We also show how the model can be used for inference to tag any input text.

  • 2 - Fine-tuning Pretrained Transformers for PoS Tagging

    This tutorial covers how to fine-tune a pretrained Transformer model, provided by the transformers library, by integrating it with TorchText. We use a pretrained BERT model to produce embeddings for our input text and feed these embeddings to a linear layer that predicts the tags.
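
To make the baseline from the first tutorial concrete, here is a minimal sketch of a multi-layer bi-directional LSTM tagger in PyTorch. It is an illustration under stated assumptions, not the notebook's code: the class name `BiLSTMTagger`, the hyperparameter values, and the random input are all placeholders. The second tutorial's design follows the same pattern, with a pretrained BERT model in place of the embedding and LSTM layers.

```python
import torch
import torch.nn as nn


class BiLSTMTagger(nn.Module):
    """Embedding -> multi-layer bi-directional LSTM -> linear layer applied per token."""

    def __init__(self, vocab_size, embedding_dim, hidden_dim, n_tags,
                 n_layers=2, dropout=0.25, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=n_layers,
                            bidirectional=True,
                            dropout=dropout if n_layers > 1 else 0.0)
        # Forward and backward hidden states are concatenated, hence 2 * hidden_dim.
        self.fc = nn.Linear(hidden_dim * 2, n_tags)
        self.dropout = nn.Dropout(dropout)

    def forward(self, tokens):
        # tokens: [seq_len, batch_size] tensor of token indices
        embedded = self.dropout(self.embedding(tokens))
        outputs, _ = self.lstm(embedded)        # [seq_len, batch_size, 2 * hidden_dim]
        return self.fc(self.dropout(outputs))   # [seq_len, batch_size, n_tags]


# Placeholder sizes and random token indices, for shape checking only.
model = BiLSTMTagger(vocab_size=100, embedding_dim=16, hidden_dim=32, n_tags=5)
model.eval()
tokens = torch.randint(1, 100, (7, 3))  # 7 tokens per sentence, batch of 3 sentences
with torch.no_grad():
    logits = model(tokens)
predicted_tags = logits.argmax(dim=-1)  # one tag index per token
print(logits.shape, predicted_tags.shape)
```

Because the model emits one score vector per token, training typically uses a per-token cross-entropy loss, with padding positions excluded from the loss.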

More Repositories

1. pytorch-seq2seq (Jupyter Notebook, 5,139 stars)
   Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.

2. pytorch-sentiment-analysis (Jupyter Notebook, 4,213 stars)
   Tutorials on getting started with PyTorch and TorchText for sentiment analysis.

3. pytorch-image-classification (Jupyter Notebook, 909 stars)
   Tutorials on how to implement a few key architectures for image classification using PyTorch and TorchVision.

4. pytorch-rl (Jupyter Notebook, 251 stars)
   Tutorials for reinforcement learning in PyTorch and Gym by implementing a few of the popular algorithms. [IN PROGRESS]

5. a-tour-of-pytorch-optimizers (Jupyter Notebook, 77 stars)
   A tour of different optimization algorithms in PyTorch.

6. machine-learning-courses (36 stars)
   A collection of machine learning courses.

7. code2vec (Python, 33 stars)
   A PyTorch implementation of `code2vec: Learning Distributed Representations of Code` (Alon et al., 2018)

8. pytorch-generative-models (Jupyter Notebook, 29 stars)
   [IN PROGRESS] An introduction to generative adversarial networks (GANs) and variational autoencoders (VAEs) in PyTorch, by implementing a few key architectures.

9. pytorch-nli (Jupyter Notebook, 25 stars)
   A tutorial on how to implement models for natural language inference using PyTorch and TorchText. [IN PROGRESS]

10. pytorch-language-modeling (Jupyter Notebook, 13 stars)

11. extreme-summarization-of-source-code (Python, 13 stars)
    Implementation of 'A Convolutional Attention Network for Extreme Summarization of Source Code' in PyTorch using TorchText

12. pytorch-text-classification (Jupyter Notebook, 13 stars)

13. notes (Python, 12 stars)

14. gradient-descent (Jupyter Notebook, 11 stars)
    Let's learn gradient descent by using linear regression, logistic regression and neural networks!

15. pytorch-neural-style-transfer (Python, 11 stars)

16. pytorch-for-code (Python, 10 stars)
    Using PyTorch to apply machine learning techniques to source code.

17. pytorch-transfer-learning (Python, 9 stars)

18. pytorch-practice (Jupyter Notebook, 8 stars)

19. bag-of-tricks-for-efficient-text-classification (Python, 8 stars)
    Implementation of 'Bag of Tricks for Efficient Text Classification' in PyTorch using TorchText

20. pytorch-dqn (Jupyter Notebook, 7 stars)
    An implementation of various flavours of deep Q-learning (DQN) in PyTorch.

21. recurrent-attention-model (Python, 7 stars)

22. paper-notes (6 stars)
    n'th attempt at keeping note of papers I have read

23. lexisearch (Python, 6 stars)
    Use semantic similarity models to query transcriptions from the Lex Fridman Podcast.

24. CodeSearchNet (Python, 4 stars)

25. relation-networks (Python, 3 stars)
    Implementation of the bAbi task from 'A simple neural network module for relational reasoning' in PyTorch using TorchText.

26. variational-autoencoders (Jupyter Notebook, 3 stars)

27. snli (Python, 3 stars)
    https://nlp.stanford.edu/projects/snli/

28. go-practice (Go, 2 stars)

29. Glucoduino (C++, 2 stars)
    Project to read data from glucometers using the Arduino platform

30. bentrevett.github.io (HTML, 2 stars)
    My personal website to act as a portfolio

31. character-aware-neural-language-models (Python, 2 stars)
    Implementation of 'Character-Aware Neural Language Models' in PyTorch using TorchText

32. attributed-document-qa (Python, 2 stars)

33. wordle-terminal (Python, 1 star)
    Wordle in the terminal.

34. art (Python, 1 star)
    Markov chain to generate "art"

35. sorting-algorithms (1 star)
    Implementation of sorting algorithms, with visualizations.

36. py-algorithms (Jupyter Notebook, 1 star)
    Implementation of various algorithms in Python 3.

37. keepnote (JavaScript, 1 star)
    Google Chrome note taking extension

38. Glucoduino-Classic-Bluetooth-Application (Java, 1 star)
    Android application for glucoduino project using standard Bluetooth

39. bentrevett (1 star)

40. brainfuck-python (Brainfuck, 1 star)
    A brainfuck interpreter in Python 3.

41. numberworld (Python, 1 star)
    A toy environment for task-oriented language grounding.

42. Glucoduino-CSR-Chip (C, 1 star)
    Code for the CSR uEnergy SDK for the glucoduino project