  • Stars: 560
  • Rank: 76,747 (Top 2%)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created about 6 years ago
  • Updated 4 months ago

Repository Details

NLP in Python with Deep Learning

Natural Language Processing Notebooks

Available as a Book: NLP in Python - Quickstart Guide

Written for Practicing Engineers

This work builds on the outstanding existing literature on Natural Language Processing, which ranges from classics like Jurafsky's Speech and Language Processing to more recent work such as The Deep Learning Book by Ian Goodfellow et al.

While those are great introductory textbooks for college students, this is intended for practitioners to read quickly, skim, select what is useful, and then proceed. The notebooks are divided into 7 logical themes.

Each section builds on ideas and code from previous notebooks, but you can fill in the gaps mentally and jump directly to what interests you.

Chapter 01

Introduction To Text Processing, with Text Classification

  • Perfect for Getting Started! We learn better with code-first approaches

Chapter 02

  • Text Cleaning notebook, code-first approaches with supporting explanation. Covers some simple ideas like:
    • Stop words removal
    • Lemmatization
  • Spell Correction covers almost everything you need to get started with spell correction, similar-word problems, and so on
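
To make the text-cleaning ideas concrete, here is a minimal standard-library sketch, not the notebook's own code. It shows stop-word removal and a one-edit spelling corrector in the style of Peter Norvig's well-known example. The stop-word list, the tiny corpus, and all function names are assumptions made for illustration; lemmatization is left out because it needs a linguistic resource.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not a library's list.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Return the most frequent known word within one edit, else the word."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    return max(candidates, key=counts.get) if candidates else word


# Hypothetical reference corpus used to decide which words are "known".
corpus = "the spelling of a word is checked against the words in the corpus"
counts = Counter(tokenize(corpus))

cleaned = remove_stop_words(tokenize("The speling of a wrod"))
corrected = [correct(t, counts) for t in cleaned]
```

A real corrector would use a much larger corpus for the word counts and would usually consider candidates two edits away as well.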

Chapter 03

Leveraging Linguistics is an important part of any practitioner's toolkit. Using spaCy and textacy we look at two interesting challenges and how to tackle them:

  • Redacting names
    • Named Entity Recognition
  • Question and Answer Generation
    • Part of Speech Tagging
    • Dependency Parsing
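
As a rough illustration of the redaction step, the sketch below replaces character spans with a placeholder. It assumes the spans have already been produced by a Named Entity Recognition model; with spaCy they would typically be read from `doc.ents` via `ent.start_char` and `ent.end_char` for entities whose `ent.label_` is `"PERSON"`. The `redact` function, the placeholder text, and the hard-coded spans are hypothetical and not taken from the notebook.

```python
def redact(text, spans, placeholder="[REDACTED]"):
    """Replace each (start, end) character span in text with a placeholder.

    Spans are processed from right to left so that replacing one span
    does not shift the offsets of the spans still to be processed.
    """
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text


# Hypothetical NER output: character offsets of the PERSON entities.
sentence = "Ada Lovelace wrote to Charles Babbage in 1843."
person_spans = [(0, 12), (22, 37)]

redacted = redact(sentence, person_spans)
```

Keeping the redaction logic separate from the model makes it easy to swap the entity source without touching the string handling.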

Chapter 04

Text Representations is about converting text to numerical representations, a.k.a. vectors

  • Covers the popular celebrities word2vec, fastText and doc2vec, and document similarity using them
  • Programmer's Guide to gensim
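
The idea of comparing documents through vectors can be sketched without any library. The example below averages word vectors into a document vector and compares documents with cosine similarity. The 3-dimensional vectors are hand-made stand-ins for trained embeddings, and the function names are assumptions; averaging is only a simple baseline, whereas doc2vec learns document vectors directly.

```python
import math

# Assumption: hand-made toy vectors standing in for trained embeddings.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.8],
}


def document_vector(tokens, vectors):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


pets = document_vector(["cat", "dog"], TOY_VECTORS)
more_pets = document_vector(["dog"], TOY_VECTORS)
vehicles = document_vector(["car", "truck"], TOY_VECTORS)

pet_similarity = cosine(pets, more_pets)
cross_similarity = cosine(pets, vehicles)
```

With real embeddings the same two functions apply unchanged; only the lookup table is replaced by the trained model's vectors.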

Chapter 05

Modern Methods for Text Classification is simple, exploratory and talks about:

  • Simple classifiers from scikit-learn and how to optimize them
  • How to combine and ensemble them for increased performance
  • Builds intuition for ensembling, so that you can write your own ensembling techniques
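
One simple way to build that intuition is hard voting: each classifier predicts a label and the most common label wins. The sketch below is an assumption about what a hand-written ensemble could look like, not the notebook's code; the prediction lists are made up, and scikit-learn offers a ready-made `VotingClassifier` for the same purpose.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine predictions from several classifiers by hard voting.

    predictions holds one list of labels per classifier, all of the
    same length. Ties go to the label seen first among the tied ones.
    """
    combined = []
    for sample_votes in zip(*predictions):
        winner, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(winner)
    return combined


# Hypothetical outputs of three classifiers on the same four samples.
clf_a = ["pos", "neg", "pos", "neg"]
clf_b = ["pos", "pos", "neg", "neg"]
clf_c = ["neg", "neg", "pos", "neg"]

ensemble = majority_vote([clf_a, clf_b, clf_c])
```

Each classifier above makes a different mistake, so the vote can be right on samples where an individual model is wrong, which is the core argument for ensembling.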

Chapter 06

Deep Learning for NLP is less about fancy data modeling and more about engineering for Deep Learning

  • From scratch code tutorial with Text Classification as an example
  • Using PyTorch and torchtext
  • Write our own data loaders, pre-processing, training loop and other utilities
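
To show what a hand-written data loader involves, here is a plain-Python sketch of the pre-processing and batching steps: build a vocabulary, turn tokens into integer ids, and pad each batch to a common length. The notebook uses PyTorch and torchtext; this version deliberately avoids both, and every name in it (`build_vocab`, `encode`, `batches`, the special tokens) is an assumption made for illustration.

```python
import random

PAD, UNK = "<pad>", "<unk>"


def build_vocab(texts):
    """Map each whitespace token to an integer id, reserving 0 and 1."""
    vocab = {PAD: 0, UNK: 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab):
    """Convert a text into a list of token ids, using UNK for new words."""
    return [vocab.get(tok, vocab[UNK]) for tok in text.lower().split()]


def batches(texts, labels, vocab, batch_size, shuffle=False, seed=0):
    """Yield (padded_ids, labels) batches, padded to the longest sequence."""
    order = list(range(len(texts)))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        encoded = [encode(texts[i], vocab) for i in idx]
        width = max(len(seq) for seq in encoded)
        padded = [seq + [vocab[PAD]] * (width - len(seq)) for seq in encoded]
        yield padded, [labels[i] for i in idx]


# Hypothetical toy dataset.
texts = ["good movie", "bad", "really good fun"]
labels = [1, 0, 1]
vocab = build_vocab(texts)
all_batches = list(batches(texts, labels, vocab, batch_size=2))
```

In a PyTorch training loop, each padded batch would be converted to a tensor before being passed to the model.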

Chapter 07

Building your own Chatbot from scratch in 30 minutes. We use this to explore unsupervised learning and put together several of the ideas we have already seen.

  • A simpler, direct problem formulation instead of the complicated chatbot tutorials commonly seen
  • Intents, responses and templates in chatbot parlance
  • Hacking a word-based similarity engine to work with little to no training samples
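
The intent, response and template vocabulary can be illustrated with a very small matcher. The sketch below scores a message against a few example phrases per intent and fills in a response template. It swaps in Jaccard word overlap as the similarity measure, which is a simplification and not the notebook's similarity engine; the intents, threshold, and function names are all hypothetical.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Size of the intersection over size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical intents: a few example phrases plus a response template.
INTENTS = {
    "greet": {
        "examples": ["hello there", "hi"],
        "response": "Hello, {name}!",
    },
    "hours": {
        "examples": ["when are you open", "opening hours"],
        "response": "We are open 9 to 5, {name}.",
    },
}
FALLBACK = "Sorry, I did not understand that."


def reply(message, name, threshold=0.2):
    """Answer with the template of the best-matching intent, if good enough."""
    best_intent, best_score = None, 0.0
    for intent, spec in INTENTS.items():
        for example in spec["examples"]:
            score = jaccard(tokens(message), tokens(example))
            if score > best_score:
                best_intent, best_score = intent, score
    if best_intent is None or best_score < threshold:
        return FALLBACK
    return INTENTS[best_intent]["response"].format(name=name)
```

Because matching only needs a handful of example phrases per intent, this formulation works with little to no training data; replacing `jaccard` with a word-vector similarity would make it tolerant of synonyms.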

More Repositories

1. awesome-project-ideas: Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas (7,470 stars)
2. best-of-jupyter: Jupyter Tips, Tricks, Best Practices with Sample Code for Productivity Boost (420 stars)
3. hindi2vec: State-of-the-Art Language Modeling and Text Classification in Hindi Language (Jupyter Notebook, 220 stars)
4. pytorch-web-deploy: Simple, fast web deployment for your PyTorch models (Python, 70 stars)
5. agentai: Text to Python Objects via a LLM Function Call (Python, 54 stars)
6. coronaIndia: Experiments & NLP Deployments for CoronaVirus Related Work (Jupyter Notebook, 34 stars)
7. Hinglish: Hinglish Text Classification (Jupyter Notebook, 30 stars)
8. breakoutlist-india: High potential opportunities for ambitious engineers, designers, data people and future founders. The best teams to join. (27 stars)
9. llama2demo (Python, 14 stars)
10. Twitter-Geographical-Sentiment-Analysis: Finds the Happiest US and Indian State based on Sentimental Analysis of Twitter Data (Python, 13 stars)
11. keras-practice: Notebooks covering Intro to CNN, Transfer Learning using VGG16 (Jupyter Notebook, 12 stars)
12. Genetic-Algorithm-Self-Study-Notes: Notes, Reading Sources and Bibliography on Genetic Algorithms (8 stars)
13. qdrant_tools: Python Tools to use with the Qdrant Python Client (Jupyter Notebook, 7 stars)
14. nirantk.github.io (Jupyter Notebook, 6 stars)
15. Text-Summarization (C, 4 stars)
16. awesome-vectordb: Everything you need to decide and work with VectorDBs (Python, 4 stars)
17. knee-xrays: Exploratory Repository (Jupyter Notebook, 3 stars)
18. fitz-wrapper: CLI Utilities for PDF to Image Conversion, built with Py3 (Python, 3 stars)
19. OnDeckMLChallenge (Jupyter Notebook, 3 stars)
20. fastvector (Python, 3 stars)
21. DSA-BITS-Masti: Data Structures and Algorithms at BITS Pilani (C, 3 stars)
22. experiments: Repository for Experimental Code (HTML, 2 stars)
23. quickstart (Shell, 2 stars)
24. comehomeandbuild (HTML, 2 stars)
25. MITx-Analytics-Edge-Coursework: Code, Lecture Slides and Data from edx.org/course/analytics-edge-mitx-15-071x-0 (HTML, 2 stars)
26. cohere-learn: Utils which wrap around Cohere API: FewShotClassify and more coming soon (Python, 1 star)
27. Noor: Bringing Light to What We are Taught :) (HTML, 1 star)
28. interview_practice: Archive (C++, 1 star)
29. Aditi (1 star)
30. latest-news-ncert: Link educational topics to latest NEWS (Python, 1 star)
31. julie: Julie is a blogging assistant and linter for AI Hackers wanting to make their work more accessible (Python, 1 star)
32. qdrant-course (Jupyter Notebook, 1 star)
33. CovidSeer: Complimentary Repo for Publishing Public facing Covid India work (Jupyter Notebook, 1 star)
34. go-demo: Demo code for the Golang lecture by @theonewolf (Go, 1 star)
35. bq: Binary Quantization in Numpy (1 star)