  • Stars: 236
  • Rank: 169,539 (Top 4%)
  • Language: Python
  • License: Other
  • Created over 5 years ago
  • Updated over 1 year ago


Repository Details

Sentence Classifications with Neural Networks

Sentence Classification

The goal of this project is to classify sentences, based on type:

  • Statement (Declarative Sentence)
  • Question (Interrogative Sentence)
  • Exclamation (Exclamatory Sentence)
  • Command (Imperative Sentence)

Each of these broad categories can be subdivided into more detailed types. The networks and scripts are designed so that they can be extended to classify other sentence types, provided labeled data is available.
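As a rough illustration of how the label set could be extended, the sketch below one-hot encodes the four sentence types and then adds a finer-grained class. The label names, their order, and the `one_hot` helper are assumptions made for this example; they are not taken from the repository's scripts.

```python
# Hypothetical label set; names and order are assumptions for illustration.
LABELS = ["statement", "question", "exclamation", "command"]


def one_hot(label, labels=LABELS):
    """Return a one-hot list for `label` over the given label set."""
    if label not in labels:
        raise ValueError(f"unknown label: {label!r}")
    return [1 if name == label else 0 for name in labels]


# Adding a finer-grained class only extends the list
# (labeled training data for the new class is still required).
EXTENDED = LABELS + ["rhetorical_question"]

print(one_hot("question"))                       # [0, 1, 0, 0]
print(one_hot("rhetorical_question", EXTENDED))  # [0, 0, 0, 0, 1]
```

The output layer of a classifier would need one unit per entry in the label list, so extending the list also changes the model's output size.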

This was developed for applications at Metacortex and is accompanied by a guide on building practical/applied neural networks on austingwalters.com.

Feel free to open PRs to update or improve the project, and use it freely!


To Install

  • Install CUDA and cuDNN if you have a GPU (on your system of choice)
  • Install requirements (Python 3 only; Python 2.x will not work)
pip3 install -r requirements.txt --user

To execute:

Pretrained model:

python3 sentence_cnn_save.py models/cnn

To build your own model:

python3 sentence_cnn_save.py models/<model name>

The script loads the pretrained model with the given name from models/ if one exists; otherwise it trains the model.
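The load-or-train behavior described above can be sketched as follows. This is a minimal illustration, not the code in sentence_cnn_save.py: the `load_or_train` helper and the stand-in `load_fn` / `train_fn` callables are hypothetical, and the real script may check for different files or file extensions.

```python
import os


def load_or_train(model_path, load_fn, train_fn):
    """Load a saved model if `model_path` exists, otherwise train one.

    `load_fn` and `train_fn` are hypothetical callables supplied by the
    caller; each receives the path and returns a model object.
    """
    if os.path.exists(model_path):
        return "loaded", load_fn(model_path)
    return "trained", train_fn(model_path)


# Stand-in callables so the sketch runs without any ML library installed.
status, model = load_or_train(
    "models/does-not-exist",
    load_fn=lambda path: {"loaded_from": path},
    train_fn=lambda path: {"trained_for": path},
)
print(status)  # "trained", assuming nothing exists at that path
```

In a real script, `train_fn` would also save the trained model to `model_path` so that the next run takes the loading branch.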

See supplemental material for full guide.

Supplemental Material

This repository was created in conjunction with a guide titled Neural Networks to Production, From an Engineer.

Below is the guide's table of contents:

Additional, more complex models are available in the advanced_modeling directory. Posts covering them may follow eventually.


Dataset

The dataset is created by parsing the SQuAD dataset and combining it with the SPAADIA dataset.

The samples in the dataset:

  • Command 1111
  • Statement 80167
  • Question 131001

Note: In this dataset, each question is a single sentence, while a statement may be one sentence or more. The samples are labeled correctly, but the sentences that precede a question are not included.
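The sample counts above are heavily imbalanced, and no Exclamation count is listed. The sketch below works out each class's share and one common "balanced" weighting, total / (number of classes × class count), which is the same heuristic scikit-learn uses for `class_weight="balanced"`. Whether the repository applies any class weighting is not stated, so treat this purely as an illustration.

```python
# Sample counts copied from the list above.
counts = {"command": 1111, "statement": 80167, "question": 131001}
total = sum(counts.values())
print(total)  # 212279

# Share of each class in the dataset.
for label, n in counts.items():
    print(f"{label:<10} {n:>7} {n / total:7.2%}")

# "Balanced" class weights: rarer classes get proportionally larger weights.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}
for label, w in weights.items():
    print(f"{label:<10} weight {w:.3f}")
```

With these counts, commands make up well under 1% of the samples, so overall accuracy alone says little about how well commands are recognized.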

Results

With the above, we are able to get the following accuracy:

| Model | Accuracy | Train Speed | Classification Speed |
| --- | --- | --- | --- |
| Dict | 85% | Fastest | Fastest |
| CNN | 97.80% | Fast (185 μs/step) | Very Fast (35 μs/step) |
| CNN (2-layer) | 99.33% | Fast (210 μs/step) | Very Fast (42 μs/step) |
| MLP | 95.5% | Very Fast (60 μs/step) | Very Fast (42 μs/step) |
| FastText (1-gram) | 94.40% | Fast (83 μs/step) | Very Fast (26 μs/step) |
| FastText (2-gram) | 95.59% | Fast (196 μs/step) | Very Fast (26 μs/step) |
| RNN (LSTM) | 98.49% | Very Slow (7000 μs/step) | Very Slow (1000 μs/step) |
| RNN (GRU) | 99.73% | Very Slow (2000 μs/step) | Very Slow (1000 μs/step) |
| CNN + LSTM | 99.55% | Very Slow (3000 μs/step) | Very Slow (722 μs/step) |
| CNN + GRU | 99.82% | Very Slow (2000 μs/step) | Very Slow (591 μs/step) |
| CNN + MLP | 99.75% | Slow (1000 μs/step) | Fast (97 μs/step) |

With some hyperparameter tuning:

| Model | Accuracy | Train Speed | Classification Speed |
| --- | --- | --- | --- |
| Dict | 85% | Fastest | Fastest |
| CNN | 99.40% | Fast (200 μs/step) | Very Fast (26 μs/step) |
| CNN (2-layer) | 99.33% | Fast (210 μs/step) | Very Fast (42 μs/step) |
| MLP | 95.5% | Very Fast (60 μs/step) | Very Fast (42 μs/step) |
| FastText (1-gram) | 94.40% | Fast (117 μs/step) | Very Fast (26 μs/step) |
| FastText (2-gram) | 95.59% | Fast (196 μs/step) | Very Fast (26 μs/step) |
| RNN (LSTM) | 98.49% | Very Slow (7000 μs/step) | Very Slow (1000 μs/step) |
| RNN (GRU) | 99.73% | Very Slow (2000 μs/step) | Very Slow (1000 μs/step) |
| CNN + LSTM | 99.55% | Very Slow (3000 μs/step) | Very Slow (722 μs/step) |
| CNN + GRU | 99.82% | Very Slow (2000 μs/step) | Very Slow (340 μs/step) |
| CNN + MLP | 99.75% | Slow (1000 μs/step) | Fast (97 μs/step) |
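One way to read the tuned results is as an accuracy/speed trade-off. The sketch below copies the accuracy and classification-speed figures from the tuned table and picks the fastest model that meets a minimum accuracy. The Dict row is left out because it has no numeric speed, and the helper names are hypothetical. Converting μs/step to steps per second is plain arithmetic; what one "step" covers (a single sentence or a batch) is not stated here, so the throughput figure should be read with that caveat.

```python
# (model, accuracy %, classification speed in μs/step) from the tuned table.
results = [
    ("CNN", 99.40, 26),
    ("CNN (2-layer)", 99.33, 42),
    ("MLP", 95.5, 42),
    ("FastText (1-gram)", 94.40, 26),
    ("FastText (2-gram)", 95.59, 26),
    ("RNN (LSTM)", 98.49, 1000),
    ("RNN (GRU)", 99.73, 1000),
    ("CNN + LSTM", 99.55, 722),
    ("CNN + GRU", 99.82, 340),
    ("CNN + MLP", 99.75, 97),
]


def fastest_above(min_accuracy):
    """Return the fastest-classifying row with accuracy >= min_accuracy."""
    candidates = [row for row in results if row[1] >= min_accuracy]
    return min(candidates, key=lambda row: row[2]) if candidates else None


def steps_per_second(us_per_step):
    """Convert microseconds per step into steps per second."""
    return 1_000_000 / us_per_step


print(fastest_above(99.0))   # ('CNN', 99.4, 26)
print(fastest_above(99.7))   # ('CNN + MLP', 99.75, 97)
print(round(steps_per_second(26)))  # 38462
```

Under these numbers, the single-layer CNN is the fastest option above 99% accuracy, while CNN + MLP is the fastest option above 99.7%.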

Computer Configuration:

  • GTX 1080
  • 32 GB RAM
  • 8x 3.6 GHz cores (AMD)
  • Arch Linux, up to date as of 12/16/2018

CNN Hyperparameter tuning

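| Accuracy | Speed | Batch Size | Embedding Dims | Filters | Kernel | Hidden Dims | Epochs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 99.40% | 26 μs/step | 64 | 75 | 100 | 5 | 350 | 7 |
| 99.36% | 40 μs/step | 64 | 50 | 250 | 10 | 150 | 5 |
| 99.33% | 25 μs/step | 64 | 75 | 75 | 5 | 350 | 5 |
| 99.31% | 59 μs/step | 64 | 100 | 350 | 5 | 300 | 3 |
| 99.29% | 25 μs/step | 64 | 50 | 100 | 7 | 350 | 5 |
| 99.27% | 62 μs/step | 32 | 75 | 350 | 5 | 250 | 3 |
| 99.25% | 25 μs/step | 64 | 75 | 100 | 3 | 350 | 5 |
| 99.25% | 25 μs/step | 64 | 50 | 100 | 7 | 250 | 3 |
| 99.24% | 53 μs/step | 64 | 75 | 350 | 10 | 250 | 3 |
| 99.23% | 56 μs/step | 64 | 75 | 350 | 10 | 200 | 3 |
| 99.18% | 36 μs/step | 64 | 50 | 250 | 5 | 300 | 5 |
| 99.12% | 52 μs/step | 64 | 75 | 350 | 5 | 250 | 3 |
| 99.11% | 22 μs/step | 64 | 50 | 75 | 5 | 300 | 4 |
| 99.11% | 26 μs/step | 64 | 50 | 100 | 10 | 250 | 3 |
| 99.04% | 62 μs/step | 32 | 75 | 350 | 5 | 350 | 3 |
| 99.00% | 24 μs/step | 64 | 100 | 50 | 5 | 350 | 3 |
| 99.00% | 52 μs/step | 64 | 75 | 350 | 5 | 350 | 3 |
| 99.00% | 40 μs/step | 64 | 75 | 250 | 5 | 350 | 3 |
| 98.84% | 50 μs/step | 64 | 50 | 350 | 10 | 150 | 3 |
| 98.86% | 40 μs/step | 64 | 75 | 250 | 5 | 250 | 3 |
| 98.79% | 26 μs/step | 64 | 50 | 100 | 10 | 150 | 3 |
| 98.76% | 30 μs/step | 128 | 50 | 200 | 3 | 150 | 3 |
| 98.66% | 31 μs/step | 64 | 50 | 150 | 10 | 150 | 3 |
| 98.62% | 45 μs/step | 128 | 100 | 350 | 3 | 250 | 3 |
| 98.17% | 19 μs/step | 64 | 75 | 50 | 3 | 350 | 6 |
| 98.07% | 34 μs/step | 128 | 75 | 250 | 5 | 250 | 3 |
| 98.06% | 45 μs/step | 64 | 75 | 350 | 3 | 250 | 3 |
| 97.53% | 35 μs/step | 128 | 75 | 250 | 5 | 350 | 3 |
| 96.10% | 32 μs/step | 128 | 75 | 250 | 3 | 350 | 3 |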
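How these configurations were chosen is not stated. As a sketch of one possible approach, the code below enumerates a full grid over the hyperparameter values that appear in the table above. The dictionary keys and the `iter_configs` helper are hypothetical names, and an exhaustive grid is an assumption: the 29 rows in the table are far fewer than the full grid, so the actual search was evidently partial.

```python
import itertools

# Values that appear in the table above; the author's real search space
# and search strategy are not documented here.
grid = {
    "batch_size": [32, 64, 128],
    "embedding_dims": [50, 75, 100],
    "filters": [50, 75, 100, 150, 200, 250, 350],
    "kernel_size": [3, 5, 7, 10],
    "hidden_dims": [150, 200, 250, 300, 350],
    "epochs": [3, 4, 5, 6, 7],
}


def iter_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(grid))
print(len(configs))  # 3 * 3 * 7 * 4 * 5 * 5 = 6300

# The best row in the table (99.40%) is one point in this grid.
best = {"batch_size": 64, "embedding_dims": 75, "filters": 100,
        "kernel_size": 5, "hidden_dims": 350, "epochs": 7}
print(best in configs)  # True
```

Training all 6300 combinations would be expensive, so sampling a subset of `configs` (for example with `random.sample`) is a common cheaper alternative.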
