Repository Details

CS224n: Natural Language Processing with Deep Learning Assignments Winter, 2017

CS224n

Requirements

  • Python 2.7
  • TensorFlow r1.2

Assignment #1

  1. Softmax
  2. Neural Network Basics
  3. word2vec (figure: q3_word_vectors)
  4. Sentiment Analysis (figures: q4_reg_v_acc, q4_dev_conf)
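
For readers new to the first item, here is a minimal, dependency-free sketch of a numerically stable softmax. It is not the repository's solution; the function name `softmax` and the sample scores are illustrative assumptions, and the assignment itself presumably works on NumPy arrays rather than plain lists.

```python
import math


def softmax(scores):
    """Return softmax probabilities for a list of real-valued scores.

    Hypothetical helper for illustration only, not code from this repository.
    """
    # Subtracting the maximum does not change the result (softmax is
    # shift-invariant) but keeps exp() from overflowing on large scores.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Large scores that would overflow a naive exp() are handled by the shift.
probs = softmax([1001.0, 1002.0])
small = softmax([1.0, 2.0])
print(probs)
print(small)
```

Because of the shift, both calls should give the same distribution, which is a quick sanity check for any softmax implementation.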

Assignment #2

  1. TensorFlow Softmax
  2. Neural Transition-Based Dependency Parsing
924/924 [==============================] - 49s - train loss: 0.0631    
Evaluating on dev set - dev UAS: 88.54
New best dev UAS! Saving model in ./data/weights/parser.weights
================================================================================
TESTING
================================================================================
Restoring the best model weights found on the dev set
Final evaluation on test set - test UAS: 88.92
Writing predictions
Done!
  3. Recurrent Neural Networks: Language Modeling (figure: unrolled_rnn)
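
The parser log above reports UAS (unlabeled attachment score) on the dev and test sets. As a rough sketch of what that number measures, the snippet below counts the share of tokens whose predicted head matches the gold head. The function name and the toy head lists are assumptions made for illustration; the assignment's own evaluation code may differ, for example in how it treats punctuation.

```python
def unlabeled_attachment_score(gold_heads, pred_heads):
    """Percentage of tokens whose predicted head index equals the gold one.

    Both arguments are lists of sentences; each sentence is a list with one
    head index per token. Hypothetical helper, not the repository's code.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted data differ in sentence count")
    total = 0
    correct = 0
    for gold_sent, pred_sent in zip(gold_heads, pred_heads):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentence differ in length")
        for gold_head, pred_head in zip(gold_sent, pred_sent):
            total += 1
            correct += int(gold_head == pred_head)
    return 100.0 * correct / total if total else 0.0


# Made-up toy data: 0 stands for the root, other values are token positions.
gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 1], [0, 1]]
uas = unlabeled_attachment_score(gold, pred)
print("UAS: %.2f" % uas)
```

Dependency labels are ignored here on purpose: only the attachment (the head) is compared, which is what makes the score "unlabeled".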

Assignment #3

  1. A window into NER
DEBUG:Token-level confusion matrix:
go\gu   PER     ORG     LOC     MISC    O    
PER     2968    26      84      16      55   
ORG     147     1621    131     65      128  
LOC     48      88      1896    26      36   
MISC    37      40      54      1030    107  
O       42      46      18      39      42614
DEBUG:Token-level scores:
label   acc     prec    rec     f1   
PER     0.99    0.92    0.94    0.93 
ORG     0.99    0.89    0.77    0.83 
LOC     0.99    0.87    0.91    0.89 
MISC    0.99    0.88    0.81    0.84 
O       0.99    0.99    1.00    0.99 
micro   0.99    0.98    0.98    0.98 
macro   0.99    0.91    0.89    0.90 
not-O   0.99    0.89    0.87    0.88 
INFO:Entity level P/R/F1: 0.82/0.85/0.84
  2. Recurrent neural nets for NER
DEBUG:Token-level confusion matrix:
go\gu   PER     ORG     LOC     MISC    O    
PER     2987    32      47      12      71   
ORG     136     1684    90      70      112  
LOC     39      83      1907    21      44   
MISC    43      45      47      1031    102  
O       36      56      15      34      42618
DEBUG:Token-level scores:
label   acc     prec    rec     f1   
PER     0.99    0.92    0.95    0.93 
ORG     0.99    0.89    0.80    0.84 
LOC     0.99    0.91    0.91    0.91 
MISC    0.99    0.88    0.81    0.85 
O       0.99    0.99    1.00    0.99 
micro   0.99    0.98    0.98    0.98 
macro   0.99    0.92    0.89    0.91 
not-O   0.99    0.90    0.88    0.89 
INFO:Entity level P/R/F1: 0.85/0.86/0.85
  3. Grooving with GRUs

Figures: q3-noclip-rnn, q3-clip-rnn, q3-noclip-gru, q3-clip-gru

DEBUG:Token-level confusion matrix:
go\gu	PER  	ORG  	LOC  	MISC 	O    
PER  	2920 	41   	57   	12   	119  
ORG  	101  	1716 	73   	64   	138  
LOC  	22   	95   	1908 	16   	53   
MISC 	37   	45   	53   	1017 	116  
O    	21   	67   	14   	39   	42618

DEBUG:Token-level scores:
label	acc  	prec 	rec  	f1   
PER  	0.99 	0.94 	0.93 	0.93 
ORG  	0.99 	0.87 	0.82 	0.85 
LOC  	0.99 	0.91 	0.91 	0.91 
MISC 	0.99 	0.89 	0.80 	0.84 
O    	0.99 	0.99 	1.00 	0.99 
micro	0.99 	0.98 	0.98 	0.98 
macro	0.99 	0.92 	0.89 	0.90 
not-O	0.99 	0.91 	0.88 	0.89 

INFO:Entity level P/R/F1: 0.86/0.85/0.85
  4. Easter Egg Hunt!
    • Run python q3_gru.py dynamics to unfold your candy eggs
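
The token-level scores in the logs above can be re-derived from the confusion matrices. The sketch below does this for the window-based model ("A window into NER"), assuming that rows are gold labels and columns are guessed labels, which is how the `go\gu` header reads. The helper name `token_scores` is an illustration and not part of the assignment code.

```python
LABELS = ["PER", "ORG", "LOC", "MISC", "O"]

# Confusion matrix copied from the "A window into NER" log:
# rows are gold labels, columns are guessed labels (assumed from "go\gu").
CONFUSION = [
    [2968, 26, 84, 16, 55],
    [147, 1621, 131, 65, 128],
    [48, 88, 1896, 26, 36],
    [37, 40, 54, 1030, 107],
    [42, 46, 18, 39, 42614],
]


def token_scores(matrix, index):
    """Return (precision, recall, f1) for the label at position `index`."""
    true_pos = matrix[index][index]
    gold_total = sum(matrix[index])                      # row sum
    guessed_total = sum(row[index] for row in matrix)    # column sum
    prec = true_pos / float(guessed_total) if guessed_total else 0.0
    rec = true_pos / float(gold_total) if gold_total else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1


for i, label in enumerate(LABELS):
    prec, rec, f1 = token_scores(CONFUSION, i)
    print("%-5s prec=%.2f rec=%.2f f1=%.2f" % (label, prec, rec, f1))
```

For PER, for instance, 2968 of 3242 guessed tokens and 2968 of 3149 gold tokens are correct, which rounds to the 0.92 precision and 0.94 recall shown in the log. The entity-level P/R/F1 line is a stricter measure over whole entity spans and cannot be recovered from this matrix.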

References

CS224n official website

Many code snippets come from
