  • Stars: 193
  • Rank: 201,081 (Top 4%)
  • Language: Python
  • License: Apache License 2.0
  • Created over 6 years ago
  • Updated almost 6 years ago

Repository Details

Chinese and English corpus processing scripts in Python, C++, and Java.

Introduction

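This repository collects scripts for processing Chinese and English data. The scripts may be written in any programming language, and each comes with a detailed README.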

Script Lists

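  1. Traditional Chinese to Simplified Chinese conversion
  2. Wikipedia data processing
  3. Single-character feature extraction
  4. Two-character feature extraction
  5. Chinese character stroke information extraction
  6. Removal of non-Chinese characters
  7. Conversion of Chinese money amounts to numeric amounts
  8. Full-width/half-width character conversion
  9. Batch conversion of Python 2 code to Python 3
  10. NER tag conversion (BIO, BMESO)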
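To give a feel for what a few of these scripts do, here is a minimal sketch of items 6, 8, and 10. It is not the repository's own code: the function names `remove_non_chinese`, `full_to_half`, and `bio_to_bmeso` are hypothetical, "Chinese character" is assumed to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF), and tags are assumed to look like `B-PER` / `I-PER` / `O`.

```python
import re

# Assumption: "Chinese" means the basic CJK Unified Ideographs block only.
_NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]")


def remove_non_chinese(text):
    """Drop every character outside U+4E00..U+9FFF."""
    return _NON_CHINESE.sub("", text)


def full_to_half(text):
    """Map full-width ASCII variants and the ideographic space to half-width."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic space becomes an ordinary space.
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width forms sit at a fixed offset of 0xFEE0 from ASCII.
            out.append(chr(code - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


def bio_to_bmeso(tags):
    """Convert a BIO tag sequence to BMESO (Begin, Middle, End, Single, Outside)."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append("O")
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            out.append(("B-" if continues else "S-") + label)
        else:
            out.append(("M-" if continues else "E-") + label)
    return out
```

For example, `bio_to_bmeso(["B-PER", "I-PER", "O"])` turns the two-token entity into `B-PER`, `E-PER`, while a lone `B-LOC` becomes `S-LOC`. Extension blocks of CJK characters and punctuation are deliberately ignored here to keep the sketch short.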

Question

  • If you have any questions, you can open an issue or email bamtercelboo@{gmail.com, 163.com}.

  • If you have any good suggestions, you can open a PR or email me.

More Repositories

1

Awesome-ChatGPT

A curated collection of ChatGPT learning resources, continuously updated.
4,043
star
2

cnn-lstm-bilstm-deepcnn-clstm-in-pytorch

Learning neural networks such as CNN and BiLSTM in PyTorch
Python
1,203
star
3

Awesome-Law-NLP-Research-Work

Awesome Law NLP Research Work, Paper, Competition, Online System
407
star
4

cw2vec

cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information
C++
275
star
5

pytorch_word2vec

Use pytorch to implement word2vec
Python
147
star
6

Word_Similarity_and_Word_Analogy

Word Similarity and Word Analogy Task scripts
Python
72
star
7

pytorch_Highway_Networks

Highway Networks implemented in PyTorch
Python
71
star
8

pytorch_SRU

SRU implemented in PyTorch (Training RNNs as Fast as CNNs)
Python
42
star
9

pytorch_Joint-Word-Segmentation-and-POS-Tagging

Paper: A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging
Python
36
star
10

PyTorch_Bert_Text_Classification

PyTorch Bert Text Classification
Python
31
star
11

Legal_Judgment_Prediction_BiLSTM_ATT

Legal Judgment Prediction (LJP) with BiLSTM and Attention
Python
14
star
12

PyTorch-Bert-BiLSTM-ATT-LJP

PyTorch-Bert-BiLSTM-ATT-LJP
Python
14
star
13

PyTorch_Chinese_word_segmentation

Chinese word segmentation with a neural seq2seq model, implemented in PyTorch
Python
9
star
14

pytorch_text_classification

text classification with my own architecture
Python
8
star
15

pytorch_Sequence_Label

Sequence labeling (NER: Named Entity Recognition) implemented in PyTorch
Python
4
star
16

pytorch_CNN_LSTM

CNN LSTM implemented in PyTorch
Python
4
star
17

pytorch_Embedding_Packed

Wraps nn.Embedding and nn.Dropout() in PyTorch for convenient reuse
Python
3
star
18

SVM_TFIDF_LJP

Legal judgment prediction with SVM_TFIDF
Python
2
star
19

pytorch_Joint-Word-Segmentation-And-POS-Tagging-old

pytorch_seq2seq_wordseg_and_postag
Python
2
star
20

pytorch_POS_NER_Chunking

Part-of-Speech Tagging (POS), Named Entity Recognition (NER), and Chunking implemented in PyTorch
Python
2
star
21

pytorch_document_classification

Document-level text classification implemented in PyTorch
Python
1
star
22

Python

Python
1
star
23

Cpp_extract_giga_word_pair

extract_giga_word_pair implemented in C++
C++
1
star
24

pytorch_sentence_classification

Sentence-level text classification implemented in PyTorch
Python
1
star
25

pytorch_seq2seq_wordseg_and_postag_version2

pytorch_seq2seq_wordseg_and_postag_version2
Python
1
star
26

word2vec

word2vec implemented in C++ and in PyTorch
C++
1
star