  • Stars: 208
  • Rank: 189,015 (top 4%)
  • Language: Python
  • License: MIT License
  • Created over 8 years ago
  • Updated almost 4 years ago

Repository Details

TextRank for Korean.

textrankr

Reorder sentences using the TextRank algorithm.

  • Designed mainly for Korean, but not limited to it.
  • Check out lexrankr, which is another awesome summarizer!
  • Python 2 is no longer supported (if you need it, use version 0.3).
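
For readers new to TextRank, here is a minimal sketch of the general idea: build a graph whose nodes are sentences, weight the edges by token overlap, and rank the nodes with a PageRank-style power iteration. This is not textrankr's actual implementation. The function names, the Jaccard similarity, the damping factor, and the choice to return the selected sentences in their original order are all assumptions made for illustration.

```python
from typing import Callable, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity between two sentences (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def textrank_order(
    sentences: List[str],
    tokenizer: Callable[[str], List[str]],
    k: int,
    damping: float = 0.85,
    iterations: int = 50,
) -> List[str]:
    """Return the k highest-ranked sentences, kept in their original order."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [set(tokenizer(s)) for s in sentences]
    # Weighted, undirected sentence graph: edge weight = token overlap.
    weights = [
        [jaccard(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    # Power iteration of the PageRank update on the sentence graph.
    for _ in range(iterations):
        scores = [
            (1.0 - damping) / n
            + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0.0
            )
            for i in range(n)
        ]
    # Pick the k best-scoring indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Any callable that maps a string to a list of tokens can serve as the tokenizer here, for example `str.split`. A sentence that shares no tokens with the others receives only the baseline score, so it is the first to be dropped.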

Installation

pip install textrankr

Tokenizers

Tokenizers are not included; you have to implement one yourself. A tokenizer is any callable that takes a string and returns a list of tokens.

Example:

from typing import List

class MyTokenizer:
    def __call__(self, text: str) -> List[str]:
        tokens: List[str] = text.split()
        return tokens
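
A slightly richer tokenizer can follow the same callable shape. The sketch below is only an illustration: the class name `RegexTokenizer` and its default stopword set are hypothetical and not part of textrankr. It lowercases the text, keeps runs of word characters, and drops stopwords.

```python
import re
from typing import FrozenSet, List


class RegexTokenizer:
    """Hypothetical tokenizer: lowercase, keep word characters, drop stopwords."""

    def __init__(
        self,
        stopwords: FrozenSet[str] = frozenset({"a", "an", "the", "of", "is"}),
    ) -> None:
        self.stopwords = stopwords
        # \w matches Unicode word characters, so Hangul is kept as well.
        self.pattern = re.compile(r"\w+")

    def __call__(self, text: str) -> List[str]:
        candidates: List[str] = self.pattern.findall(text.lower())
        return [token for token in candidates if token not in self.stopwords]
```

Because the instance is callable with the same `str -> List[str]` signature as `MyTokenizer`, it should be usable wherever a tokenizer is expected.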

ν•œκ΅­μ–΄μ˜ 경우 KoNLPyλ₯Ό μ‚¬μš©ν•˜λŠ” 방법이 μžˆμŠ΅λ‹ˆλ‹€. μ•„λž˜ μ˜ˆμ‹œμ²˜λŸΌ phrasesλ₯Ό μ“°κ²Œλ˜λ©΄ μ—„λ°€νžˆλŠ” ν† ν¬λ‚˜μ΄μ €κ°€ μ•„λ‹ˆμ§€λ§Œ 이게 더 쒋은 κ²°κ³Όλ₯Ό μ£ΌλŠ”κ²ƒ κ°™μŠ΅λ‹ˆλ‹€.

from typing import List
from konlpy.tag import Okt

class OktTokenizer:
    okt: Okt = Okt()

    def __call__(self, text: str) -> List[str]:
        tokens: List[str] = self.okt.phrases(text)
        return tokens

Usage

from typing import List
from textrankr import TextRank

mytokenizer: MyTokenizer = MyTokenizer()
textrank: TextRank = TextRank(mytokenizer)

k: int = 3  # number of sentences in the resulting summary

# your_text_here is a placeholder for the document you want to summarize
summarized: str = textrank.summarize(your_text_here, k)
print(summarized)  # the summary as a single string

# with verbose=False, summarize returns a list of sentences instead
summaries: List[str] = textrank.summarize(your_text_here, k, verbose=False)
for summary in summaries:
    print(summary)

Test

Use Docker.

docker build -t textrankr -f Dockerfile .
docker run --rm -it textrankr

More Repositories

  1. pytorch-sgns (Python, 300 stars)
     Skipgram Negative Sampling implemented in PyTorch.
  2. NotoSansKR-Hestia (CSS, 102 stars)
     Lightweight Noto Sans Korean font.
  3. lexrankr (Python, 63 stars)
     LexRank for Korean.
  4. sci-news-sum-kr-50 (39 stars)
     A dataset of 50 Naver News articles selected from the IT/science category, with the sentences that make up the summary tagged.
  5. session-aware-bert4rec (Python, 38 stars)
     Official repository for "Exploiting Session Information in BERT-based Session-aware Sequential Recommendation", SIGIR 2022 short.
  6. kata (Python, 20 stars)
     Let's study.
  7. pocket-galaxy (Python, 18 stars)
     django + vue (vuetify) boilerplate for internal work. Runs out of the box.
  8. Love2Live (Python, 11 stars)
     CVAE based School Idol image generation. Published in proc. of SSCC 2nd, 2017.
  9. CMYK2RGB (Python, 9 stars)
     You'll need this when converting Adobe Photoshop's CMYK to RGB.
  10. infinite-monkey-sort (Python, 9 stars)
      Simple integer sorting algorithm based on infinite monkey theorem.
  11. ProxyRCA (Python, 9 stars)
      Official repository for "Proxy-based Item Representation for Attribute and Context-aware Recommendation", WSDM 2024.
  12. bear (Python, 7 stars)
      Implementation of BEAR and SlashBurn.
  13. wpe (Python, 6 stars)
      Word Pair Encoding (WPE) for semi-automatic meaningful-keywords generation.
  14. pytorch-quadratum (Python, 6 stars)
      Additional torchvision image transforms for practical usage.
  15. docker-ubuntu-konlpy (Dockerfile, 4 stars)
      Docker image of latest Ubuntu for KoNLPy on Python 3.
  16. docker-pytorch-ko (Dockerfile, 3 stars)
      Docker image of latest PyTorch-CUDA for Koreans...
  17. Kara (Python, 2 stars)
      Kara the Coda 2 plugin for dealing with annoying blanks.
  18. dnwc (JavaScript, 2 stars)
      disable naver webtoon comment
  19. sscc-1st (Python, 1 star)
      Source code for paper submitted by @theeluwin at Proceedings of SSCC 1st.
  20. basehangul-lua (Lua, 1 star)
      Human-readable binary encoding for Lua
  21. kaggle-hnm-preprocess (Jupyter Notebook, 1 star)
      Materials for the IDS Lab's UROP course, second semester of 2022. A collection of preprocessing code for the Kaggle H&M challenge.