megagonlabs/asdc

Stars: 23
Rank: 1,016,327 (top 21%)
Language: Python
License: Creative Commons ...
Created: over 2 years ago
Updated: 10 months ago

megagonlabs

Accommodation Search Dialog Corpus (宿泊施設探索対話コーパス)

ginza

A Japanese NLP Library using spaCy as framework based on Universal Dependencies

HappyDB

A corpus of 100,000 happy moments

ditto

Code for the paper "Deep Entity Matching with Pre-trained Language Models"

bunkai

Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)

sato

Code and data for Sato https://arxiv.org/abs/1911.06311.

jrte-corpus

Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)

opiniondigest

OpinionDigest: A Simple Framework for Opinion Summarization (ACL 2020)

vecscan

SubjQA

A question-answering dataset with a focus on subjective information

t5-japanese

Code to pre-train Japanese T5 models

ruler

Data Programming by Demonstration (DPBD) for Document Classification

Language: Jupyter Notebook

tagruler

Data programming by demonstration for information extraction and span annotation

coop

☘️ Code for Convex Aggregation for Opinion Summarization (Iso et al.; Findings of EMNLP 2021)

doduo

Annotating Columns with Pre-trained Language Models

instruction_ja

Japanese instruction data (日本語指示データ)

rotom

Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond"

cocosum

🥥 Code & Data for Comparative Opinion Summarization via Collaborative Decoding (Iso et al.; Findings of ACL 2022)

ebe-dataset

Evidence-based Explanation Dataset (AACL-IJCNLP 2020)

ginza-transformers

Use custom tokenizers in spacy-transformers

teddy

Code and data for Teddy https://arxiv.org/abs/2001.05171.

zett

🙈 Code for Zero-shot Triplet Extraction by Template Infilling (Kim et al.; IJCNLP-AACL 2023)

machamp

The dataset for the paper "Machamp: A Generalized Entity Matching Benchmark" published in CIKM 2021

starmie

Resources for PVLDB 2023 submission

meganno-client

sudowoodo

The source code of the Sudowoodo paper in ICDE 2023

Language: Jupyter Notebook

explainit

desuwa

Feature annotator to morphemes and phrases based on KNP rule files (pure-Python)

react-jupyter-cookiecutter

xatu

🕊️ Code and Data for XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates (Zhang et al.; LREC-COLING 2024)

magneton

Repository of the Magneton framework for authoring interaction-aware and customizable widgets.

emu

Enhancing Multilingual Sentence Embeddings with Semantic Specialization (AAAI '20)

learnit

A Tool for Machine Learning Beginners

leam

Source code and demo for Leam

Language: Jupyter Notebook

minun

Evaluating Counterfactual Explanations for Entity Matching

llm-longeval

💵 Code for Less is More for Long Document Summary Evaluation by LLMs (Wu, Iso et al.; EACL 2024)

jrte-corpus_example

Example code for Japanese Realistic Textual Entailment Corpus

Tyrogue

Language: Jupyter Notebook

qa-summarization

Ting-Yao's intern project

pilota

✈ SCUD generator (解釈文生成器)

quasi_japanese_reviews

Quasi Japanese Reviews (擬似レビューデータ)

MCR

witqa