HKU NLP Group (@HKUNLP)
  • Stars 1,287
  • Global Org. Rank 12,983 (Top 5 %)
  • Followers 103
  • Registered about 3 years ago
  • Most used languages: Python 93.3 %, JavaScript 6.7 %
  • Location 🇭🇰 Hong Kong
  • Country Total Rank 87
  • Country Ranking (Python) 26

Top repositories

1. UnifiedSKG (Python, 480 stars)
   [EMNLP 2022] A Unified Framework and Analysis for Structured Knowledge Grounding with Text-to-Text Language Models

2. instructor-embedding (Python, 270 stars)
   One Embedder, Any Task: Instruction-Finetuned Text Embeddings

3. Binder (Python, 161 stars)
   [ICLR 2023] Code for the paper "Binding Language Models in Symbolic Languages"

4. DS-1000 (Python, 119 stars)
   [ICML 2023] Official data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".

5. HumanPrompt (Python, 103 stars)
   A framework for human-readable prompt-based method with large language models. Specially designed for researchers. (In progress)

6. diffusion-of-thoughts (Python, 61 stars)
   Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"

7. icl-selective-annotation (Python, 57 stars)
   [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners"

8. efficient-attention (Python, 39 stars)
   [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling

9. icl-ceil (Python, 34 stars)
   Code for our paper “Compositional Exemplars for In-context Learning”.

10. reparam-discrete-diffusion (Python, 30 stars)
    Reparameterized Discrete Diffusion Models for Text Generation

11. STRING (Python, 28 stars)
    Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"

12. batch-prompting (Python, 24 stars)
    A simple prompting approach that enables the LLMs to run inference in batches.

13. subgoal-theorem-prover (16 stars)
    Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving"

14. ProGen (Python, 12 stars)
    [EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.

15. hkunlp.github.io (JavaScript, 9 stars)
    Website for HKU NLP group (under construction)

16. diagrams_toolkit (Python, 4 stars)
    Source code for diagrams in the paper of NLPers from HKU.

17. ChunkLlama (4 stars)
    Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

18. SymGen (2 stars)
    Code for Generating Data for Symbolic Language with Large Language Models

19. .github (2 stars)

20. diffusion-vs-ar (1 star)