  • Stars: 163
  • Rank: 231,141 (Top 5%)
  • Language: Python
  • License: MIT License
  • Created over 2 years ago
  • Updated 12 months ago

Repository Details

[KDD'22] Official PyTorch implementation for "Towards Universal Sequence Representation Learning for Recommender Systems".

UniSRec

This is the official PyTorch implementation for the paper:

Yupeng Hou*, Shanlei Mu*, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen. Towards Universal Sequence Representation Learning for Recommender Systems. KDD 2022.


Updates:

  • [Nov. 22, 2022] We added scripts and implementations of baselines FDSA and S^3-Rec [link].
  • [June 28, 2022] We uploaded some useful intermediate files produced during the data preprocessing stage [link], including:
    1. Clean item text (*.text);
    2. Index mappings between raw IDs and remapped IDs (*.user2index, *.item2index).
  • [June 16, 2022] We released the code and scripts for preprocessing our datasets [link].

Overview

We propose UniSRec, which stands for Universal Sequence representation learning for Recommendation. Aiming to learn more generalizable sequence representations, UniSRec utilizes the associated description text of an item to learn transferable representations across different domains and platforms. For learning universal item representations, we design a lightweight architecture based on parametric whitening and mixture-of-experts enhanced adaptor. For learning universal sequence representations, we introduce two kinds of contrastive learning tasks by sampling multi-domain negatives. With the pre-trained universal sequence representation model, our approach can be effectively transferred to new cross-domain and cross-platform recommendation scenarios in a parameter-efficient way, under either inductive or transductive settings.
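To make the item-representation idea more concrete, here is a minimal pure-Python sketch of a whitening transform combined with a mixture-of-experts gate. It is not taken from this repository: the function names (whiten, moe_adaptor), the exact form of the whitening (a learnable shift followed by a linear map), and the plain softmax gate are all assumptions chosen for illustration, and the actual model is implemented in PyTorch with its own details.

```python
import math


def whiten(x, bias, weight):
    """Assumed parametric whitening: (x - bias) @ weight with learnable parameters.

    x and bias have length d_in; weight is a d_in x d_out matrix (list of rows).
    """
    centered = [xi - bi for xi, bi in zip(x, bias)]
    d_out = len(weight[0])
    return [sum(c * row[j] for c, row in zip(centered, weight)) for j in range(d_out)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_adaptor(x, experts, gate_weight):
    """Mix several whitening experts with a softmax gate (hypothetical sketch).

    experts: list of (bias, weight) pairs, one per expert.
    gate_weight: d_in x n_experts matrix giving one gate logit per expert.
    """
    logits = [sum(xi * row[k] for xi, row in zip(x, gate_weight))
              for k in range(len(experts))]
    gates = softmax(logits)
    outputs = [whiten(x, bias, weight) for bias, weight in experts]
    d_out = len(outputs[0])
    return [sum(g * out[j] for g, out in zip(gates, outputs)) for j in range(d_out)]
```

Here x stands for the text embedding of one item's description. Only the small adaptor parameters (bias, weight, gate_weight) would be trained in such a design, which is one way to read "lightweight" and "parameter-efficient" in the overview above.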

Requirements

recbole==1.0.1
python==3.9.7
cudatoolkit==11.3.1
pytorch==1.11.0
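
The pins above can be parsed and compared against the running interpreter with a few lines of Python. This is only a sketch: parse_pins and python_matches are hypothetical helper names, and the check covers the Python version only, since how the other packages were installed (pip or conda) is not stated here.

```python
import sys

# The version pins exactly as listed in the Requirements section.
PINS = """\
recbole==1.0.1
python==3.9.7
cudatoolkit==11.3.1
pytorch==1.11.0
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dictionary."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def python_matches(pins, version_info=sys.version_info):
    """Return True if the given interpreter version equals the pinned one."""
    wanted = tuple(int(part) for part in pins["python"].split("."))
    return tuple(version_info[:3]) == wanted


if __name__ == "__main__":
    if not python_matches(parse_pins(PINS)):
        print("Warning: running Python differs from the pinned version.")
```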

Download Datasets and Pre-trained Model

Please download the processed downstream datasets (and the pre-training datasets, if needed) and the pre-trained model from Google Drive or Baidu Netdisk (百度网盘, password: 3cml).

After unzipping, move pretrain/ and downstream/ to dataset/, and move UniSRec-FHCKM-300.pth to saved/.
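
Before launching any script, it can help to confirm that the files landed where the step above says they should. The following is a small sketch, assuming it is run from the repository root; missing_paths is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# Paths named in the unzip-and-move step above, relative to the repository root.
EXPECTED = [
    "dataset/pretrain",
    "dataset/downstream",
    "saved/UniSRec-FHCKM-300.pth",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```

If you only downloaded the downstream datasets, dataset/pretrain will be reported as missing, which is expected.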

Quick Start

Train and evaluate on downstream datasets

Fine-tune the pre-trained UniSRec model in the transductive setting.

python finetune.py -d Scientific -p saved/UniSRec-FHCKM-300.pth

You can replace Scientific with Pantry, Instruments, Arts, Office or OR to reproduce the results reported in our paper.
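
To run the transductive fine-tuning command over all six downstream datasets, the command lines can be generated rather than typed by hand. This is a sketch only: finetune_command and all_commands are hypothetical helpers, and the script merely builds and prints the commands without executing them.

```python
import shlex

# Dataset names and checkpoint path as given in this README.
DATASETS = ["Scientific", "Pantry", "Instruments", "Arts", "Office", "OR"]
CHECKPOINT = "saved/UniSRec-FHCKM-300.pth"


def finetune_command(dataset, checkpoint=CHECKPOINT):
    """Build the argument list for one transductive fine-tuning run."""
    return ["python", "finetune.py", "-d", dataset, "-p", checkpoint]


def all_commands():
    """Build one fine-tuning command per downstream dataset."""
    return [finetune_command(name) for name in DATASETS]


if __name__ == "__main__":
    for cmd in all_commands():
        # Pass `cmd` to subprocess.run(cmd, check=True) to actually launch it.
        print(shlex.join(cmd))
```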

Fine-tune the pre-trained model in the inductive setting.

python finetune.py -d Scientific -p saved/UniSRec-FHCKM-300.pth --train_stage=inductive_ft

Train UniSRec from scratch (without pre-training).

python finetune.py -d Scientific

Run baseline SASRec.

python run_baseline.py -m SASRec -d Scientific --config_files=props/finetune.yaml --hidden_size=300

Please refer to [link] for more scripts of our baselines.

Pre-train from scratch

Pre-train on a single GPU.

python pretrain.py

Pre-train with distributed data parallel (DDP) on GPUs 0-3.

CUDA_VISIBLE_DEVICES=0,1,2,3 python ddp_pretrain.py
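
When launching the DDP run from another Python program, the same environment variable can be set through a copied environment. The sketch below assumes only what the command above shows (the script name and the CUDA_VISIBLE_DEVICES variable); ddp_launch is a hypothetical helper, and how ddp_pretrain.py distributes work across the visible devices is not covered here.

```python
import os


def ddp_launch(gpu_ids, script="ddp_pretrain.py"):
    """Return (env, argv) for running the DDP pre-training script on given GPUs."""
    env = dict(os.environ)  # copy, so the current process environment is untouched
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return env, ["python", script]


if __name__ == "__main__":
    env, argv = ddp_launch([0, 1, 2, 3])
    # Launch with: subprocess.run(argv, env=env, check=True)
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
```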

Customized Datasets

Please refer to [link] for details of data preprocessing. You can then apply the same preprocessing to your own datasets.

Acknowledgement

The implementation is based on the open-source recommendation library RecBole.

Please cite the following papers if you use our code or the processed datasets.

@inproceedings{hou2022unisrec,
  author = {Yupeng Hou and Shanlei Mu and Wayne Xin Zhao and Yaliang Li and Bolin Ding and Ji-Rong Wen},
  title = {Towards Universal Sequence Representation Learning for Recommender Systems},
  booktitle = {{KDD}},
  year = {2022}
}


@inproceedings{zhao2021recbole,
  title={Recbole: Towards a unified, comprehensive and efficient framework for recommendation algorithms},
  author={Wayne Xin Zhao and Shanlei Mu and Yupeng Hou and Zihan Lin and Kaiyuan Li and Yushuo Chen and Yujie Lu and Hui Wang and Changxin Tian and Xingyu Pan and Yingqian Min and Zhichao Feng and Xinyan Fan and Xu Chen and Pengfei Wang and Wendi Ji and Yaliang Li and Xiaoling Wang and Ji-Rong Wen},
  booktitle={{CIKM}},
  year={2021}
}

Special thanks to @Juyong Jiang for the excellent DDP implementation (#961).
