
Deep Multifaceted Transformers for Multi-objective Ranking in Large-Scale E-commerce Recommender Systems, CIKM 2020

DMT:

DMT_code is the code for the paper "Deep Multifaceted Transformers for Multi-objective Ranking in Large-Scale E-commerce Recommender Systems", published at CIKM 2020.

The proposed framework

Requirements

python==2.7

tensorflow==1.12

Usage

sh run.sh

Datasets:

JD Recsys Dataset.

Statistics:

| Type | Total | Sampled Impressions | Clicks | Orders |
|---|---|---|---|---|
| Train | 667,907,650 | 622,596,211 | 43,876,602 | 1,434,837 |
| Test | 105,444,671 | 98,732,799 | 6,477,409 | 234,463 |

Download link:

The dataset can be downloaded from: https://drive.google.com/drive/folders/1Dnlnnzl2QD2mYP3o0icSxNVrl6nCvlT0?usp=sharing. The files are in the format of TFRecord and can be placed in HDFS for training.

The shared dataset is sampled from the 0.7 billion dataset used in the paper.

Description:

The datasets are used in "Deep Multifaceted Transformers for Multi-objective Ranking in Large-Scale E-commerce Recommender Systems", which is published in CIKM 2020.

In this paper, the two tasks are click prediction and order prediction.

The goal of click prediction is to predict the CTR.

The goal of order prediction is to predict CTVR = CTR * CVR, the probability that an impression is both clicked and ordered. Predicting CTVR instead of CVR directly is meant to avoid the sample selection bias of the CVR task [1].

CTR: Click-Through Rate, CVR: Conversion Rate.

| Type | Click prediction (CTR) | Order prediction (CTVR) |
|---|---|---|
| Impression | 0 | 0 |
| Click | 1 | 0 |
| Order | 1 | 1 |
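
As a worked example of the label scheme above, the sketch below derives the empirical positive rates of the two tasks from the Impressions, Clicks and Orders counts in the Train row of the statistics table. It assumes that the Clicks count covers samples that were clicked but not ordered (the three counts then sum to the Total column). The function and variable names are illustrative and are not taken from DMT_code. Because the shared dataset is sampled, the rates describe the released files rather than production traffic.

```python
def task_rates(impressions, clicks, orders):
    """Return (ctr, cvr, ctvr) for one dataset split.

    Following the label table, an order is a positive for both tasks,
    a click is a positive for click prediction only, and an impression
    is a negative for both.
    """
    total = impressions + clicks + orders
    clicked = clicks + orders      # samples whose click label is 1
    ctr = clicked / total          # share of samples that were clicked
    cvr = orders / clicked         # share of clicked samples that were ordered
    ctvr = orders / total          # share of samples that were clicked and ordered
    return ctr, cvr, ctvr


# Counts from the Train row of the statistics table.
ctr, cvr, ctvr = task_rates(622596211, 43876602, 1434837)
print("CTR  = {:.4f}".format(ctr))
print("CVR  = {:.4f}".format(cvr))
print("CTVR = {:.4f}".format(ctvr))
```

On these counts the product `ctr * cvr` equals `ctvr` up to floating-point rounding, which is the identity CTVR = CTR * CVR stated above.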

Research Topics:

This dataset can be used for research on CTR prediction, CVR prediction, multi-task ranking, sequential modeling and unbiased ranking in Recommender Systems. This dataset should be used for research purposes only!

Citation:

Please cite the following paper if you use the data in any way.

@inproceedings{gu2020dmt,
  title={Deep Multifaceted Transformers for Multi-objective Ranking in Large-Scale E-commerce Recommender Systems},
  author={Gu, Yulong and Ding, Zhuoye and Wang, Shuaiqiang and Zou, Lixin and Liu, Yiding and Yin, Dawei},
  booktitle={CIKM'20},
  year={2020}
}

File Description:

The dataset files are in the format of TFRecord.

Each record in the TFRecord files contains the ranking features and the label of one sample in the Recommender System.

The ranking features contain 615 dense features and some id features.

The labels are impressions (labels=0), clicks (labels=1 or 2) and orders (labels=4 or 5).
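
The raw label values can be combined with the task label table from the Description section to obtain one binary label per task. The following is a minimal sketch under that reading, not the preprocessing used in DMT_code. The names `to_task_labels`, `CLICK_LABELS` and so on are hypothetical. This README does not say how labels 1 and 2 (or 4 and 5) differ, so the sketch treats them alike and rejects any undocumented value.

```python
# Raw label values as listed in the file description.
IMPRESSION_LABELS = {0}
CLICK_LABELS = {1, 2}
ORDER_LABELS = {4, 5}


def to_task_labels(raw_label):
    """Map a raw label to (click_label, order_label)."""
    if raw_label in ORDER_LABELS:
        return 1, 1   # an order counts as a click and as an order
    if raw_label in CLICK_LABELS:
        return 1, 0   # clicked but not ordered
    if raw_label in IMPRESSION_LABELS:
        return 0, 0   # exposed only
    # Values such as 3 are not documented, so refuse to guess.
    raise ValueError("undocumented label value: {!r}".format(raw_label))


print([to_task_labels(v) for v in (0, 1, 2, 4, 5)])
# prints [(0, 0), (1, 0), (1, 0), (1, 1), (1, 1)]
```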

| Feature type | Name | Description |
|---|---|---|
| Dense features | features | 615-dimensional dense features. They contain item profile features (e.g., number of clicks, CTR, CVR, rating), user profile features (e.g., preferred categories and brands, purchase power), user-item matching features (e.g., whether the item matches the user’s gender or age) and user-item interaction features (e.g., number of clicks on the category of the item within a time window). |
| Categorical features | item_fea_sku | id of the product |
| | item_c2 | second level category id of the product |
| | item_c3 | third level category id of the product |
| | item_brand | brand id of the product |
| | item_shop | shop id of the product |
| clk_seq | clk_seq_sku_7d_50 | sequence of ids of the products in the click sequence (latest 50 clicks in recent 7 days) |
| | clk_seq_ts_7d_50 | sequence of timestamps in the click sequence (latest 50 clicks in recent 7 days) |
| | clk_seq_c2_7d_50 | sequence of second level category ids in the click sequence (latest 50 clicks in recent 7 days) |
| | clk_seq_c3_7d_50 | sequence of third level category ids in the click sequence (latest 50 clicks in recent 7 days) |
| | clk_seq_brand_7d_50 | sequence of brand ids in the click sequence (latest 50 clicks in recent 7 days) |
| | clk_seq_shop_7d_50 | sequence of shop ids in the click sequence (latest 50 clicks in recent 7 days) |
| ord_seq | ord_seq_sku_12m_50 | sequence of ids of the products in the purchase sequence (latest 50 orders in recent 12 months) |
| | ord_seq_ts_12m_50 | sequence of timestamps in the purchase sequence (latest 50 orders in recent 12 months) |
| | ord_seq_c2_12m_50 | sequence of second level category ids in the purchase sequence (latest 50 orders in recent 12 months) |
| | ord_seq_c3_12m_50 | sequence of third level category ids in the purchase sequence (latest 50 orders in recent 12 months) |
| | ord_seq_brand_12m_50 | sequence of brand ids in the purchase sequence (latest 50 orders in recent 12 months) |
| | ord_seq_shop_12m_50 | sequence of shop ids in the purchase sequence (latest 50 orders in recent 12 months) |
| cart_seq | cart_seq_sku_12m_10 | sequence of ids of the products in the cart sequence (latest 10 carts in recent 12 months) |
| | cart_seq_ts_12m_10 | sequence of timestamps in the cart sequence (latest 10 carts in recent 12 months) |
| | cart_seq_c2_12m_10 | sequence of second level category ids in the cart sequence (latest 10 carts in recent 12 months) |
| | cart_seq_c3_12m_10 | sequence of third level category ids in the cart sequence (latest 10 carts in recent 12 months) |
| | cart_seq_brand_12m_10 | sequence of brand ids in the cart sequence (latest 10 carts in recent 12 months) |
| | cart_seq_shop_12m_10 | sequence of shop ids in the cart sequence (latest 10 carts in recent 12 months) |
| Bias features | near_expo_seq_c2 | sequence of second level category ids of the products exposed next to the product |
| | near_expo_seq_c3 | sequence of third level category ids of the products exposed next to the product |
| | page | number of the page on which the product is exposed |
| | position | position of the product within the page |
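
The sequence feature names above follow a regular pattern of behavior prefix, item field, time window and maximum length. The sketch below rebuilds the 18 names from that pattern, which may be convenient when writing a parsing specification for the TFRecord files. The helper `sequence_feature_names` and the two constants are hypothetical, the maximum lengths are read off the descriptions ("latest 50 clicks", "latest 10 carts"), and the stored data types are not documented here, so none are assumed.

```python
# Item-side fields shared by every behavior sequence in the table.
SEQ_FIELDS = ("sku", "ts", "c2", "c3", "brand", "shop")

# behavior prefix -> (time window, maximum sequence length)
SEQ_WINDOWS = {
    "clk_seq": ("7d", 50),
    "ord_seq": ("12m", 50),
    "cart_seq": ("12m", 10),
}


def sequence_feature_names():
    """Return {feature_name: max_length} for all behavior sequences."""
    names = {}
    for prefix, (window, max_len) in sorted(SEQ_WINDOWS.items()):
        for field in SEQ_FIELDS:
            name = "{}_{}_{}_{}".format(prefix, field, window, max_len)
            names[name] = max_len
    return names


names = sequence_feature_names()
print(len(names))  # prints 18
print(names["cart_seq_brand_12m_10"])  # prints 10
```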

References:

[1] Ma, Xiao, Liqin Zhao, Guan Huang, Zhi Wang, Zelin Hu, Xiaoqiang Zhu, and Kun Gai. "Entire space multi-task model: An effective approach for estimating post-click conversion rate." SIGIR'2018.
