  • Stars
    290
  • Rank 142,104 (Top 3 %)
  • Language
    Python
  • License
    MIT License
  • Created over 6 years ago
  • Updated about 3 years ago

Repository Details

Wide and Deep Learning for CTR Prediction in TensorFlow

Overview

A general wide and deep joint learning framework. The deep part can be a simple DNN, a DNN variant (ResDNN, DenseDNN), a MultiDNN, or even a DNN combined with a CNN (DNN-CNN).

Here, we use the wide and deep model to predict click labels. The wide model can memorize feature interactions in data with a large number of features, but it cannot generalize these learned interactions to new data. The deep model generalizes well but cannot learn exceptions within the data. The wide and deep model combines the two, so it can generalize while still learning exceptions.
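As a rough illustration of the joint model described above, the sketch below adds a wide (linear) logit to a deep (one hidden layer) logit and passes the sum through a sigmoid. It is plain Python, not the repository's code; the function names, the feature string `app=news&hour=8`, and all weights are made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def wide_logit(cross_features, weights):
    # Linear part: sum the weights of the active sparse (crossed) features.
    # Unseen features contribute nothing, so this part memorizes but does not generalize.
    return sum(weights.get(f, 0.0) for f in cross_features)

def deep_logit(embedding, hidden_weights, output_weights):
    # Deep part: one ReLU hidden layer over a dense embedding vector.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, embedding)))
              for row in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden))

def predict_click(cross_features, embedding, wide_w, hidden_w, out_w, bias=0.0):
    # Joint model: the two logits are summed before the sigmoid.
    logit = wide_logit(cross_features, wide_w) + deep_logit(embedding, hidden_w, out_w) + bias
    return sigmoid(logit)

# Hypothetical weights and inputs, for illustration only.
wide_w = {"app=news&hour=8": 1.2}
hidden_w = [[1.0, 0.0], [0.0, 1.0]]
out_w = [0.4, 0.4]
embedding = [0.5, -0.25]

p_seen = predict_click(["app=news&hour=8"], embedding, wide_w, hidden_w, out_w)
p_unseen = predict_click(["app=game&hour=3"], embedding, wide_w, hidden_w, out_w)
```

For the unseen cross feature the wide part contributes zero, so the prediction falls back on the deep part alone.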

The code uses the high-level tf.estimator.Estimator API. This API is great for fast iteration and for quickly adapting models to your own datasets without major code overhauls. It lets you move from single-worker training to distributed training, and it makes it easy to export model binaries for prediction.

The input function for the Estimator uses the tf.data.Dataset API, which creates a Dataset object. The Dataset API makes it easy to apply transformations (map, batch, shuffle, etc.) to the data. See the tf.data documentation for more.
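To make the transformations named above concrete, here is a plain-Python sketch of what shuffle, map, and batch do to a list of records. It does not use tf.data; the helper names and the sample lines are hypothetical. It also shuffles the whole list at once, whereas tf.data shuffles through a fixed-size buffer.

```python
import random

def shuffle(records, seed):
    # Return a shuffled copy; a fixed seed keeps the order reproducible.
    records = list(records)
    random.Random(seed).shuffle(records)
    return records

def map_records(records, fn):
    # Apply a parsing/transformation function to every record.
    return [fn(r) for r in records]

def batch(records, batch_size):
    # Group consecutive records; the last batch may be smaller.
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]

def parse_line(line):
    # Hypothetical CSV layout: label first, then categorical features.
    label, *features = line.split(",")
    return {"label": int(label), "features": features}

lines = ["1,news,8", "0,game,21", "0,news,13", "1,video,9", "0,game,2"]
batches = batch(map_records(shuffle(lines, seed=0), parse_line), batch_size=2)
```

Each element of `batches` is a list of parsed records, which is roughly the shape an input function hands to the model one step at a time.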

The code is based on the TensorFlow wide and deep tutorial.

Extensions

  1. Provides very flexible feature configuration and training configuration.
  2. Scales to arbitrarily large training data in production environments.
  3. Supports multi-value feature input (multi-hot).
  4. Supports distributed TensorFlow.
  5. Supports custom DNN networks (arbitrary connections between layers) with flexible options.
  6. Adds a BN layer, activation_fn, L1/L2 regularization, and weight decay/learning rate options for training.
  7. Supports DNN and MultiDNN joint learning, and can even combine them with a CNN.
  8. Supports 3 types of normalization for continuous features.
  9. Supports a weight column for imbalanced samples.
  10. Provides TensorFlow Serving for tf.estimator.
  11. Provides scripts for data preprocessing with PySpark (generating continuous features from category features).
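Two of the items above, multi-hot input and normalization of continuous features, can be sketched in a few lines of plain Python. This is not the repository's implementation. The list does not say which three normalization types are supported, so min-max, z-score, and log scaling are shown here purely as assumed, commonly used choices; all function names are hypothetical.

```python
import math

def multi_hot(values, vocabulary):
    # Encode a multi-value feature as a 0/1 vector over the vocabulary.
    # Values outside the vocabulary are ignored.
    index = {v: i for i, v in enumerate(vocabulary)}
    vec = [0] * len(vocabulary)
    for v in values:
        if v in index:
            vec[index[v]] = 1
    return vec

def min_max(x, lo, hi):
    # Rescale to [0, 1] given the feature's observed range.
    return (x - lo) / (hi - lo) if hi > lo else 0.0

def z_score(x, mean, std):
    # Center and scale by the feature's mean and standard deviation.
    return (x - mean) / std if std > 0 else 0.0

def log_scale(x):
    # Compress long-tailed, non-negative values such as counts.
    return math.log1p(x)

tags = multi_hot(["sports", "tech"], ["news", "sports", "tech", "music"])
```

A multi-hot vector differs from one-hot only in that several positions can be set at once, which suits features such as a user's list of interest tags.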

Running the code

Setup

cd conf
vim feature.yaml
vim model.yaml
vim train.yaml
...

Training

You can run the code locally as follows:

cd python
python train.py

or use shell scripts as follows:

cd scripts
bash train.sh

Testing

python eval.py

or use shell scripts as follows:

bash test.sh

Distributed Training

Run the code on the parameter server (ps) as follows:

cd scripts
bash run_ps.sh
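For context, distributed training with tf.estimator reads the cluster layout from the TF_CONFIG environment variable, a JSON string with a cluster section and a task section. The sketch below builds such a value for a parameter-server task using only the standard library. The host names and ports are placeholders, and whether run_ps.sh sets TF_CONFIG this way is an assumption; check the script itself.

```python
import json
import os

def make_tf_config(cluster, task_type, task_index):
    # TF_CONFIG is a JSON string describing the whole cluster plus
    # the role (type and index) of the current process.
    return json.dumps({
        "cluster": cluster,
        "task": {"type": task_type, "index": task_index},
    })

# Placeholder hosts; replace with the real addresses of your machines.
cluster = {
    "chief": ["host0:2222"],
    "worker": ["host1:2222", "host2:2222"],
    "ps": ["host3:2222"],
}

# This process would act as parameter server number 0.
os.environ["TF_CONFIG"] = make_tf_config(cluster, "ps", 0)
```

Every process in the cluster uses the same cluster section and differs only in its task entry.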

TensorBoard

Run TensorBoard to inspect the graph and the training progress.

tensorboard --logdir=./model/wide_deep
