
Keras TCN

Keras Temporal Convolutional Network. [paper]

Tested with TensorFlow 2.9, 2.10, 2.11, 2.12, 2.13, 2.14 and 2.15 (Nov 17, 2023).


pip install keras-tcn
pip install keras-tcn --no-dependencies  # without the dependencies if you already have TF/Numpy.

For macOS M1 users: pip install --no-binary keras-tcn keras-tcn. The --no-binary option forces pip to download the sources (tar.gz) and re-compile them locally. Also make sure that grpcio and h5py are installed correctly. There are tutorials on how to do that online.

Why TCN (Temporal Convolutional Network) instead of LSTM/GRU?

  • TCNs exhibit longer memory than recurrent architectures with the same capacity.
  • They perform better than LSTM/GRU on long time series (Seq. MNIST, Adding Problem, Copy Memory, word-level PTB...).
  • They offer parallelism (convolutional layers), a flexible receptive field size (how far the model can see), and stable gradients (unlike backpropagation through time, which suffers from vanishing gradients).

Visualization of a stack of dilated causal convolutional layers (Wavenet, 2016)

TCN Layer

TCN Class

TCN(
    nb_filters=64,
    kernel_size=3,
    nb_stacks=1,
    dilations=(1, 2, 4, 8, 16, 32),
    padding='causal',
    use_skip_connections=True,
    dropout_rate=0.0,
    return_sequences=False,
    activation='relu',
    kernel_initializer='he_normal',
    use_batch_norm=False,
    use_layer_norm=False,
    use_weight_norm=False,
    go_backwards=False,
    return_state=False,
    **kwargs
)

Arguments

  • nb_filters: Integer. The number of filters to use in the convolutional layers. Similar to units for LSTM. Can be a list.
  • kernel_size: Integer. The size of the kernel to use in each convolutional layer.
  • dilations: List/Tuple. A dilation list. Example is: [1, 2, 4, 8, 16, 32, 64].
  • nb_stacks: Integer. The number of stacks of residual blocks to use.
  • padding: String. The padding to use in the convolutions. 'causal' for a causal network (as in the original implementation) and 'same' for a non-causal network.
  • use_skip_connections: Boolean. If we want to add skip connections from input to each residual block.
  • return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.
  • dropout_rate: Float between 0 and 1. Fraction of the input units to drop.
  • activation: The activation used in the residual blocks o = activation(x + F(x)).
  • kernel_initializer: Initializer for the kernel weights matrix (Conv1D).
  • use_batch_norm: Whether to use batch normalization in the residual layers or not.
  • use_layer_norm: Whether to use layer normalization in the residual layers or not.
  • use_weight_norm: Whether to use weight normalization in the residual layers or not.
  • go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
  • return_state: Boolean. Whether to return the last state in addition to the output. Default: False.
  • kwargs: Any other arguments for configuring the parent class Layer. For example, name=str sets the name of the layer; use unique names when using multiple TCN layers.

Input shape

3D tensor with shape (batch_size, timesteps, input_dim).

timesteps can be None. This can be useful if each sequence is of a different length: Multiple Length Sequence Example.

Output shape

  • if return_sequences=True: 3D tensor with shape (batch_size, timesteps, nb_filters).
  • if return_sequences=False: 2D tensor with shape (batch_size, nb_filters).
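
A minimal sketch of how the layer is typically wired into a model (assuming the package exposes it as from tcn import TCN, as keras-tcn does; the hyperparameters and dummy data are illustrative):

import numpy as np
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense
from tcn import TCN

timesteps, input_dim = 20, 1  # timesteps could also be None for variable lengths

inputs = Input(shape=(timesteps, input_dim))
x = TCN(nb_filters=64, kernel_size=3, dilations=(1, 2, 4, 8))(inputs)  # -> (batch_size, 64)
outputs = Dense(1)(x)  # regression head

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')

# Dummy task: predict the sum of each sequence.
x_train = np.random.rand(256, timesteps, input_dim)
y_train = x_train.sum(axis=1)
model.fit(x_train, y_train, epochs=2, verbose=0)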

How do I choose the correct set of parameters to configure my TCN layer?

Here are some of my notes regarding my experience using TCN:

  • nb_filters: Present in any ConvNet architecture. It is linked to the predictive power of the model and affects the size of your network. The more, the better unless you start to overfit. It's similar to the number of units in an LSTM/GRU architecture too.

  • kernel_size: Controls the size of the temporal window considered in the convolutional ops. Good values are usually between 2 and 8. If you think your sequence heavily depends on t-1 and t-2, but less on the rest, then choose a kernel size of 2 or 3. For NLP tasks, we prefer bigger kernel sizes. A large kernel size will make your network much bigger.

  • dilations: It controls how deep your TCN layer is. Usually, consider a list of powers of two. You can guess how many dilations you need by matching the receptive field (of the TCN) with the length of the features in your sequence. For example, if your input sequence is periodic, you might want to have multiples of that period as dilations.

  • nb_stacks: Not very useful unless your sequences are very long (like waveforms with hundreds of thousands of time steps).

  • padding: I have only used causal, since TCN stands for Temporal Convolutional Network. Causal padding prevents information leakage from the future.

  • use_skip_connections: Skip connections connect layers, similarly to DenseNet. They help the gradients flow. Unless you experience a drop in performance, you should always activate them.

  • return_sequences: Same as the one present in the LSTM layer. Refer to the Keras doc for this parameter.

  • dropout_rate: Similar to recurrent_dropout for the LSTM layer. I usually don't use it much. Or set it to a low value like 0.05.

  • activation: Leave it to default. I have never changed it.

  • kernel_initializer: If the training of the TCN gets stuck, it might be worth changing this parameter. For example: glorot_uniform.

  • use_batch_norm, use_layer_norm, use_weight_norm: Use normalization if your network is big enough and the task contains enough data. I usually prefer use_layer_norm, but you can try them all and see which one works best.

Receptive field

The receptive field is defined as the maximum number of steps back in time from the current sample at time T that a filter (from any block, layer, or stack of the TCN) can hit (the effective history), plus 1. The receptive field of the TCN can be calculated using the formula:

Rfield = 1 + 2·(K − 1)·Nstack·Σi di

where Nstack is the number of stacks, Nb is the number of residual blocks per stack, d is a vector containing the dilations of each of those residual blocks, and K is the kernel size. The 2 is there because there are two Conv1D layers in a single residual block.

Ideally, you want your receptive field to be bigger than the largest input sequence length: if you pass a sequence longer than your receptive field into the model, any extra values (further back in the sequence) are effectively replaced with zeros.
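
For a quick sanity check, the formula is straightforward to evaluate in code. A small helper (hypothetical, not part of the package) that also covers the single-Conv1D variant used by the example figures below:

def receptive_field(kernel_size, dilations, nb_stacks=1, convs_per_block=2):
    # convs_per_block=2 matches the TCN layer (two Conv1D per residual block);
    # use convs_per_block=1 for the single-conv example figures below.
    return 1 + convs_per_block * (kernel_size - 1) * nb_stacks * sum(dilations)

assert receptive_field(2, [1, 2, 4, 8], nb_stacks=1, convs_per_block=1) == 16
assert receptive_field(2, [1, 2, 4, 8], nb_stacks=2, convs_per_block=1) == 31
assert receptive_field(2, [1, 2, 4, 8], nb_stacks=3, convs_per_block=1) == 46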

Examples

NOTE: Unlike the TCN, the example figures only include a single Conv1D per layer, so the formula becomes Rfield = 1 + (K − 1)·Nstack·Σi di (without the factor 2).

  • If a dilated conv net has only one stack of residual blocks with a kernel size of 2 and dilations [1, 2, 4, 8], its receptive field is 16. The image below illustrates it:

ks = 2, dilations = [1, 2, 4, 8], 1 block

  • If a dilated conv net has 2 stacks of residual blocks, you would have the situation below, that is, an increase in the receptive field up to 31:

ks = 2, dilations = [1, 2, 4, 8], 2 blocks

  • If we increased the number of stacks to 3, the size of the receptive field would increase again, such as below:

ks = 2, dilations = [1, 2, 4, 8], 3 blocks

Non-causal TCN

Making the TCN architecture non-causal allows it to take the future into consideration when making its predictions, as shown in the figure below.

However, it is then no longer suitable for real-time applications.

Non-Causal TCN - ks = 3, dilations = [1, 2, 4, 8], 1 block

To use a non-causal TCN, specify padding='valid' or padding='same' when initializing the TCN layers.
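
A minimal sketch (same from tcn import TCN assumption as above):

from tensorflow.keras import Input, Model
from tcn import TCN

inputs = Input(shape=(None, 1))  # variable-length sequences
x = TCN(nb_filters=32, kernel_size=3, dilations=(1, 2, 4, 8),
        padding='same', return_sequences=True)(inputs)  # each step sees past AND future
model = Model(inputs, x)  # output: (batch_size, timesteps, 32)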

Run

Once keras-tcn is installed as a package, you can get a glimpse of what TCNs can do. Several example tasks are available in the repository for this purpose:

cd adding_problem/
python main.py # run adding problem task

cd copy_memory/
python main.py # run copy memory task

cd mnist_pixel/
python main.py # run sequential mnist pixel task

Reproducible results are possible on (NVIDIA) GPUs using the tensorflow-determinism library. It was tested with keras-tcn by @lingdoc.
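
A sketch of such a setup on TensorFlow 2.8+, where op determinism is available in core TF (the tensorflow-determinism project documents the equivalent knobs for older versions):

import tensorflow as tf

tf.keras.utils.set_random_seed(42)              # seeds the Python, NumPy and TF RNGs at once
tf.config.experimental.enable_op_determinism()  # forces deterministic (GPU) kernels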

Tasks

Word PTB

Language modeling remains one of the primary applications of recurrent networks. In this example, we show that TCN can beat LSTM on the WordPTB task, without too much tuning.


TCN vs LSTM (comparable number of weights)

Adding Task

The task consists of feeding a large array of decimal numbers to the network, along with a boolean array of the same length. The objective is to sum the two decimals at the positions where the boolean array contains the two 1s.

Explanation

Adding Problem Task
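
A sketch of how such data can be generated (a hypothetical helper, not the repository's exact script):

import numpy as np

def adding_problem_data(n_samples, seq_len):
    # Channel 0: random decimals in [0, 1]. Channel 1: a mask with exactly
    # two 1s. Target: the sum of the two decimals at the masked positions.
    values = np.random.rand(n_samples, seq_len)
    mask = np.zeros((n_samples, seq_len))
    for i in range(n_samples):
        mask[i, np.random.choice(seq_len, size=2, replace=False)] = 1.0
    x = np.stack([values, mask], axis=-1)           # (n_samples, seq_len, 2)
    y = (values * mask).sum(axis=1, keepdims=True)  # (n_samples, 1)
    return x, y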

Implementation results

782/782 [==============================] - 154s 197ms/step - loss: 0.8437 - val_loss: 0.1883
782/782 [==============================] - 154s 196ms/step - loss: 0.0702 - val_loss: 0.0111
[...]
782/782 [==============================] - 152s 194ms/step - loss: 6.9630e-04 - val_loss: 3.7180e-04

Copy Memory Task

The copy memory task consists of a very large array:

  • At the beginning, there's the vector x of length N. This is the vector to copy.
  • At the end, N+1 9s are present. The first 9 is seen as a delimiter.
  • In the middle, only 0s are there.

The idea is to copy the content of the vector x to the end of the large array. The task is made sufficiently complex by increasing the number of 0s in the middle.

Explanation

Copy Memory Task
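
A sketch of a generator for this structure (hypothetical helper; the digit alphabet and lengths are illustrative):

import numpy as np

def copy_memory_data(n_samples, n=10, t_blank=100):
    # Input:  [x (n random digits in 0..7) | t_blank zeros | n+1 nines],
    # where the first 9 acts as a delimiter. Target: zeros everywhere
    # except the last n steps, which must reproduce x.
    x = np.random.randint(0, 8, size=(n_samples, n))
    blanks = np.zeros((n_samples, t_blank), dtype=int)
    nines = np.full((n_samples, n + 1), 9, dtype=int)
    inputs = np.concatenate([x, blanks, nines], axis=1)
    targets = np.zeros_like(inputs)
    targets[:, -n:] = x
    return inputs, targets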

Implementation results (first epochs)

118/118 [==============================] - 17s 143ms/step - loss: 1.1732 - accuracy: 0.6725 - val_loss: 0.1119 - val_accuracy: 0.9796
[...]
118/118 [==============================] - 15s 125ms/step - loss: 0.0268 - accuracy: 0.9885 - val_loss: 0.0206 - val_accuracy: 0.9908
118/118 [==============================] - 15s 125ms/step - loss: 0.0228 - accuracy: 0.9900 - val_loss: 0.0169 - val_accuracy: 0.9933

Sequential MNIST

Explanation

The idea here is to consider MNIST images as 1-D sequences and feed them to the network. This task is particularly hard because the sequences are 28*28 = 784 elements long. In order to classify correctly, the network has to remember the entire sequence. Usual LSTMs are unable to perform well on this task.

Sequential MNIST
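
A sketch of the pixel-sequence setup (assuming from tcn import TCN; the hyperparameters are illustrative, chosen so that the receptive field, 1 + 2·(6 − 1)·127 = 1271, covers all 784 steps):

import tensorflow as tf
from tcn import TCN

# Flatten each 28x28 image into a 784-step sequence of single pixels.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784, 1).astype('float32') / 255.0

model = tf.keras.Sequential([
    TCN(nb_filters=64, kernel_size=6, dilations=(1, 2, 4, 8, 16, 32, 64),
        input_shape=(784, 1)),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])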

Implementation results

1875/1875 [==============================] - 46s 25ms/step - loss: 0.0949 - accuracy: 0.9706 - val_loss: 0.0763 - val_accuracy: 0.9756
1875/1875 [==============================] - 46s 25ms/step - loss: 0.0831 - accuracy: 0.9743 - val_loss: 0.0656 - val_accuracy: 0.9807
[...]
1875/1875 [==============================] - 46s 25ms/step - loss: 0.0486 - accuracy: 0.9840 - val_loss: 0.0572 - val_accuracy: 0.9832
1875/1875 [==============================] - 46s 25ms/step - loss: 0.0453 - accuracy: 0.9858 - val_loss: 0.0424 - val_accuracy: 0.9862

References

  • Bai, S., Kolter, J. Z., & Koltun, V. (2018). An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. arXiv:1803.01271.
  • van den Oord, A., et al. (2016). WaveNet: A Generative Model for Raw Audio. arXiv:1609.03499.

Citation

@misc{KerasTCN,
  author = {Philippe Remy},
  title = {Temporal Convolutional Networks for Keras},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/philipperemy/keras-tcn}},
}
