Keras-Classification-Models
A set of models which allow easy creation of Keras models to be used for classification purposes. Also contains modules which offer implementations of recent papers.
NOTE
Since this readme is getting very large, I will post most of these projects on titu1994.github.io
Image Classification Models
Keras Octave Convolutions
Keras implementation of the Octave Convolution blocks from the paper Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution.
Sparse Neural Networks (SparseNets) in Keras
An implementation of "SparseNets" from the paper Sparsely Connected Convolutional Networks in Keras 2.0+.
SparseNets are a modification of DenseNet and its dense connectivity pattern to reduce memory requirements drastically while still having similar or better performance.
Non-Local Neural Networks in Keras
Keras implementation of Non-local blocks from the paper "Non-local Neural Networks".
- Support for "Gaussian", "Embedded Gaussian" and "Dot" instantiations of the Non-Local block.
- Support for shielded computation mode (reduces computation by 4x)
- The "Concatenation" instantiation will be supported when the authors release their code.
Available at : Non-Local Neural Networks in Keras
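To make the difference between the "Gaussian" and "Dot" instantiations concrete, here is a minimal pure-Python sketch of the pairwise weighting they use. It assumes identity embeddings and omits the output projection, so it is an illustration only and not the code from the repository; the function name non_local is hypothetical. The "Embedded Gaussian" variant would additionally pass the inputs through learned embeddings before computing the scores.

```python
import math


def non_local(x, mode="gaussian"):
    """Apply a simplified non-local operation to a list of feature vectors.

    Assumptions: the embeddings theta, phi and g are the identity and the
    output projection is omitted, so only the pairwise weighting is shown.
    """
    n = len(x)
    out = []
    for xi in x:
        # Pairwise similarity between position i and every position j.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in x]
        if mode == "gaussian":
            # f = exp(x_i . x_j), normalised by the sum over j (a softmax).
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            weights = [e / total for e in exps]
        elif mode == "dot":
            # f = x_i . x_j, normalised by the number of positions.
            weights = [s / n for s in scores]
        else:
            raise ValueError("mode must be 'gaussian' or 'dot'")
        # Weighted sum of all positions, then the residual connection.
        y = [sum(w * xj[c] for w, xj in zip(weights, x)) for c in range(len(xi))]
        out.append([a + b for a, b in zip(xi, y)])
    return out


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(non_local(features, mode="gaussian"))
print(non_local(features, mode="dot"))
```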
Neural Architecture Search Net (NASNet) in Keras
An implementation of "NASNet" models from the paper Learning Transferable Architectures for Scalable Image Recognition in Keras 2.0+.
Supports building NASNet Large (6 @ 4032), NASNet Mobile (4 @ 1056) and custom NASNets.
Available at : Neural Architecture Search Net (NASNet) in Keras
Squeeze and Excite Networks in Keras
Implementation of Squeeze and Excite networks in Keras. Supports ResNet and Inception v3 models currently. Support for Inception v4 and Inception-ResNet-v2 will also come once the paper comes out.
Available at : Squeeze and Excite Networks in Keras
Dual Path Networks in Keras
Implementation of Dual Path Networks, which combine the grouped convolutions of ResNeXt with the dense connections of DenseNet into two paths.
Available at : Dual Path Networks in Keras
MobileNets in Keras
Implementation of MobileNet models from the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications in Keras 2.0+.
Contains code for building the MobileNet model (optimized for datasets similar to ImageNet) and weights for the model trained on ImageNet.
Also contains MobileNet V2 model implementations + weights.
Available at : MobileNets in Keras
ResNeXt in Keras
Implementation of ResNeXt models from the paper Aggregated Residual Transformations for Deep Neural Networks in Keras 2.0+.
Contains code for building the general ResNeXt model (optimized for datasets similar to CIFAR) and ResNeXtImageNet (optimized for the ImageNet dataset).
Available at : ResNeXt in Keras
Inception v4 in Keras
Implementations of the Inception-v4, Inception-ResNet-v1 and Inception-ResNet-v2 architectures in Keras using the Functional API. The paper on these architectures is available at "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning".
The models are plotted and shown in the architecture sub folder. Due to lack of suitable training data (ILSVRC 2015 dataset) and limited GPU processing power, the weights are not provided.
Contains : Inception v4, Inception-ResNet-v1 and Inception-ResNet-v2
Available at : Inception v4 in Keras
Wide Residual Networks in Keras
Implementation of Wide Residual Networks from the paper Wide Residual Networks
Usage
It can be used by importing the wide_residual_network script and calling the create_wide_residual_network() function. Several parameters can be changed to increase the depth or width of the network.
Note that the number of layers can be calculated by the formula : nb_layers = 4 + 6 * N
from keras.layers import Input
from keras.models import Model
import wide_residual_network as wrn
ip = Input(shape=(3, 32, 32)) # For CIFAR 10 (channels-first ordering)
wrn_28_10 = wrn.create_wide_residual_network(ip, nb_classes=10, N=4, k=10, dropout=0.0, verbose=1)
model = Model(ip, wrn_28_10)
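The depth formula above can be checked with a small helper. This is only a sketch: wrn_depth and wrn_blocks_for_depth are hypothetical names used for illustration and are not part of the script.

```python
def wrn_depth(N):
    """Total layer count of a WRN built with the given N: 4 + 6 * N."""
    if N < 1:
        raise ValueError("N must be a positive integer")
    return 4 + 6 * N


def wrn_blocks_for_depth(depth):
    """Invert the formula: the N that yields the requested depth."""
    if depth < 10 or (depth - 4) % 6 != 0:
        raise ValueError("depth must be of the form 4 + 6 * N with N >= 1")
    return (depth - 4) // 6


print(wrn_depth(4))              # 28, the depth of WRN-28-10
print(wrn_blocks_for_depth(16))  # 2, the N for a WRN-16-k model
```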
Contains weights for WRN-16-8 and WRN-28-8 models trained on the CIFAR-10 Dataset.
Available at : Wide Residual Network in Keras
DenseNet in Keras
Implementation of DenseNet from the paper Densely Connected Convolutional Networks.
Usage
- Run the cifar10.py script to train the DenseNet 40 model
- Comment out the model.fit_generator(...) line and uncomment the model.load_weights("weights/DenseNet-40-12-CIFAR10.h5") line to test the classification accuracy.
Contains weights for DenseNet-40-12 and DenseNet-Fast-40-12, trained on CIFAR 10.
Available at : DenseNet in Keras
Residual Networks of Residual Networks in Keras
Implementation of the paper "Residual Networks of Residual Networks: Multilevel Residual Networks"
Usage
To create RoR ResNet models, use the ror.py script :
from keras import backend as K
import ror
input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)
model = ror.create_residual_of_residual(input_dim, nb_classes=100, N=2, dropout=0.0) # creates RoR-3-110 (ResNet)
To create RoR Wide Residual Network models, use the ror_wrn.py script :
from keras import backend as K
import ror_wrn as ror
input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)
model = ror.create_pre_residual_of_residual(input_dim, nb_classes=100, N=6, k=2, dropout=0.0) # creates RoR-3-WRN-40-2 (WRN)
Contains weights for RoR-3-WRN-40-2 trained on CIFAR 10
Available at : Residual Networks of Residual Networks in Keras
Neural Architecture Search
Sequential Halving and Classification
PySHAC is a Python library that makes it easy to use the Sequential Halving and Classification algorithm from the paper Parallel Architecture and Hyperparameter Search via Successive Halving and Classification.
Available at : Sequential Halving and Classification
Documentation available at : PySHAC Documentation
Progressive Neural Architecture Search in Keras
Basic implementation of Encoder RNN from the paper "Progressive Neural Architecture Search" (https://arxiv.org/abs/1712.00559), which is an improvement over the original Neural Architecture Search paper since it requires far less time and resources.
- Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Encoder RNN.
- Define a state space by using StateSpace, a manager which adds states and handles communication between the Encoder RNN and the user. Submit custom operations and parse locally as required.
- Encoder RNN trained using a modified Sequential Model Based Optimization algorithm from the paper. I made some stability modifications to prevent extreme variance during training from causing training to fail.
- NetworkManager handles the training and reward computation of a Keras model
Available at : Progressive Neural Architecture Search in Keras
Neural Architecture Search in Keras
Basic implementation of Controller RNN from the papers "Neural Architecture Search with Reinforcement Learning" and "Learning Transferable Architectures for Scalable Image Recognition".
- Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Controller RNN.
- Define a state space by using StateSpace, a manager which adds states and handles communication between the Controller RNN and the user.
- Reinforce manages the training and evaluation of the Controller RNN
- NetworkManager handles the training and reward computation of a Keras model
Available at : Neural Architecture Search in Keras
Keras Segmentation Models
A set of models which allow easy creation of Keras models to be used for segmentation tasks.
Fully Connected DenseNets for Semantic Segmentation
Implementation of the paper The One Hundred Layers Tiramisu : Fully Convolutional DenseNets for Semantic Segmentation
Usage
Simply import the densenet_fc.py script and call the create method:
import densenet_fc as dc
model = dc.create_fc_dense_net(img_dim=(3, 224, 224), nb_dense_block=5, growth_rate=12,
                               nb_filter=16, nb_layers=4)
Keras Recurrent Neural Networks
A set of scripts which can be used to add custom Recurrent Neural Networks to Keras.
Neural Arithmetic Logic Units
A Keras implementation of the Neural Arithmetic Logic Unit from the paper Neural Arithmetic Logic Units by Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom.
- Contains the layers for the Neural Arithmetic Logic Unit (NALU) and the Neural Accumulator (NAC).
- Also contains the results of the static function learning toy tests.
Chrono Initializer, Chrono LSTM and JANET
Keras implementation of JANET from the paper The unreasonable effectiveness of the forget gate, and of the Chrono initializer and Chrono LSTM from the paper Can Recurrent Neural Networks Warp Time?.
The JANET model uses just 2 gates, forget (f) and context (c), out of the 4 gates in a regular LSTM RNN, and uses Chrono Initialization to achieve better performance than regular LSTMs while using fewer parameters and a less complicated gating structure.
Usage
Simply import the janet.py file into your repo and use the JANET layer.
It is not advisable to use the JANETCell directly wrapped around a RNN layer, as this will not allow the max timesteps calculation that is needed for proper training using the Chrono Initializer for the forget gate.
The chrono_lstm.py script contains the ChronoLSTM model, as it requires minimal modifications to the original LSTM layer to use the ChronoInitializer for the forget and input gates.
The same usage restrictions apply as for the JANET layer: use the ChronoLSTM layer directly instead of the ChronoLSTMCell wrapped around a RNN layer.
from janet import JANET
from chrono_lstm import ChronoLSTM
...
To use just the ChronoInitializer, import the chrono_initializer.py script.
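The max timesteps value matters because Chrono initialization, as described in Can Recurrent Neural Networks Warp Time?, draws the forget gate biases from log(U(1, T_max - 1)) and sets the input gate biases to their negatives. The following pure-Python sketch shows that rule only; chrono_biases is a hypothetical helper for illustration and is not the ChronoInitializer class from this repository, which may differ in details.

```python
import math
import random


def chrono_biases(units, max_timesteps, seed=None):
    """Sample Chrono-style gate biases for a layer with the given unit count.

    Returns (forget_bias, input_bias). JANET would only use the forget
    bias; ChronoLSTM uses both.
    """
    if max_timesteps < 2:
        raise ValueError("max_timesteps must be at least 2")
    rng = random.Random(seed)
    # b_f ~ log(U(1, T_max - 1)): units are spread over many time scales.
    forget_bias = [
        math.log(rng.uniform(1.0, max_timesteps - 1.0)) for _ in range(units)
    ]
    # b_i = -b_f, so the input gate mirrors the forget gate.
    input_bias = [-b for b in forget_bias]
    return forget_bias, input_bias


forget_bias, input_bias = chrono_biases(units=4, max_timesteps=784, seed=0)
print(forget_bias)
print(input_bias)
```

Without the sequence length, the upper end of the uniform range is unknown, which is why the layer needs to see the full input shape rather than being used as a bare cell.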
Independently Recurrent Neural Networks (IndRNN)
Implementation of the paper Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN for Keras 2.0+. IndRNN is a recurrent unit that can run over extremely long time sequences and is able to learn the addition problem over 5000 timesteps, where most other models fail.
Usage
Usage of IndRNNCells
from ind_rnn import IndRNNCell, RNN
cells = [IndRNNCell(128), IndRNNCell(128)]
ip = Input(...)
x = RNN(cells)(ip)
...
Usage of IndRNN layer
from ind_rnn import IndRNN
ip = Input(...)
x = IndRNN(128)(ip)
...
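For intuition about what the layer computes, here is a pure-Python sketch of the IndRNN recurrence h_t = relu(W x_t + u * h_(t-1) + b), where the recurrent weight u is a vector so that each unit only sees its own previous state. ind_rnn_forward is a hypothetical name and this is not the code from ind_rnn.py.

```python
def ind_rnn_forward(inputs, W, u, b):
    """Run the IndRNN recurrence over a sequence.

    inputs: list of input vectors, one per timestep
    W:      input weights, one row per unit
    u:      recurrent weights, ONE scalar per unit (element-wise recurrence)
    b:      biases, one per unit
    """
    units = len(u)
    h = [0.0] * units
    states = []
    for x_t in inputs:
        # Each unit n combines the projected input with its own past state.
        h = [
            max(0.0, sum(w * x for w, x in zip(W[n], x_t)) + u[n] * h[n] + b[n])
            for n in range(units)
        ]
        states.append(h)
    return states


states = ind_rnn_forward(
    inputs=[[1.0], [1.0], [1.0]],
    W=[[1.0], [1.0]],
    u=[0.5, 1.0],
    b=[0.0, 0.0],
)
print(states[-1])  # [1.75, 3.0]
```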
Simple Recurrent Unit (SRU)
Implementation of the paper Training RNNs as Fast as CNNs for Keras 2.0+. SRU is a recurrent unit that can run over 10 times faster than cuDNN LSTM, without loss of accuracy tested on many tasks, when implemented with a custom CUDA kernel.
This is a naive implementation with some speed gains over the generic LSTM cells; however, its speed is not yet 10x that of cuDNN LSTMs.
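The source of the potential speed-up can be seen in a small sketch. It follows my reading of the formulation in the first version of the paper and assumes the input size equals the number of units (needed for the highway term); sru_forward and the helpers are hypothetical names, not the repository's layer. All matrix products depend only on the inputs, so they are computed for every timestep before the sequential loop, which then contains only element-wise operations.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sru_forward(inputs, W, W_f, b_f, W_r, b_r):
    """Simplified SRU over a sequence (input size == number of units)."""
    # Step 1: input-only projections, computable for all timesteps at once.
    x_tilde = [matvec(W, x) for x in inputs]
    f = [[sigmoid(a + b) for a, b in zip(matvec(W_f, x), b_f)] for x in inputs]
    r = [[sigmoid(a + b) for a, b in zip(matvec(W_r, x), b_r)] for x in inputs]
    # Step 2: the recurrence itself is purely element-wise.
    c = [0.0] * len(b_f)
    outputs = []
    for x, xt, ft, rt in zip(inputs, x_tilde, f, r):
        c = [fi * ci + (1.0 - fi) * xi for fi, ci, xi in zip(ft, c, xt)]
        h = [ri * math.tanh(ci) + (1.0 - ri) * xv for ri, ci, xv in zip(rt, c, x)]
        outputs.append(h)
    return outputs


print(sru_forward([[1.0]], W=[[2.0]], W_f=[[0.0]], b_f=[0.0], W_r=[[0.0]], b_r=[0.0]))
```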
Multiplicative LSTM
Implementation of the paper Multiplicative LSTM for sequence modelling for Keras 2.0+. Multiplicative LSTMs have been shown to achieve state-of-the-art or close to SotA results for sequence modelling datasets. They also perform better than stacked LSTM models for the Hutter-prize dataset and the raw Wikipedia dataset.
Usage
Add the multiplicative_lstm.py script into your repository, and import the MultiplicativeLSTM layer.
E.g. you can replace Keras LSTM layers with MultiplicativeLSTM layers.
from multiplicative_lstm import MultiplicativeLSTM
Minimal RNN
Implementation of the paper MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks for Keras 2.0+. Minimal RNNs are a new recurrent neural network architecture that achieves performance comparable to the popular gated RNNs with a simplified structure. It employs minimal updates within the RNN, which not only leads to efficient learning and testing but, more importantly, better interpretability and trainability.
Usage
Import minimal_rnn.py and use either the MinimalRNNCell or MinimalRNN layer
from minimal_rnn import MinimalRNN
# this imports the layer rather than the cell
ip = Input(...) # Rank 3 input shape
x = MinimalRNN(units=128)(ip)
...
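The "minimal updates" can be sketched in pure Python. The input is first mapped to a latent vector z_t = tanh(W_x x_t + b_z), a single update gate u_t = sigmoid(U_h h_(t-1) + U_z z_t + b_u) is computed, and the new state is h_t = u_t * h_(t-1) + (1 - u_t) * z_t. minimal_rnn_forward and its helpers are hypothetical names for illustration, not the code in minimal_rnn.py.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def minimal_rnn_forward(inputs, W_x, b_z, U_h, U_z, b_u):
    """Run the MinimalRNN update over a sequence of input vectors."""
    h = [0.0] * len(b_u)
    states = []
    for x_t in inputs:
        # Map the input into the latent space.
        z = [math.tanh(a + b) for a, b in zip(matvec(W_x, x_t), b_z)]
        # Single update gate computed from the previous state and z.
        pre = [a + b + c for a, b, c in zip(matvec(U_h, h), matvec(U_z, z), b_u)]
        u = [sigmoid(p) for p in pre]
        # Convex combination of the old state and the new latent input.
        h = [ui * hi + (1.0 - ui) * zi for ui, hi, zi in zip(u, h, z)]
        states.append(h)
    return states


print(minimal_rnn_forward(
    [[1.0], [1.0]], W_x=[[1.0]], b_z=[0.0], U_h=[[0.0]], U_z=[[0.0]], b_u=[0.0]
))
```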
Nested LSTM
Implementation of the paper Nested LSTMs for Keras 2.0+. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. According to the paper, Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters on various character-level language modeling tasks, and the inner memories of an NLSTM learn longer term dependencies compared with the higher-level units of a stacked LSTM.
Usage
from nested_lstm import NestedLSTM
ip = Input(shape=(nb_timesteps, input_dim))
x = NestedLSTM(units=64, depth=2)(ip)
...
Keras Modules
A set of scripts which can be used to add advanced functionality to Keras.
Switchable Normalization for Keras
Switchable Normalization is a normalization technique that is able to learn different normalization operations for different normalization layers in a deep neural network in an end-to-end manner.
Keras port of the implementation of the paper Differentiable Learning-to-Normalize via Switchable Normalization.
Code ported from the switchnorm official repository.
Note
This only implements the moving average version of the batch normalization component from the paper. The batch average technique cannot be easily implemented in Keras as a layer, and therefore it is not supported.
Usage
Simply import switchnorm.py and replace BatchNormalization layer with this layer.
from switchnorm import SwitchNormalization
ip = Input(...)
...
x = SwitchNormalization(axis=-1)(x)
...
Group Normalization for Keras
A Keras implementation of Group Normalization by Yuxin Wu and Kaiming He.
Useful for fine-tuning large models on smaller batch sizes than in a research setting (where the batch size is very large due to multiple GPUs). Similar to Batch Renormalization, but performs significantly better on ImageNet.
GN is independent of batch size, which is crucial for fine-tuning large models that cannot be retrained with small batch sizes, because Batch Normalization depends on large batch sizes to compute the statistics of each batch and update its moving average parameters properly.
Usage
Drop-in replacement for BatchNormalization layers from Keras. The important parameter that differs from BatchNormalization is called groups. It must be set appropriately, subject to the following constraints :
- It needs to be an integer by which the number of channels is divisible.
- 1 <= G <= #channels, where #channels is the number of channels in the incoming layer.
from group_norm import GroupNormalization
ip = Input(shape=(...))
x = GroupNormalization(groups=32, axis=-1)(ip)  # the layer must be called on a tensor
...
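The constraints on groups, and the reason the result does not depend on batch size, can be illustrated with a pure-Python sketch that normalizes a single sample. check_groups and group_norm are hypothetical helpers, the learned scale and offset are omitted, and this is not the code from group_norm.py.

```python
import math


def check_groups(groups, channels):
    """Validate the constraints listed above."""
    if not isinstance(groups, int) or not 1 <= groups <= channels:
        raise ValueError("groups must be an integer with 1 <= G <= #channels")
    if channels % groups != 0:
        raise ValueError("#channels must be divisible by groups")


def group_norm(x, groups, eps=1e-5):
    """Normalize one sample, given as a list of channels of spatial values.

    Statistics are computed per group of channels within this sample only,
    so no other sample in the batch is involved.
    """
    channels = len(x)
    check_groups(groups, channels)
    size = channels // groups
    out = []
    for g in range(groups):
        block = x[g * size:(g + 1) * size]
        values = [v for channel in block for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in channel] for channel in block])
    return out


sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
print(group_norm(sample, groups=2))
```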
Normalized Optimizers for Keras
Keras wrapper class for Normalized Gradient Descent from kmkolasinski/max-normed-optimizer, which can be applied to almost all Keras optimizers.
Partially implements Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network for all base Keras optimizers, and allows flexibility to choose any normalizing function. However, it does not implement adaptive learning rates.
Usage
from keras.optimizers import Adam, SGD
from optimizer import NormalizedOptimizer
sgd = SGD(0.01, momentum=0.9, nesterov=True)
sgd = NormalizedOptimizer(sgd, normalization='l2')
adam = Adam(0.001)
adam = NormalizedOptimizer(adam, normalization='l2')
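As a rough picture of what gradient normalization does, the sketch below divides the gradient of each weight block by its L2 norm before a plain SGD step, so the step length depends on the learning rate and not on the gradient magnitude. The function names are hypothetical, and whether this matches the wrapper's 'l2' option exactly is an assumption.

```python
import math


def l2_normalize(grad, eps=1e-7):
    """Scale a gradient block to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / (norm + eps) for g in grad]


def normalized_sgd_step(params, grads, lr=0.01):
    """One SGD step where every block's gradient is normalized first.

    params, grads: lists of blocks, one flat list of floats per weight tensor.
    """
    updated = []
    for p, g in zip(params, grads):
        g_hat = l2_normalize(g)
        updated.append([pi - lr * gi for pi, gi in zip(p, g_hat)])
    return updated


params = [[0.0, 0.0], [1.0]]
grads = [[3.0, 4.0], [250.0]]
print(normalized_sgd_step(params, grads, lr=0.1))
```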
Tensorflow Eager with Keras APIs
A set of example notebooks and scripts which detail the usage and pitfalls of Eager Execution Mode in Tensorflow using Keras high level APIs.
One Cycle Learning Rate Policy for Keras
Implementation of One-Cycle Learning rate policy from the papers by Leslie N. Smith.
- A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay
- Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates
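A rough sketch of the shape of a one-cycle schedule follows: the learning rate rises linearly from a base value to the maximum, falls back linearly, and then decays further over a short final phase. The function name, the parameter names and the default fractions are all assumptions chosen for illustration, not the values used by the callback in this repository; momentum, which moves in the opposite direction, is left out.

```python
def one_cycle_lr(step, total_steps, max_lr, div_factor=10.0,
                 end_fraction=0.1, final_div=100.0):
    """Piecewise-linear one-cycle learning rate for a 0-indexed step."""
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    base_lr = max_lr / div_factor
    cycle_steps = round(total_steps * (1.0 - end_fraction))
    half = cycle_steps / 2.0
    if step < half:
        # Warm-up: base_lr -> max_lr.
        return base_lr + (max_lr - base_lr) * step / half
    if step < cycle_steps:
        # Cool-down: max_lr -> base_lr.
        return max_lr - (max_lr - base_lr) * (step - half) / half
    # Final phase: decay from base_lr towards base_lr / final_div.
    tail = total_steps - cycle_steps
    frac = (step - cycle_steps) / tail
    return base_lr - (base_lr - base_lr / final_div) * frac


schedule = [one_cycle_lr(s, total_steps=100, max_lr=1.0) for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```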
Batch Renormalization
Batch Renormalization algorithm implementation in Keras 1.2.1. Original paper by Sergey Ioffe, Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models.
Usage
Add the batch_renorm.py script into your repository, and import the BatchRenormalization layer.
E.g. you can replace Keras BatchNormalization layers with BatchRenormalization layers.
from batch_renorm import BatchRenormalization
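The core of the algorithm is the correction applied during training: the batch-normalized value is rescaled by r = clip(sigma_B / sigma, 1 / r_max, r_max) and shifted by d = clip((mu_B - mu) / sigma, -d_max, d_max), where mu and sigma are the moving statistics. Below is a pure-Python sketch for a single feature; batch_renorm is a hypothetical function, the learned scale and offset and the moving-average updates are omitted, and the default limits are arbitrary example values.

```python
import math


def clip(value, low, high):
    return max(low, min(high, value))


def batch_renorm(batch, moving_mean, moving_std, r_max=3.0, d_max=5.0, eps=1e-5):
    """Training-time Batch Renormalization of one feature over a mini-batch."""
    n = len(batch)
    mu_b = sum(batch) / n
    sigma_b = math.sqrt(sum((v - mu_b) ** 2 for v in batch) / n + eps)
    # Correction terms; in a real layer they are treated as constants
    # when back-propagating.
    r = clip(sigma_b / moving_std, 1.0 / r_max, r_max)
    d = clip((mu_b - moving_mean) / moving_std, -d_max, d_max)
    return [((v - mu_b) / sigma_b) * r + d for v in batch]


# While r and d are not clipped, the result equals normalizing with the
# moving statistics, regardless of the batch statistics.
print(batch_renorm([1.0, 2.0, 3.0], moving_mean=0.0, moving_std=1.0))
```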
Snapshot Ensembles in Keras
Implementation of the paper Snapshot Ensembles
Usage
The technique is simple to implement in Keras using a custom callback. These callbacks can be built using the SnapshotCallbackBuilder class in snapshot.py. Other models can use this callback builder to train in a similar manner.
- Download the 6 WRN-16-4 weights that are provided in the Release tab of the project and place them in the weights directory
- Run the train_cifar_10.py script to train the WRN-16-4 model on CIFAR-10 dataset (not required since weights are provided)
- Run the predict_cifar_10.py script to make an ensemble prediction.
Contains weights for WRN-CIFAR100-16-4 and WRN-CIFAR10-16-4 (snapshot ensemble weights - ranging from 1-5 and including single best model)
Available at : Snapshot Ensembles in Keras
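The learning rate schedule behind the technique is the cyclic cosine annealing from the paper: the rate restarts at its initial value at the start of every cycle and decays towards zero, and a snapshot is saved at the end of each cycle. The sketch below illustrates the formula only; snapshot_lr and is_snapshot_step are hypothetical helpers, not the SnapshotCallbackBuilder API.

```python
import math


def snapshot_lr(t, total, n_snapshots, initial_lr):
    """Cosine-annealed learning rate for the 1-indexed step t."""
    cycle = math.ceil(total / n_snapshots)
    return initial_lr / 2.0 * (math.cos(math.pi * ((t - 1) % cycle) / cycle) + 1.0)


def is_snapshot_step(t, total, n_snapshots):
    """True when a model snapshot would be saved after step t."""
    cycle = math.ceil(total / n_snapshots)
    return t % cycle == 0 or t == total


total, n_snapshots = 200, 5
rates = [snapshot_lr(t, total, n_snapshots, 0.1) for t in range(1, total + 1)]
saves = [t for t in range(1, total + 1) if is_snapshot_step(t, total, n_snapshots)]
print(rates[0], rates[39], rates[40])
print(saves)  # [40, 80, 120, 160, 200]
```

At prediction time the saved snapshots are used together, for example by averaging their predicted class probabilities.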