• Stars: 152
• Rank: 239,063 (Top 5%)
• Language: Python
• License: MIT License
• Created over 7 years ago
• Updated about 6 years ago

Repository Details

This is a TensorFlow implementation of the WaveNet generative neural network architecture (https://deepmind.com/blog/wavenet-generative-model-raw-audio/) for image generation.

A TensorFlow implementation of DeepMind's WaveNet paper

Previous work

Originally, the WaveNet neural network architecture directly generates a raw audio waveform, showing excellent results in text-to-speech and general audio generation (see the DeepMind blog post and paper for details).

This network models the conditional probability of the next sample in the audio waveform, given all previous samples and, optionally, additional parameters.
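
Written out, this is the usual autoregressive factorization of the joint probability of a waveform. The symbols are introduced here for illustration: x_t is the sample at timestep t, T is the number of samples, and h stands for the optional additional parameters.

```latex
p(\mathbf{x}) = \prod_{t=1}^{T} p(x_t \mid x_1, \ldots, x_{t-1})
\qquad
p(\mathbf{x} \mid \mathbf{h}) = \prod_{t=1}^{T} p(x_t \mid x_1, \ldots, x_{t-1}, \mathbf{h})
```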

After an audio preprocessing step, the input waveform is quantized to a fixed integer range. The integer amplitudes are then one-hot encoded to produce a tensor of shape (num_samples, num_channels).
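
The quantization and one-hot steps can be sketched in plain Python. This is a minimal illustration, not the code in wavenet.py: it assumes mu-law companding with 256 levels (the scheme described in the WaveNet paper; the text above only says "a fixed integer range"), and the function names are hypothetical.

```python
import math


def mu_law_encode(sample, quantization_channels=256):
    """Map a float amplitude in [-1, 1] to an integer bin in [0, quantization_channels - 1]."""
    mu = quantization_channels - 1
    # Clip so that out-of-range inputs cannot produce out-of-range bins.
    sample = max(-1.0, min(1.0, sample))
    # Mu-law companding: compress large amplitudes logarithmically.
    magnitude = math.log1p(mu * abs(sample)) / math.log1p(mu)
    companded = math.copysign(magnitude, sample)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer bin.
    return int((companded + 1.0) / 2.0 * mu + 0.5)


def one_hot(index, depth):
    """Return a list of length `depth` with a single 1.0 at position `index`."""
    row = [0.0] * depth
    row[index] = 1.0
    return row


def encode_waveform(samples, quantization_channels=256):
    """Produce a nested list of shape (num_samples, num_channels)."""
    return [
        one_hot(mu_law_encode(s, quantization_channels), quantization_channels)
        for s in samples
    ]
```

For example, `encode_waveform([-1.0, 0.0, 1.0])` yields three rows of 256 values each, with the single 1.0 in columns 0, 128 and 255 respectively.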

A convolutional layer that only accesses the current and previous inputs then reduces the channel dimension.

The core of the network is a stack of causal dilated layers. Each layer is a dilated convolution (a convolution with holes) that accesses only the current and past audio samples.
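
To make "causal" and "dilated" concrete, here is a small single-channel sketch in plain Python. It is an illustration under simplifying assumptions (one channel, zero padding on the left, no gating or residual connections), and both function names are hypothetical rather than taken from wavenet.py.

```python
def causal_dilated_conv(inputs, weights, dilation):
    """1-D causal dilated convolution over a single channel.

    weights[-1] multiplies the current sample, weights[-2] the sample
    `dilation` steps in the past, and so on. Future samples are never read.
    """
    width = len(weights)
    outputs = []
    for t in range(len(inputs)):
        total = 0.0
        for k in range(width):
            # Tap k looks (width - 1 - k) * dilation steps into the past.
            src = t - (width - 1 - k) * dilation
            if src >= 0:  # positions before the start act as zero padding
                total += weights[k] * inputs[src]
        outputs.append(total)
    return outputs


def receptive_field(filter_width, dilations):
    """Number of input samples that influence one output of the stacked layers."""
    return (filter_width - 1) * sum(dilations) + 1
```

With a filter width of 2 and dilations 1, 2, 4, 8, each output of the stack depends on 16 consecutive input samples, which is why doubling the dilation per layer grows the context quickly without adding many layers.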

The outputs of all layers are combined and expanded back to the original number of channels by a series of dense postprocessing layers, followed by a softmax function that transforms the outputs into a categorical distribution.

The loss function is the cross-entropy between the output for each timestep and the input at the next timestep.
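
The one-step shift between predictions and targets is easy to get wrong, so here is a framework-free sketch of the loss. It assumes the network emits one vector of logits per timestep and that the output at the last timestep, which has no next input to compare against, is simply dropped; the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_step_cross_entropy(logits_per_step, targets):
    """Mean cross-entropy where logits_per_step[t] predicts targets[t + 1].

    `targets` holds the quantized integer inputs, one per timestep.
    """
    losses = []
    for t in range(len(targets) - 1):
        probs = softmax(logits_per_step[t])
        # Negative log-likelihood of the value actually observed next.
        losses.append(-math.log(probs[targets[t + 1]]))
    return sum(losses) / len(losses)
```

A network that outputs uniform logits over K classes scores log(K) under this loss, which is a handy baseline when checking that training is making progress.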

In this repository, the network implementation can be found in wavenet.py.

New approach

This work is based on the implementation of the original WaveNet model (Wavenet), with some modifications.

The modifications are needed because we use the WaveNet model as an image generator. We use raw pixel data (1D-channel) instead of raw audio files, and once the network is trained, we use the learned conditional probability to generate samples (pixels) one at a time in an autoregressive process.
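
The generation loop can be sketched as follows. This is only an outline under stated assumptions: `predict_next` is a hypothetical stand-in for the trained network (it takes the pixels generated so far and returns a categorical distribution over the next pixel value), and the row-by-row layout in `to_rows` is an assumed ordering, not something the repository description specifies.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index from a categorical distribution given as a list of probabilities."""
    threshold = rng.random()
    cumulative = 0.0
    for value, p in enumerate(probs):
        cumulative += p
        if threshold < cumulative:
            return value
    return len(probs) - 1  # guard against floating-point rounding


def generate_pixels(predict_next, num_pixels, seed_pixels, rng):
    """Extend `seed_pixels` one pixel at a time until `num_pixels` values exist."""
    pixels = list(seed_pixels)
    while len(pixels) < num_pixels:
        # Each new pixel is conditioned on everything generated so far.
        probs = predict_next(pixels)
        pixels.append(sample_categorical(probs, rng))
    return pixels


def to_rows(pixels, width):
    """Reshape the flat pixel sequence into image rows (assumed row-major order)."""
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]
```

Passing a seeded `random.Random` instance as `rng` makes the generated image reproducible, which helps when comparing checkpoints.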

Missing features

Currently, there is no conditioning on extra information.

More Repositories

1. chess-alpha-zero (Jupyter Notebook, 2,098 stars)
   Chess reinforcement learning by AlphaGo Zero methods.

2. tensorflow-tex-wavenet (Python, 344 stars)
   This is a TensorFlow implementation of the WaveNet generative neural network architecture (https://deepmind.com/blog/wavenet-generative-model-raw-audio/) for text generation.

3. connect4-alpha-zero (Python, 110 stars)
   Connect4 reinforcement learning by AlphaGo Zero methods.

4. muzero (Jupyter Notebook, 91 stars)
   A simple implementation of the MuZero algorithm for the Connect4 game.

5. Asynchronous-Methods-for-Deep-Reinforcement-Learning (Python, 81 stars)
   Using a paper from Google DeepMind, I've developed a new version of the DQN that uses thread-based exploration instead of memory replay, as explained here: http://arxiv.org/pdf/1602.01783v1.pdf. I used the one-step Q-learning pseudocode, and now we can train the Pong game in less than 20 hours without any GPU or network distribution.

6. random-memory-adaptation (Python, 24 stars)
   Random memory adaptation model inspired by the paper "Memory-based parameter adaptation (MbPA)".

7. Policy-chess (Python, 18 stars)
   A Policy Network in TensorFlow to classify chess moves.

8. leela-fish (C++, 16 stars)
   UCI chess playing engine derived from Stockfish and LeelaChess Zero.

9. pytorch-es-tic-tac-toe (Python, 13 stars)
   Evolution Strategies in PyTorch (Tic-tac-toe).

10. mushroom-detector-kerasjs (Java, 13 stars)
    I explain how to export weights from a Keras model and import those weights in Keras.js, a JavaScript framework for running pre-trained neural networks in the browser. I then show how to include the final result in a PhoneGap Cordova mobile application.

11. Using-Google-Neural-Machine-Translation-for-chess-movements-inference-TensorFlow- (Python, 3 stars)
    Using (Google) Neural Machine Translation for chess movements inference (TensorFlow).

12. schopenhauer_GPT_2 (Jupyter Notebook, 2 stars)
    Fine-tuning a GPT-2 pretrained model on the Schopenhauer texts.

13. FractalExplorationImitationLearning (Jupyter Notebook, 2 stars)
    Using the fragile framework as a memory explorer to train a neural network in Atari games.

14. chatgpt-slack-bot (Python, 1 star)
    Slack Assistant Bot with Image Generation: a powerful AI-driven chatbot that helps users with various tasks within a Slack workspace. Powered by OpenAI GPT, the bot can understand and respond to user messages, as well as generate, edit, and create variations of images based on user prompts.