SoundStream: An End-to-End Neural Audio Codec
This repository is an implementation of the paper of the same name.
The Residual Vector Quantizer (RVQ) relies on lucidrains' repository.
I built this implementation to serve my own needs, so some features of the original paper are missing.
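For readers unfamiliar with residual vector quantization, here is a minimal, dependency-free sketch of the idea: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codebook entries. This is not the code used in this repository (the RVQ here comes from lucidrains' implementation). The function names and the toy codebooks below are hypothetical and chosen only for illustration.

```python
def nearest(codebook, vector):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(codebooks, vector):
    """Quantize `vector` with a cascade of codebooks.

    Each stage picks the nearest entry to the current residual,
    then subtracts that entry before moving to the next stage.
    """
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(codebooks, indices):
    """Reconstruct a vector by summing the selected entry of each stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy example (hypothetical values): two stages, two entries each, dimension 2.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],      # coarse stage
    [[0.0, 0.0], [0.25, -0.25]],   # refinement stage
]
indices = rvq_encode(codebooks, [1.2, 0.8])
reconstruction = rvq_decode(codebooks, indices)
```

In the toy example the first stage captures the coarse value and the second stage refines it, so the reconstruction lands closer to the input than the first stage alone would.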
Missing pieces
- Denoising: this implementation is not built to denoise, so it has neither a conditioning signal nor Feature-wise Linear Modulation (FiLM) blocks.
- Bitrate scalability: for now, quantizer dropout has not been implemented.
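As a rough idea of what the missing quantizer dropout would involve: in the paper, the number of active quantizers is sampled uniformly for each training example, and only the first quantizers up to that number are used. The sketch below illustrates just that sampling step. It is an assumption about how it could be added, not code from this repository, and the names `sample_num_quantizers` and `active_codebooks` are hypothetical.

```python
import random


def sample_num_quantizers(total_quantizers, rng=random):
    """Draw n_q uniformly from {1, ..., total_quantizers} for one example."""
    return rng.randint(1, total_quantizers)  # both bounds are inclusive


def active_codebooks(codebooks, rng=random):
    """Keep only the first n_q quantizer stages for this example."""
    n_q = sample_num_quantizers(len(codebooks), rng)
    return codebooks[:n_q]


# Toy usage: integers stand in for the 8 quantizer stages of a hypothetical model.
stages = list(range(8))
kept = active_codebooks(stages, random.Random(0))
```

Because every prefix of the quantizer stack is seen during training, a single model can then run with fewer stages at inference time, which trades quality for a lower bitrate.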
Citation
@misc{zeghidour2021soundstream,
title = {SoundStream: An End-to-End Neural Audio Codec},
author = {Neil Zeghidour and Alejandro Luebs and Ahmed Omran and Jan Skoglund and Marco Tagliasacchi},
year = {2021},
eprint = {2107.03312},
archivePrefix = {arXiv},
primaryClass = {cs.SD}
}