Sangramsingkayte/GAN_based_TTS

Stars
7
Rank 2,294,772 (Top 46 %)
Language
Jupyter Notebook
Created over 6 years ago
Updated over 3 years ago

Sangramsingkayte

A GAN is a powerful technique built on the interplay between a generator G and a discriminator D, framed in game-theoretic terms. The generator network maps the input features to estimated samples. The discriminator compares each generated sample with the original sample and identifies the dissimilarities between the two. In other words, the generator is trained to fool the discriminator. Here, the generator produces the linguistic features of the given text, and the discriminator is optimized to tell the original feature vector apart from the generated feature vector.
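The generator/discriminator game described above can be sketched with the standard GAN losses. This is a minimal illustration, not the code from this repository: the function names are hypothetical, and the inputs are assumed to be discriminator outputs (probabilities strictly between 0 and 1) for original and generated feature vectors.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Mean of -[log D(x) + log(1 - D(G(z)))].

    d_real: D's outputs on original samples, d_fake: D's outputs on
    generated samples. Both are assumed to lie strictly in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Non-saturating generator loss: mean of -log D(G(z)).

    It is small when the discriminator is fooled, i.e. when it assigns
    high probability to generated samples.
    """
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

When the discriminator cannot tell the two apart (all outputs 0.5), its loss equals 2·ln 2; the generator's loss falls as the discriminator rates generated samples closer to 1.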

Speech

Jupyter Notebook

Audio-Feature-Extraction

In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC.
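The two ingredients named above, the nonlinear mel scale and the cosine transform of log powers, can be sketched in a few lines. This is an illustrative sketch only: it assumes the common 2595·log10(1 + f/700) mel formula and an unnormalized DCT-II, the function names are hypothetical, and the filter-bank energies are taken as given rather than computed from audio.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mel (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(f_min, f_max, n_filters):
    """Centre frequencies (Hz) of filters equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

def cepstral_coefficients(mel_energies, n_coeffs):
    """Unnormalized DCT-II of the log mel filter-bank energies."""
    m = len(mel_energies)
    log_e = [math.log(e) for e in mel_energies]
    return [
        sum(log_e[j] * math.cos(math.pi * k * (j + 0.5) / m) for j in range(m))
        for k in range(n_coeffs)
    ]
```

A flat spectrum (equal energy in every mel band) yields zero for every coefficient except the zeroth, which is why the higher coefficients are read as describing spectral shape.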

Jupyter Notebook

Speech-Synthesis-System

Language is the structured form in which humans share thoughts and emotions, and this research is motivated by human-computer interaction. The overall aim of my PhD research program is to design concatenative and Hidden Markov Model (HMM) based speech synthesis for the Marathi language. This will make it easier to communicate with the system and extend the technology to assistive devices based on the Marathi language. An attractive feature of the HMM system is that voice alteration can be performed without large databases. To study the synthesis techniques in detail, I have also implemented a system using the unit selection method. The Marathi Talking Calculator, built with the concatenation technique, is published on the Play Store. The calculator performs the basic arithmetic operations and speaks each numeral in Marathi as its key is pressed. The result box synthesizes the voice and speaks the result in Marathi with the correct place value of the digits. The weakness of unit selection synthesis (USS) is that it requires a large database and that quality suffers at the joins. To overcome these issues, the study builds a system with a phonetic-based approach for the Marathi language using concatenation and HMM.

Stroke-Prediction

Machine Learning is one of the fastest-growing techniques in many fields, and the healthcare industry is no exception. Machine Learning algorithms play an essential role in predicting the presence or absence of heart diseases, tumors, and more. Such information, if predicted well in advance, can provide important insights to doctors, who can then adapt their diagnosis and treat the patient accordingly. The World Health Organization has estimated that 12 million deaths occur worldwide every year due to heart diseases. Half the deaths in the United States and other developed countries are due to cardiovascular diseases. The early prognosis of stroke can aid decisions on lifestyle changes in high-risk patients and in turn reduce complications. If the relationships and the factors affecting the disease are identified, it can be treated in advance. This research intends to pinpoint the most relevant risk factors of heart disease as well as predict the overall risk using logistic regression. In this report, I'll discuss the prediction of stroke using Machine Learning algorithms. The algorithm I have implemented is logistic regression on the Health
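The logistic-regression idea mentioned above can be sketched without any libraries. This is a minimal illustration under stated assumptions, not the notebook's implementation: the single feature and the six toy data points are made up for the example, and the function names, learning rate, and epoch count are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent
    on the mean log loss. Returns the learned (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. z
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Made-up toy data: one hypothetical risk score per patient, label 1 = event.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(toy_x, toy_y)
risk_low = sigmoid(w * 1.0 + b)   # predicted probability for a low score
risk_high = sigmoid(w * 6.0 + b)  # predicted probability for a high score
```

The sign and size of the learned weight indicate how strongly the feature pushes the predicted risk up or down, which is what makes logistic regression convenient for identifying risk factors.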

Jupyter Notebook

Gammatone-like-spectrograms

Gammatone filters are a popular linear approximation to the filtering performed by the ear. This routine provides a simple wrapper for generating time-frequency surfaces based on a gammatone analysis, which can be used as a replacement for a conventional spectrogram. It also provides a fast approximation to this surface based on weighting the output of a conventional FFT.
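For orientation, a single gammatone filter can be written down directly. The sketch below is not the wrapper routine this repository describes. It assumes the widely used Glasberg and Moore ERB formula and a fourth-order filter with bandwidth factor 1.019, and the function names are hypothetical.

```python
import math

def erb_bandwidth(fc_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency fc_hz
    (assumed Glasberg and Moore form: 24.7 * (4.37 * f / 1000 + 1))."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def gammatone_impulse_response(fc_hz, sample_rate, duration_s=0.025, order=4):
    """Sampled impulse response t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t),
    with b = 1.019 * ERB(fc). Amplitude is left unnormalized."""
    b = 1.019 * erb_bandwidth(fc_hz)
    n_samples = round(duration_s * sample_rate)
    response = []
    for n in range(n_samples):
        t = n / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        response.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    return response
```

Convolving a signal with a bank of such responses at different centre frequencies, then taking frame-wise energies, gives a gammatone-based time-frequency surface of the kind described above.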

Image-Caption-using-CNNs-and-RNNs-

Image Caption Generator using CNNs and RNNs

HTK-features-in-Python

This project contains a Python implementation of the MFCC features as computed by HTK.

Jupyter Notebook

End-to-End-Neural-Diarization

Matlab-Voice-Record-and-plot-FFT-Real-Time

Speech-Processing-Basic-Concepts

Basic Concepts: articulatory phonetics (the development and classification of speech sounds); acoustic phonetics (the acoustics of speech production); a review of digital signal processing concepts; and short-time Fourier transform, filter-bank, and LPC methods. Techniques for Speech Analysis: features, feature extraction, and pattern comparison. Both statistical and perceptual speech distortion measures are covered: log spectral distance, cepstral distances, weighted cepstral distances and filtering, likelihood distortions, spectral distortion using a warped frequency scale, and LPC, PLP, and MFCC coefficients. Time Alignment and Normalization: dynamic time warping and multiple time-alignment paths.
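Dynamic time warping, listed among the topics above, can be illustrated with a short dynamic-programming sketch. It assumes one-dimensional sequences, absolute difference as the local distance, and the basic step pattern (match, insertion, deletion); the function name is hypothetical, and real speech features would use frame vectors with a cepstral or spectral distance instead.

```python
def dtw_distance(a, b):
    """Minimum accumulated |a[i] - b[j]| over all monotonic alignments
    of sequences a and b, computed by dynamic programming."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to an already used b[j-1]
                cost[i][j - 1],      # b[j-1] aligned to an already used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated 2 is absorbed by the warping path, whereas a frame-by-frame comparison of sequences with different lengths would not even be defined.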

Jupyter Notebook

TextPrediction

Recent work from Google and Facebook has focused on the behind-the-scenes mechanisms of text prediction. In addition to Recurrent Neural Networks and Long Short-Term Memory networks, which serve as the motivation, two word2vec models for generating word embeddings are also discussed.
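The description does not name the two word2vec models. Assuming they are the usual pair, skip-gram and continuous bag-of-words (CBOW), the sketch below shows how each turns a token sequence into training examples. The function names and window sizes are hypothetical, and no embedding training is performed.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: each centre word predicts every word in its window.
    Returns a list of (centre, context_word) pairs."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """CBOW: the surrounding window predicts the centre word.
    Returns a list of (context_words, centre) examples."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, center))
    return examples
```

With `tokens = ["a", "b", "c"]` and `window=1`, skip-gram yields four (centre, context) pairs while CBOW yields three (context, centre) examples, one per position.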

Jupyter Notebook