There are no reviews yet. Be the first to send feedback to the community and the maintainers!
Speech
GAN_based_TTS
The GAN is a very powerful technique works on the functionality of generator G and discriminator D based on game theory. This involves the network of the generator which maps and estimate the input features of the samples The other one is discriminator that tries to find the closest match for the generated sample to that of the original sample and identifies the dissimilarities between the two. So we can say that the generator is described to fool the discriminator. The Generator generates the linguistic features of the given text and discriminator optimizes the original feature vector and generated the feature vectorAudio-Feature-Extraction
In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC.Stroke-Prediction
Machine Learning is the fastest-growing technique in many fields and the healthcare industry is no exception to this. Machine Learning algorithms plays an essential role in predicting the presence/absence of Heart diseases, tumors, and more. Such required information, if predicted well in advance, can provide important insights to doctors who can then adapt their diagnosis and treat the patient accordingly. World Health Organization has estimated 12 million deaths occur worldwide, every year due to heart diseases. Half the deaths in the United States and other developed countries are due to cardiovascular diseases. The early prognosis of stroke diseases can aid in making decisions on lifestyle changes in high-risk patients and in turn reduce the complications. If it is about to identify the relationship and factors affecting it can cured n advance time. This research intends to pinpoint the most relevant/risk factors of heart disease as well as predict the overall risk using logistic regression. In this report, I'll discuss the prediction of stroke using Machine Learning algorithms. The algorithm I have implemented is logistic regression on the HealthGammatone-like-spectrograms
Gammatone filters are a popular linear approximation to the filtering performed by the ear. This routine provides a simple wrapper for generating time-frequency surfaces based on a gammatone analysis, which can be used as a replacement for a conventional spectrogram. It also provides a fast approximation to this surface based on weighting the output of a conventional FFT.Image-Caption-using-CNNs-and-RNNs-
Image Caption Generator using CNNs and RNNsΒΆHTK-features-in-Python
HTK features in Python This project contains a Python implementation of the MFCC features as computed by HTK.End-to-End-Neural-Diarization
Matlab-Voice-Record-and-plot-FFT-Real-TimeSpeech-Processing-Basic-Concepts
Basic Concepts: Articulatory Phonetics β the development and classification of speech sounds; Acoustic Phonetics β the acoustics of speech production; Review of Digital Signal Processing concepts; Short-Time Fourier Transform, Filter-Bank, and LPC Methods Techniques for Speech Analysis: Features, Feature Extraction, and Pattern Comparison: Log Spectral Distance, Cepstral Distances, Weighted Cepstral Distances and Filtering, Likelihood Distortions, Spectral Distortion using a Warped Frequency Scale, LPC, PLP, and MFCC Coefficients are both statistical and perceptual speech distortion measures. Multiple Time β Alignment Paths, Dynamic Time Warping, and Time Alignment and Normalization RemarksTextPrediction
Recent Google and Facebook focused on behind-the-scenes mechanisms of text prediction. In addition to using Recurrent Neural Network and Long Short-Term Memory Networks for the motivation, there were two word2vec models for generating word embeddings also discussed.Love Open Source and this site? Check out how you can help us