Speech-Hacker (New Release)
Would you like to make any famous figure say whatever you want? Use Speech-Hacker to train a model on your chosen speaker and receive speeches spoken by them.
Description
Speech-Hacker takes a large database of audio speeches spoken by your chosen figure and uses Simple Audio Indexer (which relies on the IBM Watson Speech API) to split them at word boundaries, creating smaller audio chunks that contain those words. The words and phrases of your desired speech are then associated with the audio chunks that were created and converted, so that you receive a brand new speech spoken by your figure.
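As a rough illustration of the association step, the sketch below assumes the indexer yields a (word, start, end) timestamp for every spoken word. The index layout and the function names are hypothetical; they are not Simple Audio Indexer's actual output format or Speech-Hacker's implementation, and only single words are matched here, not phrases.

```python
# Hypothetical index: for each source file, a list of
# (word, start_seconds, end_seconds) entries. This layout is an assumption
# made for illustration only.
def build_word_table(index):
    """Map each lowercased word to one (filename, start, end) chunk."""
    table = {}
    for filename, words in index.items():
        for word, start, end in words:
            # Keep the first chunk seen for a word; later ones are ignored.
            table.setdefault(word.lower(), (filename, start, end))
    return table


def plan_speech(text, table):
    """Return the chunk for each word of `text`, plus the words with no chunk."""
    plan, missing = [], []
    for raw in text.split():
        word = raw.strip('.,!?"\'').lower()
        if not word:
            continue
        if word in table:
            plan.append(table[word])
        else:
            missing.append(word)
    return plan, missing


index = {
    "speech1.wav": [("Let", 0.0, 0.25), ("me", 0.25, 0.4)],
    "speech2.wav": [("talk", 1.1, 1.5)],
}
table = build_word_table(index)
plan, missing = plan_speech("Let me talk now.", table)
```

Here `plan` lists the chunks to cut out and join in order, and `missing` lists the words the training audio never contained, which is why a larger collection of speeches gives better coverage.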
Significant Improvement in New Version
In earlier versions, Speech-Hacker used pydub to split words based on the amount of silence between them. That worked, but it wasn't as smart as we wanted it to be, so we switched to the IBM Watson Speech API to index words and got much better results. SimpleAudioIndexer was built as a separate project to implement this functionality for Speech-Hacker.
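To show why silence-based splitting is fragile, here is a simplified sketch of the idea on a plain list of sample amplitudes. It is not pydub's implementation (pydub offers its own `pydub.silence.split_on_silence`); the function name, threshold, and sample values are made up for illustration.

```python
def find_loud_segments(samples, threshold, min_silence_len):
    """Return (start, end) index pairs of loud runs separated by silence.

    A gap ends a segment only when at least `min_silence_len` consecutive
    samples have an absolute value below `threshold`.
    """
    segments = []
    start = None  # index where the current loud run began
    quiet = 0     # length of the quiet run seen since the last loud sample
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence_len:
                # Close the segment just after its last loud sample.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Audio ended mid-segment; drop any trailing quiet samples.
        segments.append((start, len(samples) - quiet))
    return segments


samples = [0, 0, 5, 6, 0, 7, 0, 0, 0, 8, 9, 0]
segments = find_loud_segments(samples, threshold=3, min_silence_len=2)
# segments == [(2, 6), (9, 11)]
```

The single quiet sample at index 4 is too short to count as a pause, so everything from index 2 to 5 stays in one segment. Two words spoken without a clear pause end up in one chunk the same way, which is the limitation that word-level indexing avoids.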
Samples (Trained by President Obama's Speeches)
"Let me give you an example. When I was in Washington, I fought hard to make sure your rights are under safety, like social security and other things."
"Black Friday was the best thing in the world! Did you go shopping?"
"American people now should care about science more than before since we have problems in this country."
"Call me on your phone so we can talk about important issues."
"I want a better job after this because this one did not make me sick enough!"
"Make america great again and long live alt right! I am not sure of course what I just said!"
Get Started
Dependencies
- Python 2.7
- Simple Audio Indexer
- IBM Watson Speech API Username and Password
- pydub
Install
pip install Speech-Hacker
Setup
- Choose your figure(s).
- Browse the internet to find a reasonable number of relatively good-quality audio files spoken by your figure. (Convert them to WAV.)
- Place all the audio files you found in a folder.
- Acquire IBM Watson Speech to Text username and password at https://www.ibm.com/watson/developercloud/speech-to-text.html (For help visit: Here)
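Before training, it can help to confirm that everything in the folder is a readable WAV file. The following is a minimal sketch using only Python's standard `wave` module; the helper name `wav_durations` is hypothetical and is not part of Speech-Hacker.

```python
import os
import wave


def wav_durations(folder):
    """Return {filename: duration in seconds} for readable WAV files in `folder`."""
    durations = {}
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(".wav"):
            continue  # skip anything that has not been converted to WAV
        path = os.path.join(folder, name)
        try:
            reader = wave.open(path, "rb")
        except (wave.Error, EOFError):
            continue  # not a WAV file the standard library can read
        try:
            frames = reader.getnframes()
            rate = reader.getframerate()
            durations[name] = frames / float(rate)
        finally:
            reader.close()
    return durations
```

Calling `wav_durations` with the path of your audio folder shows which files would be picked up and how much audio each one contains; files missing from the result still need converting.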
Usage
Command for training a model
Speech-Hacker -train -u IBM_USERNAME -p IBM_PASSWORD -d ABS_PATH_TO_YOUR_AUDIO_FILES_FOLDER
Command for generating your custom speech
Speech-Hacker -generate -d ABS_PATH_TO_TRAINED_MODEL -t "WHAT_YOU_WANT_TO_SAY" -g DESTINATION_FOR_REQUESTED_AUDIO
If you would like to generate from a text file, you can alternatively enter:
Speech-Hacker -generate -d ABS_PATH_TO_TRAINED_MODEL -f "ABS_PATH_OF_TEXT_FILE" -g DESTINATION_FOR_REQUESTED_AUDIO
Arguments Description:
-train
: Training mode
IBM_USERNAME
: IBM Watson Speech to Text username
IBM_PASSWORD
: IBM Watson Speech to Text password
ABS_PATH_TO_YOUR_AUDIO_FILES_FOLDER
: Absolute path to the folder where you placed the audio files of your figure
-generate
: Generating mode
ABS_PATH_TO_TRAINED_MODEL
: Absolute path of the folder of audios you entered when training
DESTINATION_FOR_REQUESTED_AUDIO
: The destination you would like to export your generated audio to
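If you want to drive the tool from a script, the commands above can be assembled as argument lists and handed to `subprocess`. This is only a sketch: the flags are the ones documented above, while the helper names are hypothetical and it assumes the `Speech-Hacker` command is on your PATH.

```python
import subprocess


def train_command(username, password, audio_dir):
    """Build the documented training command as an argument list."""
    return ["Speech-Hacker", "-train",
            "-u", username, "-p", password, "-d", audio_dir]


def generate_command(model_dir, destination, text=None, text_file=None):
    """Build the documented generate command from a string (-t) or a file (-f)."""
    if (text is None) == (text_file is None):
        raise ValueError("pass exactly one of text or text_file")
    source = ["-t", text] if text is not None else ["-f", text_file]
    return (["Speech-Hacker", "-generate", "-d", model_dir]
            + source + ["-g", destination])


def run(command):
    """Run a command list; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(command)


# Example (placeholders as in the commands above):
# run(train_command("IBM_USERNAME", "IBM_PASSWORD",
#                   "ABS_PATH_TO_YOUR_AUDIO_FILES_FOLDER"))
```

Passing a list instead of a single shell string means the text to be spoken needs no extra quoting, even when it contains spaces or punctuation.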
Thanks
Many thanks to the following GitHub users for contributing code and/or ideas: