• Stars
    star
    121
  • Rank 292,239 (Top 6 %)
  • Language
    Python
  • License
    MIT License
  • Created over 6 years ago
  • Updated 10 months ago

Repository Details

Takes in a sequence of lip images and predicts the phonemes being said.

videoToVoice

These files take in a sequence of lip images and predict the phonemes being said.

pyTubeTest.py takes in a YouTube URL, downloads that video onto the computer, turns the video into an image sequence, tries to find faces in the images, and extracts and saves the audio from the video. Earlier, we tried to get pyTubeTest.py to also convert the audio into spectrograms with ARSS in the same code, but that didn't work: the libraries required for the first steps only work on Ubuntu, and ARSS only works on Windows.

pyTubeShort.py does the same thing as pyTubeTest.py, but doesn't download the video from YouTube. Instead, it reads a video file from a local directory.

getAudio.py takes in a video file, and saves the audio from that video into a new file.

audioStitcher.py is very simple: it just takes in two audio files, stitches them together, and saves the result.
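
As a rough illustration of the stitching step, here is a minimal sketch that appends one WAV file to another using only Python's standard `wave` module. The function name `stitch_wavs` is hypothetical, and the sketch assumes both inputs are uncompressed WAV files with the same channel count, sample width, and sample rate; audioStitcher.py itself may use a different library or file format.

```python
import wave


def stitch_wavs(first_path, second_path, out_path):
    """Append the audio in second_path to first_path and save to out_path.

    Hypothetical helper: assumes both inputs are PCM WAV files that share
    the same channel count, sample width, and frame rate.
    """
    with wave.open(first_path, "rb") as a, wave.open(second_path, "rb") as b:
        params_a, params_b = a.getparams(), b.getparams()
        # The first three fields are nchannels, sampwidth, framerate.
        if params_a[:3] != params_b[:3]:
            raise ValueError("WAV formats differ; convert before stitching")
        frames = a.readframes(a.getnframes()) + b.readframes(b.getnframes())

    with wave.open(out_path, "wb") as out:
        out.setnchannels(params_a.nchannels)
        out.setsampwidth(params_a.sampwidth)
        out.setframerate(params_a.framerate)
        out.writeframes(frames)  # also writes the final frame count to the header
```

Calling `stitch_wavs("a.wav", "b.wav", "joined.wav")` would produce a file whose length is the sum of the two inputs.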

lipTester.py takes in a sequence of images of faces, and crops each one so that the new folder of images shows only the speaker's lips, plus a margin of 25 pixels or so.
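
The lip crop with its margin can be sketched as a bounding-box calculation. This is only an assumption about how such a crop might be computed: it supposes the lip outline is already available as a list of (x, y) points from some face-landmark detector, and the name `lip_crop_box` is hypothetical. The returned tuple uses the (left, upper, right, lower) order that PIL's `Image.crop` expects.

```python
def lip_crop_box(lip_points, image_size, margin=25):
    """Return (left, top, right, bottom) around the lips, padded by margin.

    lip_points: iterable of (x, y) pixel coordinates (assumed to come from
                a landmark detector; not specified in the original README).
    image_size: (width, height) of the source image, used for clamping.
    """
    xs = [x for x, _ in lip_points]
    ys = [y for _, y in lip_points]
    width, height = image_size
    # Pad the tight bounding box, then clamp it to the image borders.
    left = max(min(xs) - margin, 0)
    top = max(min(ys) - margin, 0)
    right = min(max(xs) + margin, width)
    bottom = min(max(ys) + margin, height)
    return left, top, right, bottom
```

For example, `lip_crop_box([(100, 200), (160, 230)], (640, 480))` yields a box that extends 25 pixels past the lips on every side, as long as that stays inside the image.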

turnPhonemesToPhoframes.py takes the JSON output that Gentle creates. (This is a time-aligned transcript of what was spoken in the video: e.g., when I read the Bee Movie script, this JSON file has the timestamps at which I said every phoneme of the movie.) Then, it turns that JSON file into phoframes.txt, a text file listing which phoneme is said at every video frame (1/30th of a second).
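
The conversion from a time-aligned transcript to one phoneme per frame can be sketched as follows. This is not the code in turnPhonemesToPhoframes.py; it is a minimal sketch that assumes Gentle's JSON has a top-level `words` list, where each successfully aligned word has `case == "success"`, a `start` time in seconds, and a `phones` list of `{"phone", "duration"}` entries. The function name, the `"sil"` silence label, and the choice to sample each frame at its midpoint are all assumptions made for illustration.

```python
def gentle_to_phoframes(alignment, fps=30, silence="sil"):
    """Return one phoneme label per video frame from a Gentle-style dict."""
    spans = []  # (start_seconds, end_seconds, phoneme)
    for word in alignment.get("words", []):
        if word.get("case") != "success":
            continue  # words that were not aligned carry no timing
        t = word["start"]
        for phone in word.get("phones", []):
            # Assumption: drop the word-position suffix, e.g. "hh_B" -> "hh".
            label = phone["phone"].split("_")[0]
            spans.append((t, t + phone["duration"], label))
            t += phone["duration"]
    if not spans:
        return []

    last_end = max(end for _, end, _ in spans)
    n_frames = int(last_end * fps + 0.5)  # round to the nearest frame

    frames = []
    for f in range(n_frames):
        midpoint = (f + 0.5) / fps  # sample each frame at its centre
        label = silence
        for start, end, phoneme in spans:
            if start <= midpoint < end:
                label = phoneme
                break
        frames.append(label)
    return frames
```

The result is a list of phoneme names; writing phoframes.txt would additionally require mapping each name to its number via key.txt.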

key.txt tells us which number corresponds to which phoneme, so we can read phoframes.txt more easily!

phoframes.txt tells us what phoneme is being said at every frame of the video. This is the ground truth. Every value is a number, which can be converted back into a phoneme using key.txt.
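
Decoding phoframes.txt back into phoneme names could look like the sketch below. The exact file layouts are not described here, so this assumes that each line of key.txt holds a number followed by a phoneme name, and that phoframes.txt holds whitespace-separated numbers, one per frame. Both function names are hypothetical.

```python
def parse_key(key_text):
    """Build a {number: phoneme} dict from key.txt contents.

    Assumed layout: one "<number> <phoneme>" pair per line.
    """
    key = {}
    for line in key_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            key[int(parts[0])] = parts[1]
    return key


def decode_phoframes(phoframes_text, key):
    """Turn the per-frame numbers in phoframes.txt into phoneme names."""
    return [key[int(token)] for token in phoframes_text.split()]
```

With those assumptions, `decode_phoframes(open("phoframes.txt").read(), parse_key(open("key.txt").read()))` would give the phoneme spoken at each frame.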

phoframeTrain.py creates the neural network architecture, and trains it on processed data. (Note: this code describes the neural network architecture in the most detail.)

phoframeTest.py takes in a pre-trained model and a sequence of silent images, and generates a text file predicting which phonemes should go along with that video.

trainingDataVisualizer.py (described under OUTDATED FILES below)

OUTDATED FILES

imageTest.py was an experimental dumping ground for learning how to use PIL, which I don't think I ended up using.

dump.py is where I tested helper functions such as the spectrogram smoother and the video-frame accessor.

faceReadTest.py is where I tested the face recognition library we installed from the internet. It ended up working, but it snaps to the nearest ~30 pixels for some reason, so we decided not to use it for now.

cropper.py crops an image to only show the middle section (middle 40% horizontally and middle 50% vertically), although this is only used for explanation purposes (train.py has a cropping function within it).
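
The middle-section crop described for cropper.py comes down to a small box calculation. The sketch below is an illustration under that description, not the script's actual code; the function name `center_crop_box` and its parameters are hypothetical. The returned tuple is in the (left, upper, right, lower) order used by PIL's `Image.crop`.

```python
def center_crop_box(width, height, frac_w=0.40, frac_h=0.50):
    """Return the box covering the middle frac_w x frac_h of an image."""
    crop_w = round(width * frac_w)
    crop_h = round(height * frac_h)
    # Centre the crop; any odd leftover pixel goes to the right/bottom edge.
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h
```

For a 1000x800 image this keeps columns 300 to 700 and rows 200 to 600.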

videoGetter.py was a short script we used to extract all the images from a downloaded YouTube video when pyTubeTest.py crashed for some reason.

videoContinue.py is videoGetter’s sequel and does the same thing, but starts in the middle.

trainingDataVisualizer.py was my first attempt at making the pretty bar graphs that show the NN's prediction of phonemes at each frame. The new and improved version is the .pde file.

More Repositories

1

jumpcutter

Automatically edits videos. Explanation here: https://www.youtube.com/watch?v=DQ8orIurGxw
Python
3,073
star
2

lazykh

Source code for the automatic lip-syncing project described in this video! https://www.youtube.com/watch?v=y3B8YqeLCpY
Python
327
star
3

PrisonersDilemmaTournament

Watch This Place's awesome video about iterated Prisoner's Dilemma for context! https://www.youtube.com/watch?v=BOvAbjfJ0x0
Python
205
star
4

alignedCelebFaces

Better version of my face editing tool. Explanation video here: https://www.youtube.com/watch?v=NTlXEJjfsQU
Python
177
star
5

evolutionSteer

Evolution simulator of creatures learning how to steer towards food pellets better and better.
Processing
94
star
6

VirusGame

Watch here for information about this project: https://www.youtube.com/watch?v=o1IheoDRdGE
Processing
88
star
7

yoyleCityWords

This is a city renderer I created in 2013. It's all hacked together and very slow, so watch out. Explanation video here: https://www.youtube.com/watch?v=y0nsXiI_I9c
Processing
81
star
8

rapLyrics

"Source code" for this video: https://www.youtube.com/watch?v=a0EyfdQ0QTQ I'm only adding 1 file here bc I didn't actually code that much.
Python
75
star
9

recordTrimEdit

Records audio, trims it, and allows you to edit it, all in one fell swoop.
Python
67
star
10

AbacabaCOVID19

A dumping ground of all my COVID-19 visualizations posted here: https://www.youtube.com/user/1abacaba1/videos
Processing
55
star
11

neuralNetworkLanguageDetection

The first neural network I've ever made to improve itself with gradient descent, made on Nov. 20, 2016
Processing
47
star
12

WordleEdit

Multiplayer game that plays a version of Wordle where players can edit their own words as they go. Runs as a Discord bot!
JavaScript
37
star
13

5b

Flash game I developed in early 2013, first announced in this YouTube video: https://www.youtube.com/watch?v=emDpKog8v6w
ActionScript
25
star
14

AbacabaTutorialDrawer

Tutorial Bar graph drawer in Processing. Watch: https://www.youtube.com/watch?v=XCiKO-Qysqk
Processing
19
star
15

rubiksChess

Rubik's Chess. You know the one.
JavaScript
17
star
16

caryCompressionMIDICSV

Watch this video for explanation: https://www.youtube.com/watch?v=SacogDL_4JU
Processing
16
star
17

testRepository

I don't know what I'm doing. AAAAAA
15
star
18

BFDIA7intro

My source code to generate the intro to BFDIA 7: https://www.youtube.com/watch?v=kTcfak9R-ok
Processing
15
star
19

yoyleCity

Here's the original 2013 source code for drawing Yoyle City, which appears at 2:27 in https://www.youtube.com/watch?v=RZB7nTzSl3g&t=207s
Processing
13
star
20

riseGuys

An AmazingRace-inspired version of Doodle Jump I started in 2012, continued in 2020, but never finished.
ActionScript
12
star
21

WCA_SAC

Creates a stacked-area-chart of the Top 100 WCA results, year by year.
Python
12
star
22

chineseCharacterConverter

We're trying to convert traditional Chinese characters to simplified ones.
Python
11
star
23

eclipser

It's the eclipser. Video explanation: https://www.youtube.com/watch?v=NDUh56GAEgM
Processing
9
star
24

HOTSprediction

Predicts the outcomes of Heroes of the Storm games, but not very well. Idea by Adrian
Processing
8
star
25

EvolutionSimulator2D

This is my old evolution simulator from July 2016. I guess that's not THAT long ago...
8
star
26

Bridge_Crossing

Tool to visualize solutions to a puzzle in 3D, as described in this video: https://www.youtube.com/watch?v=y_ii8QT7zsk
Processing
7
star
27

dump

dump
7
star
28

celebrityFaces

My CS229 final project
Python
6
star
29

bfdia10chant

Generates the images seen in BFDIA 10 (https://youtu.be/lcObRZOVdRM?t=887)
Processing
5
star
30

ewow_public_tools

Various tools for EWOW! (like word counter)
Python
5
star
31

Non_Transitive_Dice

video: https://www.youtube.com/watch?v=-fWcB7TAqLE&feature=youtu.be
Processing
4
star
32

Team-11

JavaScript
3
star
33

hthplotter

Head-to-head ao5 plotter for speedcubing. (Check YouTube video linked in readme)
HTML
2
star
34

arduinoLEDs

This comes from one of my videos
Arduino
1
star
35

compress_image_sequence

Searches the specified folder for image sequences, then uses ffmpeg to convert them into .mp4's, and then (optionally) deletes the original images.
Python
1
star
36

ewow_elim_sim

Runs simulations to see how many EWOW contestants will be left at each round, on average! (Monte Carlo search)
Python
1
star