LiptoSpeech

Lip reading using an end-to-end sentence-level model

Problem Statement:

Lipreading is the task of decoding text from the movement of a speaker's mouth. Traditional approaches separated the problem into two stages: designing or learning visual features, and prediction. This project instead uses an end-to-end sentence-level model.

Input: A video file of a person speaking a word or phrase.
Output: The predicted word or phrase the person was speaking.

Dataset:

GRID-Corpus - http://spandh.dcs.shef.ac.uk/gridcorpus/
LRW - https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrw1.html

Technologies and frameworks:

- TensorFlow 1.2.1
- Keras
- OpenCV 3
- Python 3.6

Preprocess the dataset:

python videoprocess.py id2_vcd_swwp2s.mpg

The dlib shape predictor model is used to locate the facial landmark points. It can be found in the predictor directory: predictor/shape_predictor_68_face_landmarks.dat.bz2
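
As a rough illustration of how 68-point landmarks can be turned into a mouth crop, the sketch below computes a padded bounding box around the mouth points. It is not taken from videoprocess.py: the function name `mouth_bounding_box`, the `padding` default, and the `frame_size` parameter are assumptions made for this example. It relies only on the fact that indices 48 to 67 cover the mouth in the 68-point scheme.

```python
# Indices 48..67 are the mouth points in the 68-point landmark scheme.
MOUTH_START, MOUTH_END = 48, 68


def mouth_bounding_box(landmarks, padding=10, frame_size=None):
    """Return (left, top, right, bottom) around the mouth landmarks.

    landmarks  -- list of 68 (x, y) tuples for one face
    padding    -- extra pixels added on every side (hypothetical default)
    frame_size -- optional (width, height) used to clamp the box to the frame
    """
    if len(landmarks) != 68:
        raise ValueError("expected 68 (x, y) landmark points")

    mouth = landmarks[MOUTH_START:MOUTH_END]
    xs = [x for x, _ in mouth]
    ys = [y for _, y in mouth]

    left, top = min(xs) - padding, min(ys) - padding
    right, bottom = max(xs) + padding, max(ys) + padding

    if frame_size is not None:
        width, height = frame_size
        left, top = max(left, 0), max(top, 0)
        right, bottom = min(right, width), min(bottom, height)

    return left, top, right, bottom


# With dlib installed, the landmarks could be obtained roughly like this
# (assumes the .bz2 archive has been decompressed to a .dat file first):
#
#   import dlib
#   detector = dlib.get_frontal_face_detector()
#   predictor = dlib.shape_predictor("predictor/shape_predictor_68_face_landmarks.dat")
#   for rect in detector(gray_frame):
#       shape = predictor(gray_frame, rect)
#       points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
#       box = mouth_bounding_box(points)
```

The returned box can then be used to slice the frame, for example `frame[top:bottom, left:right]` with an OpenCV image.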

The MouthExtract folder contains the preprocessed dataset.

Prediction:

python predict.py <path to the video>
Example: python predict.py PredictVideo/patrick.m4v

Important:

The input video must be at 25 fps for the model to work.
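
One way to catch a mismatched frame rate before running prediction is to read it from the file with OpenCV. The sketch below is an assumption about how such a check might look and is not part of predict.py; the helper names `is_supported_fps` and `read_fps` and the tolerance value are hypothetical.

```python
EXPECTED_FPS = 25.0  # frame rate the model expects, per the note above


def is_supported_fps(fps, expected=EXPECTED_FPS, tolerance=0.01):
    """Return True if fps is within tolerance of the expected rate."""
    return abs(fps - expected) <= tolerance


def read_fps(video_path):
    """Read the frame rate reported by the video container via OpenCV."""
    import cv2  # imported lazily so the pure check above works without OpenCV

    capture = cv2.VideoCapture(video_path)
    try:
        if not capture.isOpened():
            raise IOError("could not open video: %s" % video_path)
        return capture.get(cv2.CAP_PROP_FPS)
    finally:
        capture.release()


# Example usage (requires OpenCV and an existing file):
#
#   fps = read_fps("PredictVideo/patrick.m4v")
#   if not is_supported_fps(fps):
#       print("Video is %.2f fps; re-encode it to 25 fps first." % fps)
```

The tolerance allows for small rounding differences in the reported rate; a video at a clearly different rate such as 29.97 fps would need to be re-encoded first.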

PatrickPrakash/LiptoSpeech
