DETAILS :
This project uses a Convolutional Neural Network (CNN) model to predict the sign language symbol shown by the user to the camera as its corresponding natural-language alphabet (English alphabets in this project). The project will help individuals with hearing or speech impairments communicate more easily with other people. Python, OpenCV, and TensorFlow models and libraries were used during the model training and testing processes. The LabelImg tool was used to label the captured dataset images.
The updated code for this project (Main code, Image Capture code, and LabelMap code) has been uploaded to this repository.
The Dataset used for this project has been uploaded to the Google Drive Link Below: https://drive.google.com/drive/folders/1Q9eEKFZOQydYcYi3-mT9X4fwhyDozz_5?usp=sharing
PROCEDURE :
The proposed system first acquires the images required for training and testing the model with the help of Python and OpenCV, and stores them in a separate new folder.
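A minimal sketch of this capture step, assuming a Tensorflow/workspace folder layout and a short label list (the repository's Image Capture code may differ):

    import cv2
    import os
    import time
    import uuid

    IMAGES_PATH = os.path.join('Tensorflow', 'workspace', 'images', 'collectedimages')
    labels = ['A', 'B', 'C']   # extend with every sign to be captured
    number_imgs = 15           # images captured per sign

    for label in labels:
        os.makedirs(os.path.join(IMAGES_PATH, label), exist_ok=True)
        cap = cv2.VideoCapture(0)                  # open the default webcam
        print('Collecting images for {}'.format(label))
        time.sleep(5)                              # time to get the sign ready
        for imgnum in range(number_imgs):
            ret, frame = cap.read()                # grab one frame
            imgname = os.path.join(IMAGES_PATH, label,
                                   '{}.{}.jpg'.format(label, uuid.uuid1()))
            cv2.imwrite(imgname, frame)            # save it for labelling later
            cv2.imshow('frame', frame)
            time.sleep(2)                          # pause so the pose can vary
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        cap.release()
    cv2.destroyAllWindows()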
The directory where the captured images were stored is selected, and the LabelImg tool is used to label and save the sign shown in each captured image by drawing detection boxes over the region of interest.
This creates an XML annotation file for each labelled image. Next, the captured images, along with their respective XML files, are partitioned into two folders, train and test, for training and testing the model being created.
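One way to sketch this split (the 80/20 ratio and the folder names are assumptions):

    import os
    import random
    import shutil

    SOURCE = os.path.join('Tensorflow', 'workspace', 'images', 'collectedimages')
    TRAIN = os.path.join('Tensorflow', 'workspace', 'images', 'train')
    TEST = os.path.join('Tensorflow', 'workspace', 'images', 'test')

    os.makedirs(TRAIN, exist_ok=True)
    os.makedirs(TEST, exist_ok=True)

    # Pair each captured image with the XML file LabelImg saved beside it.
    pairs = []
    for root, _, files in os.walk(SOURCE):
        for name in files:
            if name.endswith('.jpg'):
                img = os.path.join(root, name)
                xml = os.path.splitext(img)[0] + '.xml'
                if os.path.exists(xml):
                    pairs.append((img, xml))

    random.shuffle(pairs)
    split = int(0.8 * len(pairs))   # assumed 80/20 train/test split

    for i, (img, xml) in enumerate(pairs):
        dest = TRAIN if i < split else TEST
        shutil.copy(img, dest)
        shutil.copy(xml, dest)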
Once this is done, the different paths needed while coding, such as the image path, model path, and annotation path, are set.
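For example (a sketch; the folder names are assumptions and may differ from the Main code in this repository):

    import os

    WORKSPACE_PATH = os.path.join('Tensorflow', 'workspace')
    paths = {
        'IMAGE_PATH': os.path.join(WORKSPACE_PATH, 'images'),
        'MODEL_PATH': os.path.join(WORKSPACE_PATH, 'models'),
        'ANNOTATION_PATH': os.path.join(WORKSPACE_PATH, 'annotations'),
        'CHECKPOINT_PATH': os.path.join(WORKSPACE_PATH, 'models', 'my_ssd_mobnet'),
    }
    for path in paths.values():
        os.makedirs(path, exist_ok=True)   # create any folder that does not exist yet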
A label map is created to represent each sign language symbol, and TF records are generated using the generate_tf_record script.
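A label map is a small pbtxt text file that maps each class name to an integer id. A minimal sketch for the 26 English alphabets (file locations are assumptions):

    import os
    import string

    LABELMAP_PATH = os.path.join('Tensorflow', 'workspace', 'annotations', 'label_map.pbtxt')
    labels = [{'name': c, 'id': i + 1} for i, c in enumerate(string.ascii_uppercase)]

    with open(LABELMAP_PATH, 'w') as f:
        for label in labels:
            f.write('item {\n')
            f.write("    name: '{}'\n".format(label['name']))
            f.write('    id: {}\n'.format(label['id']))
            f.write('}\n')

    # TF records are then produced with the generate_tf_record script; a commonly
    # used version of that script is invoked roughly like this (exact flags depend
    # on the script version):
    #   python generate_tfrecord.py -x images/train -l label_map.pbtxt -o train.record
    #   python generate_tfrecord.py -x images/test  -l label_map.pbtxt -o test.record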
Then clone the official TensorFlow Object Detection library from the TensorFlow models repository on GitHub (home of the TensorFlow Model Zoo), which is used for training the model.
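If done from Python rather than the command prompt, the clone can be sketched with subprocess:

    import subprocess

    # Fetch the TensorFlow models repository, which contains the
    # Object Detection API under models/research/object_detection.
    subprocess.run(['git', 'clone', 'https://github.com/tensorflow/models'], check=True)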
Copy the model configuration from the SSD MobileNet V2 model into the train folder of the proposed system, then update the copied configuration so that it works correctly for the proposed system.
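The Object Detection API provides protobuf utilities for editing the pipeline config programmatically; below is a minimal sketch of the usual edits (all paths, the batch size, and the class count of 26 are assumptions to adapt to your setup):

    import tensorflow as tf
    from google.protobuf import text_format
    from object_detection.protos import pipeline_pb2

    CONFIG_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'

    # Read the copied SSD MobileNet V2 pipeline config.
    pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
    with tf.io.gfile.GFile(CONFIG_PATH, 'r') as f:
        text_format.Merge(f.read(), pipeline_config)

    # Point the config at this project's classes, checkpoint, and data.
    pipeline_config.model.ssd.num_classes = 26   # assumed: one class per alphabet
    pipeline_config.train_config.batch_size = 4
    pipeline_config.train_config.fine_tune_checkpoint = 'Tensorflow/workspace/pre-trained-models/ssd_mobilenet_v2/checkpoint/ckpt-0'
    pipeline_config.train_config.fine_tune_checkpoint_type = 'detection'
    pipeline_config.train_input_reader.label_map_path = 'Tensorflow/workspace/annotations/label_map.pbtxt'
    pipeline_config.train_input_reader.tf_record_input_reader.input_path[:] = ['Tensorflow/workspace/annotations/train.record']
    pipeline_config.eval_input_reader[0].label_map_path = 'Tensorflow/workspace/annotations/label_map.pbtxt'
    pipeline_config.eval_input_reader[0].tf_record_input_reader.input_path[:] = ['Tensorflow/workspace/annotations/test.record']

    # Write the updated config back.
    with tf.io.gfile.GFile(CONFIG_PATH, 'w') as f:
        f.write(text_format.MessageToString(pipeline_config))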
Train the model for the proposed system from the command prompt, using the respective Python scripts for training the deep learning / convolutional neural network model.
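A representative way to launch training (the training script ships in the cloned repository; paths and the step count are assumptions):

    import subprocess

    # Equivalent to running this in the command prompt:
    #   python models/research/object_detection/model_main_tf2.py
    #       --model_dir=... --pipeline_config_path=... --num_train_steps=10000
    subprocess.run([
        'python', 'models/research/object_detection/model_main_tf2.py',
        '--model_dir=Tensorflow/workspace/models/my_ssd_mobnet',
        '--pipeline_config_path=Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config',
        '--num_train_steps=10000',
    ], check=True)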
Once training is finished, the loss and epoch rates are analyzed to determine the model's prediction accuracy. The trained model is then loaded on the system.
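Restoring the trained model for inference can be sketched with the API's model builder (the checkpoint name ckpt-11 is an assumption; use the latest checkpoint written during training):

    import os
    import tensorflow as tf
    from object_detection.builders import model_builder
    from object_detection.utils import config_util

    CONFIG_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'
    CHECKPOINT_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet'

    # Rebuild the model from its pipeline config, then restore the trained weights.
    configs = config_util.get_configs_from_pipeline_file(CONFIG_PATH)
    detection_model = model_builder.build(model_config=configs['model'], is_training=False)
    ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
    ckpt.restore(os.path.join(CHECKPOINT_PATH, 'ckpt-11')).expect_partial()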
Finally, run the real-time detection code to launch the real-time detection screen, which draws a detection box around the sign shown to the camera in real time and displays the name of the predicted label together with its prediction accuracy rate on-screen.
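A minimal sketch of such a real-time loop, reusing the detection_model restored in the previous sketch (the 0.8 score threshold and the window title are assumptions):

    import cv2
    import numpy as np
    import tensorflow as tf
    from object_detection.utils import label_map_util
    from object_detection.utils import visualization_utils as viz_utils

    LABELMAP_PATH = 'Tensorflow/workspace/annotations/label_map.pbtxt'
    category_index = label_map_util.create_category_index_from_labelmap(LABELMAP_PATH)

    @tf.function
    def detect_fn(image):
        # Preprocess, predict, and postprocess with the restored detection_model.
        image, shapes = detection_model.preprocess(image)
        prediction_dict = detection_model.predict(image, shapes)
        return detection_model.postprocess(prediction_dict, shapes)

    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ret, frame = cap.read()
        image_np = np.array(frame)
        input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
        detections = detect_fn(input_tensor)

        num = int(detections.pop('num_detections'))
        detections = {k: v[0, :num].numpy() for k, v in detections.items()}
        classes = detections['detection_classes'].astype(np.int64) + 1  # label map ids start at 1

        viz_utils.visualize_boxes_and_labels_on_image_array(
            image_np,
            detections['detection_boxes'],
            classes,
            detections['detection_scores'],
            category_index,
            use_normalized_coordinates=True,
            min_score_thresh=0.8)              # only draw confident detections

        cv2.imshow('Real-time Sign Detection', image_np)
        if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
            break
    cap.release()
    cv2.destroyAllWindows()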
RESULTS :
The results obtained consist of a real-time object detection screen that uses the trained CNN model to detect the different sign language symbols shown by the user to the camera.
The device screen displays a detection box around the region of interest, that is, the sign shown by the user, along with the associated label.
The label shows the alphabet or number associated with the sign shown towards the webcam, together with the prediction accuracy rate, which indicates how confident the trained model is that the shown symbol is that particular alphabet or number.
The training process resulted in an average loss score of 0.086 over 10,000 epochs. Since the loss score is low, the model's predictions are close to the expected outputs.
The prediction accuracy rate for each alphabet or number varied between 96.0% and 99.0% during experimentation.