Skip to content

Latest commit

 

History

History
57 lines (44 loc) · 3.04 KB

File metadata and controls

57 lines (44 loc) · 3.04 KB

Image-to-speech-conversion-using-raspberry-pi trainyinfi-hardware-badge

In this project, we developed a device that converts an image’s text to speech. The basic framework is an embedded system that captures an image, extracts only the region of interest and converts that text to speech. Live demo here.

Table of Contents

General Information

  • It is implemented using a Raspberry Pi and a Raspberry Pi camera. The captured image undergoes a series of image pre-processing steps to locate only that part of the image that contains the text and removes the background(grayscale). Two tools are used convert the new image (which contains only the text) to speech. They are OCR (Optical Character Recognition) software and TTS (Text-to-Speech) engines. The audio output is heard through the raspberry pi’s audio jack using speaker.
  • OCR ENGINE:

The extraction of the text in the image is done using optical character recognition (OCR). For our project, we have used Tesseract OCR. It is the most accurate open source OCR engine and is powered by google. It can be used on the Linux, mac and windows platform. The newest Tesseract version, 3.4 supports a hundred languages. However, images must undergo a number of pre-processing stages like noise removal, scaling etc. otherwise the output will be of low quality.

  • TTS SOFTWARE:

A text to speech system(TTS) is used to perform speech synthesis. A TTS is composed of two parts: front end and back end. The front end converts the text to a symbol, for example, a number. Each symbol generated is assigned a phonetic. The back end then converts the phonetic into sound. In our project, we have used Festival TTS. Festival is the most widely used open source TTS. It has a wide variety of voices and support English, Spanish and welsh language. We have used the English language.

  • As a part of the software development, the Open CV (Open source Computer Vision) libraries are utilized for image processing.

Block Diagram

Block_diagram

Technologies Used

python pycharm opencv Pi

Output

New Project-min (2)

Project Status

Project is: complete

Created by @EshaMaheshwari @NikhilKakade @SurajJadhav