Audio Narration of a Scene for Visually Disabled using Smart Goggle
Keywords: Raspberry Pi, Tesseract OCR engine, Raspberry Pi camera board, OpenCV, Natural Language Processing, Natural Language Generation, Text to Speech (TTS) engine, Optical Character Recognition (OCR), Object detection
This work helps visually disabled people get an idea of what is in a captured image. Using several multimedia information processing techniques, the proposed device first acquires image attributes via the Pi Camera, then performs image-to-text conversion using the Tesseract and OpenCV libraries. Previously proposed approaches either used computer vision to determine labels or exploited existing descriptions of the training images to transfer or compose a new description for the test image. We propose an approach that uses image annotations to generate image descriptions, and we show that, given accurate object and attribute detection, human-like descriptions of images can be generated. The system is implemented in Python and uses a Text to Speech (TTS) engine to convert the generated text to speech.
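As an illustration of the annotation-to-description step, the following is a minimal sketch and not the authors' implementation. It assumes that object detection has already produced a list of labels with attributes and that OCR has produced a raw text string. The `Annotation` class, the `describe` function, and the sentence templates are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """One detected object and its attributes (hypothetical structure)."""
    label: str  # assumed non-empty, e.g. "car"
    attributes: List[str] = field(default_factory=list)  # e.g. ["red"]


def article_for(phrase: str) -> str:
    """Pick 'a' or 'an' from the first letter (a rough heuristic)."""
    return "an" if phrase[0].lower() in "aeiou" else "a"


def noun_phrase(ann: Annotation) -> str:
    """Build a phrase such as 'a red car' from attributes and label."""
    words = " ".join(ann.attributes + [ann.label])
    return f"{article_for(words)} {words}"


def describe(annotations: List[Annotation], ocr_text: str = "") -> str:
    """Compose a template-based description of the scene.

    `annotations` stands in for the output of object and attribute
    detection; `ocr_text` stands in for text recognised in the image.
    """
    if not annotations:
        sentence = "No objects were detected in the scene."
    else:
        phrases = [noun_phrase(a) for a in annotations]
        if len(phrases) == 1:
            listing = phrases[0]
        else:
            listing = ", ".join(phrases[:-1]) + " and " + phrases[-1]
        sentence = f"The scene contains {listing}."

    # Collapse line breaks and repeated spaces that OCR output often has.
    text = " ".join(ocr_text.split())
    if text:
        sentence += f' Visible text reads: "{text}".'
    return sentence


if __name__ == "__main__":
    scene = [Annotation("car", ["red"]), Annotation("umbrella", ["open"])]
    print(describe(scene, "STOP\n"))
```

In a complete device, the returned string would be passed to the TTS engine. A fixed template keeps the output predictable, at the cost of less varied phrasing than a learned language generation model would give.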
Copyright (c) 2022 Pratyush Pratap Singh, Sharath S. Hegde, R. Varun, Vivek Hegde, K. A. Sumithra Devi
This work is licensed under a Creative Commons Attribution 4.0 International License.