Research And Implementation On Sign Language-lip Language Conversion System Based On Monocular Vision

Posted on:2021-03-09

Degree:Master

Type:Thesis

Country:China

Candidate:Z Zhou

GTID:2415330620473749

Subject:Control Science and Engineering

Abstract/Summary:

In language teaching at special schools for hearing-impaired students, the bilingual teaching mode can effectively improve the language learning efficiency of hearing-impaired children, but it demands more patience, time, and energy from special education teachers. Given the current shortage of teachers in Chinese special education schools, sign language recognition technology can help these teachers complete their language teaching tasks: deaf students record a sign language video as the computer's input and study the output, which consists of Chinese characters and lip language, without much help from teachers. This personalized teaching also supports the study of written Chinese. In addition, the computer recognizes only standard sign language (based on Chinese Sign Language), which can also help correct the sign language dialects of deaf children. This thesis studies a sign language-lip language conversion system based on monocular vision. The main difficulties lie in sign language recognition, and the specific work is as follows:

1. Extraction of video key frames. First, four common video key frame extraction methods are briefly analyzed. To eliminate as many redundant frames as possible while still extracting a complete set of key frames, a cluster-based video key frame optimization and extraction algorithm is proposed. The deep features of the video frames are extracted by a convolutional autoencoder (CAE) neural network. After the extracted features are clustered by K-means, the clearest video frame in each cluster is selected as a key frame for the initial extraction. The point density method is then used for secondary optimization. Experimental results show that the algorithm eliminates a large number of redundant frames while preserving the integrity of the key frames.

2. Gesture recognition on the key frames. Because the hand is a small target, several improvements are applied to the SSD target detection network: the weights of important channels are increased by embedding SE-Net into the feature layers of SSD; the imbalance between positive and negative samples is countered by changing the loss function; and network training is optimized with mixup and normalization. Experimental results show that the improved SSD achieves higher recognition accuracy.

3. Realization of the sign language-lip language conversion system. To keep the system practical and easy to popularize, a color sign language video recorded by a monocular camera serves as the system input. So that sign language can be expressed naturally, signers do not need to wear any equipment or place any markers on their hands. The first output of the system is Chinese characters and pinyin, and the second is a lip language video corresponding to the Chinese characters. Finally, Vue.js and Spring Boot are used to build a web page that presents the whole system.

The users of this system are deaf children. It is hoped that they can learn Chinese, both written and spoken, through the sign language they already know, without repetitive instruction from teachers. The system could play an auxiliary role in language teaching at schools for the deaf. Because the whole system needs only a monocular camera, with no other equipment or aids, it is more practical, easier to popularize, and has greater application prospects.
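The initial key frame extraction described in item 1 can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the CAE features are already available as a hypothetical `features` array (one row per frame), uses a plain K-means with a deterministic farthest-point initialization, and scores "clearness" by the variance of a Laplacian, which is one common sharpness measure chosen here as an assumption. The point density optimization step is not shown. All function names are illustrative.

```python
import numpy as np


def laplacian_variance(gray):
    """Sharpness score (assumed measure): variance of a 4-neighbour Laplacian."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def farthest_point_init(features, k):
    """Deterministic seeding: start at row 0, then repeatedly take the farthest row."""
    centers = [features[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[int(d.argmax())])
    return np.array(centers, dtype=np.float64)


def kmeans_labels(features, k, iters=50):
    """Plain Lloyd's K-means; returns one cluster label per feature row."""
    features = np.asarray(features, dtype=np.float64)
    centers = farthest_point_init(features, k)
    labels = None
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = features[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels


def select_key_frames(frames, features, k):
    """Cluster frames by feature vector and keep the sharpest frame of each cluster."""
    labels = kmeans_labels(features, k)
    keys = []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        keys.append(int(max(idx, key=lambda i: laplacian_variance(frames[i]))))
    return sorted(keys)
```

Here `frames` is a list of 2-D grayscale arrays and `k` is the number of key frames wanted; the function returns the selected frame indices in temporal order.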
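The channel reweighting that an SE (squeeze-and-excitation) block performs, as mentioned in item 2, can be sketched as a forward pass in NumPy. This is a generic illustration of the SE idea under stated assumptions, not the thesis network: the weight matrices `w1`, `w2` and biases `b1`, `b2` are hypothetical parameters that would normally be learned, and how the block is wired into the SSD feature layers is not shown.

```python
import numpy as np


def se_block(feature_map, w1, b1, w2, b2):
    """Squeeze-and-excitation forward pass on a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r), where r is the
    reduction ratio (an assumed hyperparameter).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = feature_map.mean(axis=(1, 2))
    # Excitation: bottleneck MLP with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(0.0, w1 @ squeezed + b1)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))
    # Scale: each channel is multiplied by its gate in (0, 1).
    return feature_map * gates[:, None, None], gates
```

Channels whose gate is close to 1 pass through almost unchanged, while channels with a small gate are suppressed, which is the sense in which important channels receive larger weights.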

Keywords/Search Tags:

sign language recognition, K-means, key frame extraction, target detection network SSD

Related items

1	Research On Recognition And Error Detection Technology For Piano Playing Music
2	Research On Sign Language Recognition Method Based On Modal Fusion
3	Research On Chinese Isolated Sign Language Recognition Based On Channel State Information
4	Research On Scene Mongolian Character Detection And Recognition Based On Deep Learning
5	Brush Stroke Extraction Based On BP Neural Network
6	Research And Application Of Automatic Detection Of Thangka Elements Based On Deep Learning
7	Extraction And Generation Of Sketches Of Painted Cultural Relics Based On Deep Learning
8	Music Genre Recognition Research Based On Improved AlexNet
9	Research On Character Recognition Of Xixia Ancient Books Based On Optimization Segmentation And Extraction
10	Research On The Extraction Method Of Music Melody And Its Application