
Dynamic Sign Language Gesture Recognition Based On Lightweight Deep Learning Network

Posted on: 2023-07-03    Degree: Master    Type: Thesis
Country: China    Candidate: F Z L Huang    Full Text: PDF
GTID: 2544306800966579    Subject: Software engineering
Abstract/Summary:
Sign language is one of the main means of communication in the daily life of the deaf community and plays an important role within it. With the rapid development of deep learning, more and more researchers have devoted themselves to sign language recognition technology to ease communication difficulties among deaf people and between deaf and hearing people. However, several problems in the field of dynamic sign language recognition remain open. (1) Sign words are described by a series of complex, changing gestures with considerable similarity and redundancy among them; moreover, current mainstream sign language recognition models carry large parameter counts, making them very complex and difficult to fit to lightweight-design and real-time requirements. (2) Current sign language translation algorithms suffer from slow inference speed and difficult model convergence, and they struggle to account for both long-range temporal information and local feature information.

To address problem (1), this thesis proposes an isolated sign word recognition algorithm based on key frame extraction and a lightweight neural network, which reduces both the amount of network input data and the computational complexity of the model. A key frame extraction algorithm based on clustering optimized by a convolutional autoencoder is designed to reduce redundancy in the original input sequence and obtain representative key frames. A VTN-C network is then designed to extract and analyze features of the sign sequences, further shrinking the model's parameter count while recognizing sign words. To verify the effectiveness of the algorithm, experiments are conducted on the SLR_Dataset and AUTSL_Dataset. The results show that the proposed VTN-C network achieves 93.6% and 91.3% accuracy for isolated sign word recognition on the two datasets, respectively, an excellent result compared with other existing methods.

To address problem (2), this thesis proposes a two-branch parallel sign language translation model based on CNNs and dilated convolution, called the DCC-SLT network. It exploits the strong local feature response of CNNs over short time spans and the powerful long-range temporal modeling ability of stacked dilated convolutions, effectively integrating local responses on the time axis with long-range contextual information and improving the overall convergence speed of the model, yielding good translation results. To verify the effectiveness of the algorithm, experiments are conducted on the RWTH-PHOENIX-Weather 2014 dataset. The results show that the proposed DCC-SLT network reaches an error rate of 37.2% on continuous sign language translation, lower than other existing methods, and its parameter count is much smaller than that of existing models.
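The abstract gives no implementation details, so the following is a minimal sketch of the key frame extraction idea it describes: encode each frame with a convolutional autoencoder, cluster the per-frame codes, and keep the frame nearest each cluster center. The layer sizes, the 64x64 frame resolution, the number of key frames k, and the medoid-selection rule are illustrative assumptions, not values from the thesis.

```python
# Hypothetical sketch: key-frame extraction via a convolutional
# autoencoder + k-means clustering. All sizes are assumptions.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: compress each 3x64x64 frame to a compact feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: reconstructs the frame; training with an MSE loss would
        # force the encoder to learn a representation of frame appearance.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def extract_key_frames(frames, model, k=16):
    """Cluster per-frame codes; keep the frame nearest each centroid.

    frames: (T, 3, 64, 64) tensor for one sign video.
    Returns the sorted indices of k representative frames.
    """
    model.eval()
    with torch.no_grad():
        _, z = model(frames)
    codes = z.flatten(1).cpu().numpy()           # (T, D) feature vectors
    km = KMeans(n_clusters=k, n_init=10).fit(codes)
    key_idx = []
    for c in range(k):
        members = (km.labels_ == c).nonzero()[0]
        dists = ((codes[members] - km.cluster_centers_[c]) ** 2).sum(1)
        key_idx.append(int(members[dists.argmin()]))  # medoid of cluster c
    return sorted(key_idx)                       # preserve temporal order

frames = torch.rand(120, 3, 64, 64)              # dummy 120-frame clip
print(extract_key_frames(frames, ConvAutoencoder(), k=16))
```

Likewise, the two-branch idea behind DCC-SLT can be sketched as a pair of parallel temporal convolution stacks over per-frame features: a small-kernel CNN branch for short-range local responses, and a stacked dilated-convolution branch whose receptive field grows with the dilation rate for long-range context. The channel widths, depths, dilation rates (1, 2, 4, 8), and the per-step gloss head below are illustrative assumptions, not the thesis's architecture.

```python
# Hypothetical sketch of a two-branch CNN + dilated-convolution model.
import torch
import torch.nn as nn

class DualBranchTemporal(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, vocab=1000):
        super().__init__()
        # Local branch: small receptive field for short-range cues.
        self.local = nn.Sequential(
            nn.Conv1d(feat_dim, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Context branch: stacked dilated convolutions grow the receptive
        # field exponentially (dilations 1, 2, 4, 8) without pooling.
        layers, ch = [], feat_dim
        for d in (1, 2, 4, 8):
            layers += [nn.Conv1d(ch, hidden, 3, padding=d, dilation=d), nn.ReLU()]
            ch = hidden
        self.context = nn.Sequential(*layers)
        # Fuse the two streams and predict a gloss distribution per step.
        self.head = nn.Conv1d(2 * hidden, vocab, kernel_size=1)

    def forward(self, x):                 # x: (B, feat_dim, T) frame features
        fused = torch.cat([self.local(x), self.context(x)], dim=1)
        return self.head(fused)           # (B, vocab, T) per-step logits

feats = torch.rand(2, 512, 200)           # dummy 200-frame feature sequences
print(DualBranchTemporal()(feats).shape)  # torch.Size([2, 1000, 200])
```

Setting padding equal to the dilation rate with kernel size 3 keeps the sequence length unchanged in every layer, so the two branches stay time-aligned and can be concatenated channel-wise before the classifier.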
Keywords/Search Tags:Lightweight, Dynamic sign language gesture recognition, VTN-C model, Dynamic sign language gesture translation, DCC-SLT model