Design And Implementation Of Online Video Communication System Based On Audio-Visual Speech Enhancement

Posted on:2024-02-13

Degree:Master

Type:Thesis

Country:China

Candidate:J N Yin

GTID:2568306941489754

Subject:Computer technology

Abstract/Summary:

In recent years, driven by economic globalization, the spread of cross-site collaboration and telecommuting, and the comprehensive coverage of high-speed mobile networks, online video communication has become an indispensable part of people's work and life. However, the traditional speech enhancement algorithms widely used in online video communication systems deliver inadequate noise reduction, and the resulting shortcomings in user experience in noisy environments are increasingly apparent. Against this background, this thesis analyzes the advantages and disadvantages of various speech enhancement algorithms and studies the application of deep learning-based audio-visual speech enhancement to video communication systems, making full use of the visual information that video communication already carries in order to improve on the speech enhancement performance of a conventional WebRTC communication system. The topic has both theoretical and practical significance.

First, the thesis discusses the shortcomings of existing audio-visual speech enhancement methods when applied to video communication scenarios. To address the instability of visual features in video communication, it introduces lip sync methods as visual features and designs an automatic feature switching mechanism. It then further improves the switching mechanism by introducing a non-intrusive speech quality evaluation method. In tests across various scenarios, the proposed method shows better speech enhancement performance than the existing algorithm.

Second, existing implementations of audio-visual speech enhancement algorithms are mainly oriented to offline scenarios and cannot directly meet real-time audio processing needs. The thesis designs and implements an online audio-visual speech enhancement media processing pipeline, built on the GStreamer media processing framework and adapted to streaming transmission, which ensures that the audio-visual speech enhancement algorithm runs efficiently in a video communication system.

Finally, based on the proposed algorithm and processing pipeline, the thesis designs and implements an online video communication system. The system is compatible with the WebRTC communication protocol and integrates the proposed online audio-visual speech enhancement pipeline through an SFU architecture to provide noise reduction backed by audio-visual speech enhancement. Tests show that the system significantly improves speech enhancement performance compared with the native WebRTC communication system and meets real-time requirements in terms of processing delay and other indicators, demonstrating the feasibility of applying audio-visual speech enhancement in video communication and the value of the system for enhancing speech quality and improving user experience.
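The abstract does not give the details of the automatic feature switching mechanism, so the following is only a minimal sketch of one way such a switch could work, not the thesis's actual method. It assumes that several enhancement branches (for example, one using lip-sync visual features and an audio-only fallback) each produce an enhanced audio segment, and that a non-intrusive quality estimator scores each segment without a clean reference. The function name `switch_branch`, the branch labels, and the `margin` parameter (a guard against rapid flapping between branches) are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

Audio = Sequence[float]  # one segment of enhanced audio samples


def switch_branch(
    outputs: Dict[str, Audio],
    score: Callable[[Audio], float],
    current: Optional[str] = None,
    margin: float = 0.0,
) -> str:
    """Return the name of the branch whose output should be used.

    outputs: enhanced audio per branch (branch names are hypothetical).
    score:   non-intrusive quality estimator, higher is better
             (assumed; the abstract does not name a specific metric).
    current: branch used for the previous segment, if any.
    margin:  the best branch must beat the current one by more than
             this amount before a switch happens (an assumption added
             here to avoid rapid switching).
    """
    scores = {name: score(audio) for name, audio in outputs.items()}
    best = max(scores, key=scores.get)
    # Stay on the current branch unless the best one is clearly better.
    if current in scores and scores[best] - scores[current] <= margin:
        return current
    return best
```

In use, `score` would be replaced by a real non-intrusive speech quality model, and `switch_branch` would be called once per processed segment with the previous result passed back in as `current`.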

Keywords/Search Tags:

audio-visual speech enhancement, media processing pipeline, streaming technology, video communication

Related items

1	Design And Implementation On Audio-video Synchronization In Streaming-media System
2	Speech Endpoint Detection Based On Audio And Visual Features
3	Audio And Video Synchronization Research Of Video Chat System
4	Digital Organisms Mid-stream Media Data Receiving And Processing
5	Research On Speech Separation Based On Visual Assistance
6	Quality Evaluation And Enhancement Of Immersive Visual Media Experience
7	The Design And Implementation Of Streaming Media Client Based On Android
8	Design And Realization Of Audio/Video Broadcasting And On Demand System Based On Streaming Media Technology
9	Digital Video System Of Court Trial Based On Streaming Media
10	Design And Implementation Of Embedded P2P Streaming Media Server