
Reading Digital Video Clocks Using Deep Learning

Posted on: 2022-11-10    Degree: Master    Type: Thesis
Country: China    Candidate: Z P Chen    Full Text: PDF
GTID: 2558306347951029    Subject: Computer Science and Technology
Abstract/Summary:
How to effectively manage and utilize massive amounts of video media data has long been an active research problem. The clock information shown in a video is key information in video media data, and reading it automatically makes such data easier to manage and use. However, as a special type of video text, clock digits have relatively low resolution and come in a variety of colors and sizes. As a result, traditional heuristic-based clock reading algorithms have poor robustness and weak generalization, and suit only a narrow range of video clock types. When the video clock type changes, many threshold parameters have to be adjusted by hand.

In recent years, artificial intelligence has entered the so-called "AI 2.0" era, a wave driven mainly by the development of deep learning. Compared with traditional machine learning algorithms, deep learning, driven by large amounts of data, has gradually shown superior robustness and generalization. For this reason, this thesis studies how to apply deep learning techniques to the clock reading problem in order to overcome the poor robustness and weak generalization of traditional clock reading algorithms.

Clock reading based on traditional heuristic algorithms is generally divided into two steps: localization and recognition. In contrast, the approach in this thesis divides clock reading into three steps: clock initial region localization, clock time region localization, and clock time recognition. Accordingly, the research work is divided into three parts:

(1) Clock initial region localization. This thesis extracts new features of second pixels (the pixels of the seconds digit) and uses PRN (Pixel Recognition Network), which is based on a convolutional neural network, to locate the initial region of the clock. Compared with traditional heuristic algorithms, PRN is more robust and applicable to a wider range of videos. Even when the video is no longer than 5 s, it can still capture the pixel region of the ones digit of the seconds for different types of video clocks. Its second-pixel accuracy on the test set is 0.9895.

(2) Clock time region localization. This thesis uses a deep learning-based object localization algorithm to locate the clock time region. Whereas previous algorithms locate each character of the clock time one by one, the algorithm in this thesis directly locates the bounding box of the clock time region, without additional image processing and analysis steps, and is more robust. The experimental results show that the method can accurately locate clock time regions that differ in color, size, background, and position, and the avg-iou value on the test set reaches 0.9426.

(3) Clock time recognition. This thesis uses CTRN (Clock Time Recognition Network), an end-to-end model based on a convolutional neural network, to recognize the clock time. Unlike previous methods that recognize each clock time character one by one, CTRN directly recognizes the current time in the clock time region without additional image analysis and processing steps, and is more robust. The experimental results show that the clock time recognition accuracy of CTRN on the validation set is 0.9985, and its recognition accuracy on an additional hybrid clock time data set is 0.9957.
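The avg-iou figure reported for clock time region localization is not defined in the abstract. A minimal sketch, assuming it is the mean intersection-over-union between each predicted clock time box and its ground-truth box, is given below. The box format `(x_min, y_min, x_max, y_max)` and the function names `iou` and `average_iou` are illustrative assumptions, not taken from the thesis.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_iou(preds: Sequence[Box], truths: Sequence[Box]) -> float:
    """Mean IoU over paired predicted and ground-truth boxes (assumed metric)."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not preds:
        raise ValueError("at least one box pair is required")
    return sum(iou(p, t) for p, t in zip(preds, truths)) / len(preds)
```

Under this reading, a value of 0.9426 would mean that predicted clock time boxes overlap their ground-truth boxes almost completely on average.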
Keywords/Search Tags: video clock, clock time localization, clock time recognition, convolutional neural network