Research And Implementation Of Uyghur-Chinese Translation Software Based On Optical Character Recognition

Posted on:2019-06-27

Degree:Master

Type:Thesis

Country:China

Candidate:E D T L J Mai

GTID:2428330566466613

Subject:Control Science and Engineering

Abstract/Summary:

With the rapid development of science and technology, smartphones have become steadily more popular thanks to their portability and increasingly sophisticated functions, and the number of mobile Internet users keeps growing. OCR technology for English, Chinese and other languages is now quite mature at home and abroad, and using OCR to translate text in a given language has become increasingly common. In Xinjiang, however, research on using OCR to recognize and translate Uyghur text is not yet mature. Research on Uyghur OCR and machine translation will therefore play an active role in Xinjiang's economic construction, in cultural exchange among all ethnic groups, and in accelerating the development of Uyghur text information processing. This thesis studies Uyghur OCR and Uyghur-Chinese statistical machine translation, trains a Uyghur text recognition model on the Tesseract-OCR platform, and on that basis develops an Android application that integrates Uyghur optical character recognition and translation, recognizing the text in Uyghur images and translating it in real time. First, for Uyghur character image recognition, the system preprocesses the target image with local adaptive threshold binarization and a morphological closing operation to improve the recognition success rate of Tesseract-OCR. A scaled watershed segmentation algorithm segments the Uyghur images, and the Tesseract engine is then trained on Uyghur. Then, for vocabulary storage and translation, 49,000 Uyghur words and parallel sentence pairs were prepared. The NiuTrans Server toolkit was used to build the Uyghur-Chinese translation system, and the translation function was deployed on the Azure cloud platform to provide APIs for the client. Finally, the Android client was implemented in Java in an Android integrated development environment.
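
The abstract names two preprocessing steps, local adaptive threshold binarization and morphological closing, without giving their implementation. The following is a minimal pure-Python sketch of those two techniques under stated assumptions: the function names, the 3x3 window, the `offset` parameter, and the border handling (windows clipped to the image) are all illustrative choices, not details taken from the thesis, and a real Android application would more likely rely on an image-processing library.

```python
def adaptive_threshold(gray, window=3, offset=10):
    """Mark a pixel as text (1) if it is darker than its local mean minus offset.

    gray is a list of rows of grayscale values (0 = black, 255 = white).
    window and offset are illustrative defaults, not values from the thesis.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood clipped to the image bounds.
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            local_mean = sum(vals) / len(vals)
            out[y][x] = 1 if gray[y][x] < local_mean - offset else 0
    return out


def _neighbourhood(binary, y, x, r):
    """Values in the (2r+1) x (2r+1) window around (y, x), clipped to the image."""
    h, w = len(binary), len(binary[0])
    return [binary[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]


def dilate(binary, r=1):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return [[1 if any(_neighbourhood(binary, y, x, r)) else 0
             for x in range(len(binary[0]))]
            for y in range(len(binary))]


def erode(binary, r=1):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return [[1 if all(_neighbourhood(binary, y, x, r)) else 0
             for x in range(len(binary[0]))]
            for y in range(len(binary))]


def close_gaps(binary, r=1):
    """Morphological closing: dilation followed by erosion.

    Fills gaps narrower than the window, which can reconnect broken strokes.
    """
    return erode(dilate(binary, r), r)


if __name__ == "__main__":
    # One dark pixel on a light background is detected as text.
    page = [[200, 200, 200],
            [200, 50, 200],
            [200, 200, 200]]
    print(adaptive_threshold(page))

    # A one-pixel break in a horizontal stroke is filled by closing.
    stroke = [[1, 1, 0, 1, 1]]
    print(close_gaps(stroke))
```

In this sketch the threshold is computed per pixel from its neighbourhood, so uneven lighting across a photographed page affects each region separately, and the closing step only bridges gaps smaller than the chosen window, leaving wider spaces between characters intact.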

Keywords/Search Tags:

OCR, Azure cloud platform, Android, Tesseract, Statistical machine translation

Related items

1	Research On Application Of Storage System Based On Microsoft Cloud Platform Microsoft Azure
2	The Design And Implementation Of E-Invoice Platform Based On Tesseract
3	Research And Implementation Of Data Synchronization Based On Cloud Computing
4	Application Of Machine Translation In RS10 Cloud Platform Products
5	Design And Implementation Of Report Sharing System Based On Microsoft Azure Cloud Platform
6	Design And Implementation Of Portable Translation Tool Based On Cloud Service On Android Platform
7	Design And Implementation Of Mobile Phone Translation Software Based On Tesseract
8	The Research And Application Of Equipment Remote Monitoring Technology Based On Cloud Platform
9	Design And Implementation Of ERP Platform For Small And Medium Enterprises Based On Windows Azure
10	Transformation And Realization Of Online Shopping System Based On Windows Azure Platform