
Research On The Construction Of Bilingual Corpus For Agricultural Information

Posted on: 2024-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: K R M Y S Mo
Full Text: PDF
GTID: 2543307115469374
Subject: Agricultural engineering and information technology
Abstract/Summary:
Xinjiang is one of China’s important agricultural production bases, and Uyghur is widely used by a large number of people working in agricultural production. Although machine translation systems exist and translation models continue to evolve, the accuracy with which they translate agricultural terminology has not reached the desired level. This paper therefore proposes building a bilingual Uyghur-Han corpus for agricultural information to make the sharing and dissemination of agricultural production expertise more efficient. The main research contents are as follows:

(1) Collection and preprocessing of bilingual Uyghur and Chinese agricultural information. Corpus methods are combined with information technology to collect and organize agriculture-related texts from literature, news reports, professional dictionaries, daily conversations, and other sources. The collected bilingual material is manually reviewed and proofread, and professional corpus construction software is used for statistical analysis and management. On this basis, the construction of the bilingual corpus for agricultural information is completed.

(2) Construction of a bilingual corpus of agricultural information based on machine learning. During construction, text classification and word segmentation techniques are used to convert the preprocessed text into a corpus and divide it into topics and subtopics. To improve corpus quality, a three-stage construction method is adopted: text preprocessing, corpus construction, and corpus evaluation. Text preprocessing includes text cleaning, word segmentation, part-of-speech tagging, named entity recognition, and related steps, which raise the text quality of the corpus and improve the accuracy of information retrieval. The corpus construction stage converts the preprocessed text into a corpus, including establishing the corpus structure and storing the corpora. The corpus evaluation stage assesses the quality and effectiveness of the corpus, including checking for erroneous and inaccurate translations, while also optimizing the performance and functionality of the corpus.

(3) Design and implementation of a Uyghur-Han bilingual agricultural information corpus system based on the .NET Core framework. Building on the bilingual corpus for agricultural information, this paper designs and implements an agricultural information corpus system. The system helps users search and browse the corpus more conveniently, offers good accuracy and practicality, can support agricultural production, and further promotes agricultural informatization and the spread of agricultural technology.
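
The three-stage method described in the abstract (text preprocessing, corpus construction, corpus evaluation) can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which the abstract does not detail. The `Entry` schema, the function names, and the length-ratio check with its `max_ratio` threshold are all assumptions made for illustration, and the sketch leaves out word segmentation, part-of-speech tagging, and named entity recognition.

```python
import re
import unicodedata
from dataclasses import dataclass


@dataclass
class Entry:
    """One aligned Uyghur-Han pair with a topic label (hypothetical schema)."""
    uyghur: str
    chinese: str
    topic: str


def clean(text: str) -> str:
    """Stage 1 (preprocessing): normalize Unicode and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def build_corpus(raw_pairs):
    """Stage 2 (construction): clean each pair and group entries by topic."""
    corpus = {}
    for uyghur, chinese, topic in raw_pairs:
        entry = Entry(clean(uyghur), clean(chinese), topic)
        corpus.setdefault(topic, []).append(entry)
    return corpus


def flag_suspect(corpus, max_ratio=6.0):
    """Stage 3 (evaluation): flag pairs with an empty side or an extreme
    character-length ratio so a human reviewer can check them.
    The threshold is an illustrative assumption, not a value from the thesis."""
    suspects = []
    for entries in corpus.values():
        for e in entries:
            if not e.uyghur or not e.chinese:
                suspects.append(e)
                continue
            ratio = len(e.uyghur) / len(e.chinese)
            if ratio > max_ratio or ratio < 1 / max_ratio:
                suspects.append(e)
    return suspects
```

Calling `build_corpus` on a list of `(uyghur, chinese, topic)` tuples returns a dictionary keyed by topic. `flag_suspect` then returns the entries that should go back to manual proofreading, which corresponds to the manual review step the abstract mentions.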
Keywords/Search Tags:Uighur Chinese Bilingual, Word alignment, natural language processing, corpu