Research On Construction And Prediction Of Spanish Pronunciation Dictionary For Military Field

Posted on: 2021-12-20  Degree: Master  Type: Thesis
Country: China  Candidate: J G Zhao  Full Text: PDF
GTID: 2415330647457245  Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
In recent years, with the continuous advancement of technologies such as artificial intelligence and big data, data-driven methods have become more widely used in natural language processing. This thesis uses a data-driven approach to predict the pronunciation of Spanish vocabulary, and thereby to construct a Spanish pronunciation dictionary rapidly. In Spanish speech synthesis and speech recognition systems, the pronunciation dictionary is a basic resource that carries the pronunciation information of the vocabulary, and its labeling accuracy and scale directly affect the performance of the entire system.

Spanish is an inflectional language, which relies on morphological changes to express person, tense, voice, number, and part of speech. The large number of inflected forms sharply increases the number of words that an intelligent speech processing system has to recognize. At the same time, Spanish continues to produce new words, which makes it difficult for a pronunciation dictionary to cover all words. Automatic labeling with grapheme-to-phoneme conversion (G2P) technology can effectively address the problem of words missing from the pronunciation dictionary. Research on Spanish grapheme-to-phoneme conversion is still relatively scarce in China and abroad.

This thesis studies the pronunciation rules of Spanish vocabulary, manually labels a small-scale pronunciation dictionary, and uses data-driven grapheme-to-phoneme conversion to predict dictionary entries automatically. The main results of this research are as follows:

(1) The pronunciation characteristics of Spanish words were studied, a Spanish phoneme set containing 44 phonemes was designed, and a general Spanish pronunciation dictionary covering 91040 entries was constructed through manual labeling and checking.

(2) Specific Spanish military websites were selected, and a Spanish military vocabulary set covering 22416 entries was built by directly downloading vocabulary and by web crawling and filtering with Python.

(3) After studying and comparing different statistical models and neural network models, an end-to-end vocabulary transcription model based on character embedding + recurrent neural network (RNN) + connectionist temporal classification (CTC) is proposed. With this model, pronunciation prediction for the Spanish military vocabulary was completed, and its accuracy reached 91.88%.

(4) Using the Streamlit framework for Python, a Spanish military vocabulary pronunciation prediction platform based on the browser/server (B/S) architecture was built. It displays the structure of the trained model and exposes the pronunciation prediction function, which helps with the promotion and application of the research results.

In summary, this research uses neural network models trained on a small-scale Spanish pronunciation dictionary to predict the pronunciation of Spanish military vocabulary, and thereby completes the rapid construction of a Spanish pronunciation dictionary.
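The abstract does not say how the CTC output is turned into a phoneme sequence or how the accuracy figure is computed. As a rough illustration only, the sketch below shows standard greedy CTC decoding (merge consecutive repeated labels, then drop the blank symbol) together with an exact-match word accuracy. The blank symbol `_`, the function names, and the phoneme symbols are assumptions made for this example; they are not taken from the thesis or its 44-phoneme set.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol; the thesis does not specify one


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Turn per-frame best labels into a phoneme sequence.

    Standard CTC collapsing: merge runs of identical labels first,
    then remove blanks.
    """
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


def word_accuracy(predicted, reference):
    """Fraction of words whose whole phoneme sequence matches exactly."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical per-frame labels for the word "casa" (illustrative symbols).
frames = ["k", "k", "_", "a", "s", "s", "_", "a", "a"]
decoded = ctc_greedy_decode(frames)
print(decoded)
```

Because repeats are merged before blanks are removed, a genuinely doubled phoneme survives only when a blank separates its two occurrences, which is why the blank symbol is needed at all. Whether the reported 91.88% is a word-level exact-match score of this kind or a phoneme-level score is not stated in the abstract.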
Keywords/Search Tags:Spanish, Pronunciation Dictionary, Grapheme-to-Phoneme Conversion, Recurrent Neural Network, Connectionist Temporal Classification