Research on the Construction of an Automatic Scoring Model for Picture Description Speaking Questions in Chinese as a Foreign Language

Posted on: 2020-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: G C Tang
GTID: 2435330578478291
Subject: Modern education technology
Abstract/Summary:
This study takes the picture description question type in the HSKK (Intermediate) as an example. It uses intelligent speech and natural language processing techniques to extract scoring features that can effectively evaluate responses to this question type, then constructs a scoring model through regression analysis and verifies its validity.

First, the study analyzes the characteristics, examination requirements and scoring standards of the picture description question type, and divides the scoring features into three aspects: content relevance, expression fluency and grammatical accuracy. The content relevance features are keyword coverage rate and amount of language output. The expression fluency features are pronunciation, pause frequency, and the number of repetitions and corrections. Grammatical accuracy is characterized by the number of grammatical errors.

Second, intelligent speech and natural language processing technologies are used to extract the scoring features. The keyword coverage rate is calculated by the formula kcr = m/n, using Tencent AI's keyword retrieval technology. For the amount of language output, Tencent AI's long speech recognition technology converts the candidate's spoken answer into text, the converted text is proofread, and the number of words is counted. For pronunciation, the first steps are the same as for the amount of language output; the voice evaluation technology of IFLYTEK is then used to obtain the candidate's pronunciation standard degree value. For pause frequency, endpoint detection based on double thresholds of short-time energy and zero-crossing rate first separates the voiced segments from the silent segments in the answer speech; the number of silent segments (excluding the first and last pauses) and the total duration of the pronunciation are then counted, and the number of pauses per minute is used as the pause frequency. Because repetition and correction in spoken language are complex, the number of repetitions and corrections is obtained mainly by manual annotation. To count grammatical errors, the speech is first converted into text, and the text is then checked mainly with the "Xiaohongbi" automatic text proofreading technique.

Finally, the automatic scoring model is built. A total of 70 answer speech samples were collected and randomly divided into two groups: a construction group (50) and an inspection group (20). Based on the construction group data, the average score of three scorers was used as the dependent variable and the extracted scoring features as the independent variables, and multivariate stepwise linear regression was applied. Four scoring features entered the final regression equation: keyword coverage rate (kcr), amount of language output (nwords), number of repetitions and corrections (rac), and number of grammatical errors (nge). The resulting automatic scoring model for the picture description question type is:

Score = 2.52 + 8.223*kcr + 0.073*nwords - 0.903*rac - 0.397*nge

After the scoring model was constructed, its performance was tested based on the original data. The overall correlation between the predicted scores and the original scores is 0.832, and the agreement rate and the adjacent agreement rate are 70% and 100% respectively. These results verify the validity of the extracted scoring features and of the scoring model constructed in this study.
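The regression equation and two of the feature formulas described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's implementation. All function and parameter names are hypothetical. The abstract does not define m and n, so reading them as the number of expected keywords found in the answer and the total number of expected keywords is an assumption. The abstract also does not define the agreement rates, so the version here (scores rounded to integers, exact match versus a difference of at most one point) is an assumption as well.

```python
def keyword_coverage_rate(matched_keywords, total_keywords):
    """kcr = m / n. ASSUMPTION: m = expected keywords found in the answer,
    n = total expected keywords; the abstract does not define m and n."""
    if total_keywords <= 0:
        raise ValueError("total_keywords must be positive")
    return matched_keywords / total_keywords


def pause_frequency(num_silent_segments, speech_duration_seconds):
    """Pauses per minute. The caller is expected to have already excluded
    the first and last silent segments, as the abstract describes."""
    if speech_duration_seconds <= 0:
        raise ValueError("speech_duration_seconds must be positive")
    return num_silent_segments / (speech_duration_seconds / 60.0)


def predict_score(kcr, nwords, rac, nge):
    """Regression equation as reported in the abstract."""
    return 2.52 + 8.223 * kcr + 0.073 * nwords - 0.903 * rac - 0.397 * nge


def agreement_rates(predicted, human):
    """Return (exact, adjacent) agreement rates.
    ASSUMPTION: both scores are rounded to integers; 'exact' means equal,
    'adjacent' means the rounded scores differ by at most one point."""
    if not predicted or len(predicted) != len(human):
        raise ValueError("score lists must be non-empty and of equal length")
    diffs = [abs(round(p) - round(h)) for p, h in zip(predicted, human)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    adjacent = sum(d <= 1 for d in diffs) / len(diffs)
    return exact, adjacent


if __name__ == "__main__":
    # Made-up feature values for one answer, for illustration only.
    kcr = keyword_coverage_rate(matched_keywords=4, total_keywords=8)
    print(predict_score(kcr=kcr, nwords=40, rac=1, nge=2))
```

Pause frequency is computed here only to illustrate the feature definition; according to the abstract it did not enter the final regression equation, so `predict_score` does not take it as an input.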
Keywords/Search Tags:picture description, Chinese oral test, automatic scoring model, multiple regression analysis