Text Classification Of Chinese Recipes

Posted on: 2023-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: H Sun
Full Text: PDF
GTID: 2531307088468754
Subject: Computer technology
Abstract/Summary:
With the development of Internet services and the steady improvement in the performance of mobile phones and computers, the volume of text data on the Internet has grown exponentially, driven by a huge number of active users. In this context, recipes written in Chinese appear online in new forms such as websites, apps and mini-programs. On these shared platforms, recipe text can be searched quickly and easily, which provides reliable data for the study of Chinese recipes. Chinese home-cooked food is inseparable from people’s lives and is part of Chinese culture. Moreover, Chinese cuisine involves a wide variety of ingredients and cooking techniques. It is therefore of great research value and significance to study the classification of Chinese home-style recipe texts, which can also provide technical support for subsequent user interest modeling and feature extraction. This thesis takes the text of home-style dishes as its research object and uses algorithmic models to predict labels in place of manual classification, which improves classification efficiency and benefits subsequent analysis and recommendation. The main contributions and innovations of this thesis are summarized as follows.
(1) Creation of a Chinese recipe text corpus. Because no large open dataset is available, we use web crawling to collect recipe text from the Internet and then, through data processing, labeling and other operations, build a text corpus of Chinese home-style recipes.
(2) Binary classification of recipe difficulty. To study recipe difficulty classification, the difficulty label is defined as "easy" or "difficult", and experiments are conducted using the conventional text classification pipeline and current text classification models. The results show little difference in performance among the machine learning models on this task, with the ensemble learning model performing better.
(3) Multi-class classification of cooking technique. For the multi-class classification of cooking techniques in Chinese recipe texts, a multi-class text classification model is built using machine learning to label recipes automatically. In this model, TF-IDF and TextRank are used for feature dimension reduction, and each is combined with three common machine learning classifiers, naive Bayes (NB), logistic regression (LR) and support vector machine (SVM), giving six models. Chinese recipes collected from the Internet form the experimental dataset, and experiments verify the validity of the proposed model, which provides a feasible solution for the automatic generation of cooking-technique labels.
(4) Multi-label classification of ingredients. This thesis defines the following label types: livestock meat, birds, fish, aquatic products (except fish), fungi, eggs, vegetables, soy products, medicine and food, seasonings, rice and flour, and fruits, and converts them into numeric codes to form the label set. If a document carries multiple tags, its category label takes the form of a composite list. The multi-label classification model is built on text query and matching. The main process is to first build a relation table of ingredients and their types, then extract ingredient words from the text, and then look up the extracted ingredients in the relation table. When a match succeeds, the corresponding ingredient-type label is assigned. Hamming loss, overall accuracy, the accuracy of each tag, and the accuracy of each class are used to evaluate the model's classification performance.
(5) Application of recipe text classification. Drawing on practical needs, this thesis discusses the application of the recipe text classification model and builds a PC-based recipe system in Python using MySQL, PyCharm, PyQt, Qt Designer and other tools. The design and implementation of the system take full account of practicality, correctness and scalability, and provide basic functions such as registration, login, search, add and delete. The classification model for home-style recipe text is applied to the add-recipe function to generate recipe labels automatically. Because the model has limitations, labels can be set manually when automatic generation fails.
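As an illustration of the feature dimension reduction described in contribution (3), the following is a minimal sketch of TF-IDF keyword selection in plain Python. The abstract does not state which TF-IDF variant, which cutoff, or which word segmenter the thesis uses, so the toy documents, the unsmoothed formula, the function names and the value of k are all assumptions. TextRank and the classifiers themselves are not shown; the last line only enumerates the six feature/classifier pairings named in the abstract.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical, already word-segmented recipe texts (toy data, not the thesis corpus).
docs = [
    ["鸡蛋", "番茄", "翻炒", "盐"],
    ["排骨", "清蒸", "盐", "姜"],
    ["豆腐", "翻炒", "酱油", "盐"],
]

def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

def top_k_terms(weights, k):
    """Keep the k highest-weighted terms of each document as the reduced vocabulary."""
    vocab = set()
    for doc_weights in weights:
        # Sort by descending weight, breaking ties by term for determinism.
        ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
        vocab.update(term for term, _ in ranked[:k])
    return vocab

weights = tfidf(docs)
vocab = top_k_terms(weights, k=2)  # k=2 is an arbitrary choice for the toy data

# The six feature/classifier combinations mentioned in the abstract.
combos = list(product(["TF-IDF", "TextRank"], ["NB", "LR", "SVM"]))
```

In this toy corpus the seasoning "盐" occurs in every document, so its weight is zero and it is dropped from the reduced vocabulary, which is the intended effect of this kind of feature selection.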
Keywords/Search Tags: Text classification, Recipe text, Binary classification, Multi-class classification, Multi-label classification
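The query-and-match process described in contribution (4) of the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the relation table entries, the integer code assigned to each label, the use of plain substring matching to extract ingredient words, and the function names are all hypothetical, since the abstract does not give these details. The Hamming loss is computed in its standard form, the fraction of label slots that are predicted incorrectly.

```python
# Label types listed in the abstract; the integer coding order is an assumption.
LABELS = [
    "livestock meat", "birds", "fish", "aquatic products (except fish)",
    "fungi", "eggs", "vegetables", "soy products", "medicine and food",
    "seasonings", "rice and flour", "fruits",
]
LABEL_CODE = {name: code for code, name in enumerate(LABELS)}

# Hypothetical ingredient-to-type relation table (a few toy entries only).
INGREDIENT_TYPE = {
    "鸡蛋": "eggs",
    "番茄": "vegetables",
    "盐": "seasonings",
    "鲫鱼": "fish",
    "豆腐": "soy products",
}

def predict_labels(text):
    """Match known ingredient words in the text and return the sorted label codes."""
    codes = {
        LABEL_CODE[label]
        for ingredient, label in INGREDIENT_TYPE.items()
        if ingredient in text  # substring match stands in for ingredient extraction
    }
    return sorted(codes)

def hamming_loss(true_labels, pred_labels, n_labels):
    """Fraction of (document, label) slots where prediction and truth disagree."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(true_labels, pred_labels))
    return wrong / (len(true_labels) * n_labels)

recipes = ["番茄炒鸡蛋，加盐调味", "鲫鱼豆腐汤，放少许盐和香菇"]
pred = [predict_labels(r) for r in recipes]

# Hand-assigned reference labels for the two toy recipes.
true = [[5, 6, 9], [2, 4, 7, 9]]
loss = hamming_loss(true, pred, len(LABELS))
```

The second toy recipe contains an ingredient (香菇) that is missing from the relation table, so its fungi label is not predicted. This is the kind of matching failure for which the abstract says labels can be set manually in the recipe system.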