
Exploratory Data Analysis Based On Chinese Online Recipes

Posted on: 2020-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Liu
Full Text: PDF
GTID: 2431330626464272
Subject: Computer technology
Abstract/Summary:
With the rapid development of food-focused websites and applications in China, more and more cooking recipes are published online by people from different regions, which provides rich data for the study of online recipes in China. As a large country with a large population, China has developed many distinct cuisines, the most famous being the eight major cuisines. However, little research has examined Chinese recipes from the perspective of data analysis. We use web crawling to collect a sufficient amount of recipe data and take these data as our research object to analyze Chinese cuisine. In this paper, we explore Chinese online recipes from the following aspects. First, we explore the diversity of ingredients, including the diversity of ingredient consumption and of ingredient combinations. We then identify the notable ingredients that distinguish the various Chinese cuisines and visualize the results as word clouds. Second, we explore the complexity of Chinese online recipes from three aspects: the number of ingredients, the cooking time, and the cooking method. Third, for recipe feature vectors composed of ingredients, taste, and cooking method, we use a similarity algorithm to measure the similarity between them, and then use data visualization to explore the relationships between the cuisines to which these recipes belong. Finally, we mine frequent itemsets of the minor ingredients in Chinese online recipes to obtain the minor-ingredient combinations that occur frequently in Chinese cuisines.

In addition, labels are widely used on many social networks and applications, such as Weibo and Twitter, where content is given a variety of labels to facilitate topic search. For online recipes, labels indicate dietary functions, such as “calcium supplementation” and “antioxidant”. The demand for automatic multi-label classification is therefore growing. However, traditional machine learning methods usually require complex feature engineering for classification tasks, and most of them are binary classification methods that must be adapted to the multi-label setting. To achieve better multi-label classification performance, we carry out multi-label classification of Chinese recipes with a convolutional neural network and a recurrent neural network on a deep learning platform, and compare their performance with that of traditional machine learning classifiers. We also try to expand the training set of Chinese recipe data with text data augmentation to improve performance.
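As a rough illustration of two of the analyses described above (similarity between recipe feature vectors, and frequent combinations of minor ingredients), a minimal sketch might look like the following. The abstract names neither the similarity measure nor the frequent-itemset algorithm, so cosine similarity over binary feature sets and simple pair counting with a support threshold are assumptions here, and the toy recipes and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two feature sets viewed as binary (multi-hot) vectors.

    Assumption: the thesis only says "similarity algorithm"; cosine is one
    common choice, used here purely for illustration.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def frequent_pairs(transactions, min_support):
    """Return ingredient pairs whose support (share of recipes) >= min_support.

    This is a simplified stand-in for frequent-itemset mining, limited to
    itemsets of size two.
    """
    counts = Counter()
    for items in transactions:
        # Sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical toy data: minor ingredients of four made-up recipes.
minor_ingredients = [
    {"garlic", "ginger", "chili"},
    {"garlic", "ginger"},
    {"garlic", "scallion"},
    {"garlic", "ginger", "scallion"},
]

similarity = cosine_similarity(minor_ingredients[0], minor_ingredients[1])
pairs = frequent_pairs(minor_ingredients, min_support=0.75)
```

With real data, the feature sets would also include taste and cooking-method features, and a full frequent-itemset miner (for example Apriori or FP-growth) would replace the pair counter.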
Keywords/Search Tags: Online recipe, Data analysis, Data mining, Deep learning, Multi-label classification