Ingredient-Based Food Recognition

Posted on: 2021-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: L H Liu
Full Text: PDF
GTID: 2381330602977683
Subject: Computer technology
Abstract/Summary:
Food is essential to human life and fundamental to the human experience. With the rapid development of social networks, mobile networks, and the Internet of Things (IoT), people commonly upload, share, and record food images, recipes, cooking videos, and food diaries, producing large-scale food data. Researchers can use these data for a range of tasks in the food field, such as food recognition and food retrieval. Food recognition underlies much of this research and has gained attention in many communities because of its applications, e.g., multimodal food logging and personalized healthcare.

Most existing methods directly extract visual features of the whole image using popular deep networks, without considering the characteristics of food images. Compared with other types of object images, food images generally do not exhibit distinctive spatial arrangements or common semantic patterns, so discriminative information is hard to capture. With the development of the mobile internet, users not only upload large numbers of food photos but also provide ingredient information. Just as objects are important to a scene, ingredients within food images are important to food recognition. Moreover, many research results have shown that semantically meaningful food ingredients can serve as attribute information for food image recognition, providing complementary information from different perspectives and granularities that improves recognition performance.

Furthermore, although food typically does not exhibit a distinctive spatial arrangement, we can explore image patches at different scales and fuse them into a multi-scale representation. Such a representation fuses patch features from the coarse scale to the fine scale, so that it contains information from discriminative image regions. In addition, multi-scale fusion can be more robust to geometric deformation. Therefore, this thesis studies food recognition based on the ingredient information of food images. The main research contents and contributions are as follows:

(1) This thesis proposes a Multi-Scale Multi-View Feature Aggregation (MSMVFA) scheme for food recognition. We use additional ingredient information to fine-tune a deep network that extracts mid-level attribute features. The high-level semantic features and deep visual features are extracted from a class-supervised deep neural network. MSMVFA conducts two-level fusion, namely multi-scale fusion for each type of feature and multi-view aggregation across feature types of different granularity, to produce a more robust, discriminative, and comprehensive fine-grained representation.

(2) This thesis proposes an Ingredient-Guided Cascaded Multi-Attention Network (IG-CMAN) for food recognition, which sequentially localizes multiple informative image regions at multiple scales, moving from category-level to ingredient-level guidance in a coarse-to-fine manner. The regional features generated by the network under supervision of different granularity are highly complementary, so integrating them leads to a more comprehensive and discriminative representation.

(3) This thesis presents a new food dataset that complements existing datasets for food recognition with ingredients. It contains 200 food categories drawn from a list on Wikipedia, about 200,000 food images, and 319 ingredients. It will be made publicly available to further the development of scalable food recognition.
Keywords/Search Tags: Food recognition, Convolutional neural network, Ingredient information, Multi-scale, Multi-view
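
The two-level fusion described in contribution (1) can be illustrated with a minimal sketch. The abstract does not state which fusion operators MSMVFA uses, so everything below is an assumption chosen for illustration: mean pooling of patch features within a scale, L2 normalization, and concatenation across scales and across views. The function names (`mean_pool`, `fuse_scales`, `aggregate_views`) and the view names are hypothetical, and the feature vectors stand in for the outputs of a deep network.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    return [x / norm for x in v]


def mean_pool(patches: Sequence[Sequence[float]]) -> Vector:
    """Average the feature vectors of all patches at one scale (assumed pooling)."""
    if not patches:
        raise ValueError("each scale needs at least one patch feature")
    dim = len(patches[0])
    return [sum(p[i] for p in patches) / len(patches) for i in range(dim)]


def fuse_scales(patches_by_scale: Sequence[Sequence[Sequence[float]]]) -> Vector:
    """Level 1: multi-scale fusion for one feature type.

    Each scale is pooled and normalized, then the scales are concatenated
    in the order given (for example coarse first, fine last).
    """
    fused: Vector = []
    for patches in patches_by_scale:
        fused.extend(l2_normalize(mean_pool(patches)))
    return fused


def aggregate_views(
    views: Dict[str, Sequence[Sequence[Sequence[float]]]]
) -> Vector:
    """Level 2: multi-view aggregation across feature types.

    Each view (e.g. visual, attribute, semantic) is fused over scales,
    normalized so no view dominates, and concatenated in sorted name order.
    """
    aggregated: Vector = []
    for name in sorted(views):
        aggregated.extend(l2_normalize(fuse_scales(views[name])))
    return aggregated


# Toy input: two hypothetical views, each with a coarse scale (one patch)
# and a fine scale (two patches) of 2-dimensional features.
example_views = {
    "attribute": [[[3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]],
    "visual": [[[1.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]],
}
representation = aggregate_views(example_views)
```

The resulting vector would then be passed to a classifier. Normalizing before each concatenation is one simple way to keep features of different granularity on a comparable scale; a learned weighting would be an equally plausible choice.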