| Rice grading is a food quality assessment task that aims to assess the quality of rice kernels sampled from a large batch of rice. The task is important for food security and has attracted extensive attention in both academia and industry. However, research progress has been limited because (1) a rice kernel occupies only a small sensing area in a camera image, which makes detecting a single kernel difficult; (2) overlapping rice kernels in images are hard to detect and recognize; and (3) high-quality and low-quality rice kernels may look similar in a single image, which makes them indistinguishable. To address these issues, we first design a rice capturing device that transports the rice kernels in a controlled environment so that the kernels can be streamed one by one. This device enables us to photograph a single rice kernel from three different viewing directions. Based on this device, we then design a streamlined system for rice grading, which consists of four modules: image sampling, feature extraction, multi-view fusion, and rice classification. The image sampling module produces three images of a single rice kernel, each showing a different part of the kernel. The feature extraction module extracts features from the three captured images. Considering the multi-view nature of the photographing process, the multi-view fusion module takes the extracted features as input and generates a fused feature. The fusion process exploits the redundancy, complementarity, and importance of the features from different views, and minimizes information loss in producing the final feature. Finally, the rice classification module labels the rice kernel based on the fused feature. Within this pipeline, we employ deep convolutional neural networks as the feature extractor and investigate the performance of different network architectures in order to find a suitable deep model for rice grading. In addition, we propose a novel rice feature fusion 
technique based on multi-view intact space learning. This technique naturally fits the problem because the images captured by the device are multi-view representations without any information loss. Each view is assumed to be a mapping of a feature point in an underlying intact feature space. By optimizing a generative model, the method produces a fused feature with high discriminability. To facilitate optimization and experiments, we also construct a rice-grading dataset named FIST-Rice. Extensive experiments on this dataset demonstrate the superiority of the proposed system and the proposed multi-view fusion technique for rice grading. |
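
The fusion idea described above, in which each view is treated as a mapping of one point in a shared intact space, can be illustrated with a minimal sketch. It is not the authors' objective: it assumes linear view mappings, a plain squared reconstruction loss with ridge penalties, and alternating least squares as the optimizer. The names `fuse_views`, `reconstruction_error`, `latent_dim`, and `reg` are hypothetical, and the three "views" are synthetic stand-ins for per-view CNN features.

```python
import numpy as np


def fuse_views(views, latent_dim=4, reg=1e-2, n_iters=50, seed=0):
    """Fuse several feature views into one shared representation.

    views: list of arrays, one per view, each of shape (n_samples, d_v).
    Returns (X, Ws), where X has shape (n_samples, latent_dim) and each
    W_v has shape (d_v, latent_dim), so that views[v] is approximated by
    X @ W_v.T. Simplifying assumption: linear mappings and squared loss.
    """
    rng = np.random.default_rng(seed)
    n_samples = views[0].shape[0]
    X = rng.standard_normal((n_samples, latent_dim))
    eye = np.eye(latent_dim)
    Ws = []
    for _ in range(n_iters):
        # Fix X, solve a ridge regression for each view mapping W_v.
        gram = X.T @ X + reg * eye
        Ws = [np.linalg.solve(gram, X.T @ Z).T for Z in views]
        # Fix all W_v, solve for the shared representation X.
        A = sum(W.T @ W for W in Ws) + reg * eye  # symmetric (k, k)
        B = sum(Z @ W for Z, W in zip(views, Ws))  # (n_samples, k)
        X = np.linalg.solve(A, B.T).T
    return X, Ws


def reconstruction_error(views, X, Ws):
    """Total squared error of rebuilding every view from X."""
    return sum(np.sum((Z - X @ W.T) ** 2) for Z, W in zip(views, Ws))


# Synthetic example: three views generated from one 4-dimensional latent point
# per sample, standing in for features of three images of the same kernel.
rng = np.random.default_rng(42)
latent = rng.standard_normal((60, 4))
views = [latent @ rng.standard_normal((4, 10)) for _ in range(3)]

fused, Ws = fuse_views(views, latent_dim=4)
total_energy = sum(np.sum(Z ** 2) for Z in views)
print("fused feature shape:", fused.shape)
print("relative reconstruction error:",
      reconstruction_error(views, fused, Ws) / total_energy)
```

In a full pipeline, the rows of `fused` would be passed to a classifier in place of the concatenated per-view features. A more robust loss or nonlinear view mappings could be substituted for the squared loss used here.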