| A growing number of applications need to translate visual data into natural language, and the association between visual information and the language that describes it has become an important research topic. VIMAC is a visual-information-based Chinese vocabulary acquisition system developed by the Intelligence Science and Technology Center of Beijing University of Posts and Telecommunications. Built on a collection of image-description pairs, it constructs representations of language and vocabulary grounded in visual information and can be used to generate image descriptions automatically. This paper builds on the VIMAC system. On the one hand, we use VIMAC's research results to obtain finer-grained language-visual information pairs, focusing on feature representation for unknown words that do not appear in the training corpus. On the other hand, we extend VIMAC toward image retrieval by discovering and locating, in an image's external description, the vocabulary that describes the corresponding visual object. The VIMAC acquisition system provides the correspondence between vocabulary categories and visual features, so the key to aligning an image's visual characteristics with its descriptive words is determining each word's category, that is, a word classification problem. We preprocess each image's descriptive sentence with word segmentation and part-of-speech (POS) tagging, extract the words relating to color, size, location, and shape, and classify them using the HowNet corpus. 
Ultimately, we align the words with the corresponding visual properties of the images. This paper also analyzes several key factors that influence final performance. Experiments show that part-of-speech tagging greatly improves classification accuracy. Classification performance improves as the training corpus grows, but it saturates once the corpus reaches a certain size. We then built a database that stores image information and its annotations so that they can be easily added, extracted, modified, and managed. Finally, to present the results visually, we generate a dynamic web page using ASP that supports searching for specific image information. |
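
The alignment step summarized above can be illustrated with a minimal sketch. It assumes the descriptive sentence has already been segmented and POS-tagged, and it uses a small hand-written dictionary (`TOY_LEXICON`) as a hypothetical stand-in for a HowNet-based classifier. The tag set in `CANDIDATE_TAGS`, the function name `align_words`, and the example sentence are likewise illustrative assumptions, not part of the VIMAC system.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a HowNet-style lexicon: word -> visual attribute category.
TOY_LEXICON: Dict[str, str] = {
    "红色": "color",     # red
    "蓝色": "color",     # blue
    "巨大": "size",      # huge
    "小": "size",        # small
    "左边": "location",  # left side
    "上方": "location",  # above
    "圆形": "shape",     # round
    "方形": "shape",     # square
}

# POS tags assumed to carry attribute words (adjective, noun, locative).
# This tag set is an illustrative assumption.
CANDIDATE_TAGS = {"a", "n", "f"}


def align_words(tagged: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group attribute words from a tagged sentence by visual category."""
    aligned: Dict[str, List[str]] = {
        "color": [], "size": [], "location": [], "shape": []
    }
    for word, tag in tagged:
        # POS filtering: skip words whose tag cannot denote a visual attribute.
        if tag not in CANDIDATE_TAGS:
            continue
        category = TOY_LEXICON.get(word)
        # Words missing from the lexicon are left unaligned in this sketch.
        if category is not None:
            aligned[category].append(word)
    return aligned


# "On the left there is a red round balloon", pre-segmented and tagged.
sentence = [
    ("左边", "f"), ("有", "v"), ("一个", "m"),
    ("红色", "n"), ("的", "u"), ("圆形", "n"), ("气球", "n"),
]
result = align_words(sentence)
```

Filtering on the POS tag before the lexicon lookup mirrors the role the abstract assigns to POS tagging: it removes words that cannot denote a visual attribute before classification. The sketch leaves unknown words unaligned and does not attempt the feature representation the paper proposes for them.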