| Media plays a vital role in human communication, and through it we can access the information we need. Media takes forms such as images, sound and text. With the advancement of technology, cross-media information transformation offers a new way of approaching problems. This thesis examines the conversion of image media to sound media in the context of guiding the blind. Most of the hardware and software currently available for guiding the blind is complex and expensive, so it is not accessible to all blind people. Guide dogs, a more humane way of guiding the blind, are extremely rare because of the long training process, high training costs and elimination rates. In response, this thesis designs a low-cost hardware guide system, built on a low-cost embedded platform, that implements the key technologies while meeting the desired goals and making assistance available to more blind people. The main elements of the research are: (1) A requirements analysis and the overall design of the proposed low-cost, low-compute guide system. (2) A new image-to-sound conversion method. Combining image-to-sound and image-to-music algorithms, the method converts the image of the scene in front of the user into a segment of sound, and the user listens to this segment to judge whether an obstacle is present directly ahead. In testing, the best mapping is 25% and 15% more accurate than the two traditional methods, respectively. (3) An objective evaluation algorithm for image-to-sound conversion. In the image-to-sound method the audio is judged by the human ear, and human judgement is subjective. This thesis therefore proposes three objective evaluation algorithms: a K-means clustering method, a neural-network-based DNN model method and an LSTM model method. To compare the effect of the number of features on test-set accuracy, the model inputs were divided into single-feature and multi-feature inputs. The test results showed that the DNN and LSTM models achieve higher test-set accuracy with multi-feature inputs than the same models with single-feature inputs. In addition, with multi-feature inputs the LSTM model achieves higher test-set accuracy than the DNN model. (4) Testing of the guide system. First, the objective evaluation algorithm was tested. With this method, the strengths and weaknesses of multiple image-to-sound methods can be compared and the one with the highest accuracy selected. The results are similar to those obtained by the human ear, which verifies the feasibility of the objective evaluation method. Then the image-to-sound function was implemented on a low-cost embedded platform, and the guide system was built for field testing. Six image-to-sound methods were used in turn to judge the presence or absence of obstacles on the road directly ahead. The best results were achieved by mapping the image parameters H, S and V to the sound parameters , A and f, with an accuracy of 82%. In addition, 20 friends and family members wore the guide system while blindfolded in a field test using this best mapping; 65% of the participants achieved a judgement accuracy of between 80% and 85%, which was as expected. |
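
To make the idea of mapping image parameters to sound parameters concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes that hue drives the frequency f and that value (brightness) drives the amplitude A. The abstract does not legibly name the third sound parameter, so the sketch maps saturation to tone duration purely as a placeholder. The function names, the frequency range, the duration range and the sample rate are all hypothetical choices made for illustration.

```python
import colorsys
import math

SAMPLE_RATE = 8000  # Hz; assumed value, chosen to suit a low-cost platform


def mean_rgb(pixels):
    """Average the R, G, B channels of a list of (r, g, b) pixels (0-255)."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def hsv_to_tone(h, s, v, f_min=200.0, f_max=2000.0, d_min=0.1, d_max=0.5):
    """Map H, S, V (each in [0, 1]) to (frequency, amplitude, duration).

    Assumed mapping, for illustration only:
      hue        -> frequency f, linear between f_min and f_max
      value      -> amplitude A in [0, 1]
      saturation -> duration in seconds (placeholder third parameter)
    """
    freq = f_min + h * (f_max - f_min)
    amp = v
    dur = d_min + s * (d_max - d_min)
    return freq, amp, dur


def synthesize(freq, amp, dur, sample_rate=SAMPLE_RATE):
    """Generate a sine tone as a list of float samples in [-amp, amp]."""
    n = int(round(dur * sample_rate))
    return [amp * math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]


def image_to_tone(pixels):
    """Convert an image region (list of RGB pixels) into one sine tone."""
    r, g, b = mean_rgb(pixels)
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return synthesize(*hsv_to_tone(h, s, v))
```

Calling `image_to_tone` on the pixels of the region directly ahead yields one short tone per frame. A real system would also need to write the samples to an audio device, which is omitted here.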