| With the implementation of policies such as "Internet +" and "Opinions on deepening the reform of medical and health system", and with the development of Internet technology, the upgrading of medical information services has drawn growing attention. Although China's medical information systems have been developing for more than 30 years, their isolation from one another has caused many problems, including difficulty in sharing files between systems, high management costs, and difficulty in data mining. Moreover, the current storage schemes for medical information systems are inadequate. To address these problems, this thesis studies and designs medical file storage software based on HDFS, and improves the small file storage strategy and the replica storage strategy of HDFS during implementation. The main work of this thesis is as follows. 1. To address the fact that HDFS handles large numbers of small files poorly, this thesis analyzes the file storage principle of HDFS and the improvement strategies of other scholars, and proposes an improvement strategy tailored to the characteristics of medical files. The strategy computes the SimHash signature of each file, uses the Hamming distance between signatures to measure file similarity, and merges similar files before storing them in HDFS. According to the test results, this method improves read speed by 4.89% compared with random file merging. 2. The default HDFS replica strategy selects replica storage nodes randomly, ignores node performance, and balances load poorly. This thesis analyzes the source code of the default replica strategy and proposes an improved strategy that combines the AHP algorithm with the Consistent Hashing algorithm to allocate replicas more reasonably. The test results show that, compared with an improved strategy based on the genetic ant colony algorithm and support vector machine, the proposed strategy improves file reading speed by 3.65%. 3. The software is designed in the B/S service mode, and both the front end and the back end are implemented. The back end is built with the classic Struts, Spring, and Hibernate frameworks. According to the requirements, the software has four major functional modules: a user information management module, a file information management module, a cluster information management module, and a small file preprocessing module. 4. An HDFS cluster and a software test environment are built. Following the software quality test standards GB/T 25000.51-2016, GB/T 25000.10-2016, and GB/T 35136-2017, functional and non-functional tests are performed on the software to assess its functional integrity and availability. The test results show that the software meets users' needs for medical file storage and complies with national software testing standards. |
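The similarity-based merging described in item 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the whitespace tokenization, the use of MD5 as the token hash, the 64-bit signature width, the distance threshold of 3, the greedy grouping rule, and the names `simhash`, `hamming_distance`, and `group_similar` are all assumptions made for illustration. The sketch only shows how SimHash signatures and Hamming distances could decide which small files are merged together before upload.

```python
from __future__ import annotations

import hashlib
from collections import Counter

BITS = 64  # signature width (an assumption; the abstract does not state it)


def simhash(text: str) -> int:
    """Compute a 64-bit SimHash signature from whitespace-separated tokens."""
    weights = [0] * BITS
    for token, count in Counter(text.split()).items():
        # Take the first 8 bytes of the MD5 digest as a 64-bit token hash.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            # Each set bit votes up, each cleared bit votes down,
            # weighted by how often the token occurs.
            weights[i] += count if (h >> i) & 1 else -count
    signature = 0
    for i, w in enumerate(weights):
        if w > 0:
            signature |= 1 << i
    return signature


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def group_similar(files: dict[str, str], max_distance: int = 3) -> list[list[str]]:
    """Greedily group file names whose signatures lie within max_distance
    of a group's first member; each group is a candidate merged file."""
    groups: list[tuple[int, list[str]]] = []
    for name, content in files.items():
        sig = simhash(content)
        for rep_sig, members in groups:
            if hamming_distance(sig, rep_sig) <= max_distance:
                members.append(name)
                break
        else:
            groups.append((sig, [name]))
    return [members for _, members in groups]
```

In this sketch, each returned group would be concatenated into one larger file before being written to HDFS, so that files likely to be read together end up in the same merged file. A real system would also need an index that maps each original file to its offset within the merged file.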