| Medical big data refers to the massive volume of data generated by disease treatment, outpatient visits, hospitalizations, physical examinations, health activities, daily hospital management, and related activities. Its development is closely tied to people's daily lives and is of great strategic significance. Because medical images are unstructured data, their transmission, storage, and retrieval cannot be handled by conventional methods; a hospital's PACS (picture archiving and communication system) integrates all of these functions. With the country's development and its growing emphasis on medical technology, domestic PACS systems have been deployed in hospitals throughout the country. In a traditional PACS, however, the transmission, storage, and retrieval functions are not integrated with one another, and the system has many deficiencies in performance and technology. This study focuses on these three functions for medical image big data, implements each as a module, and aims to improve the overall operating efficiency of the system through an optimized framework and algorithms. The transmission module is built on a real-time big data stream processing framework. The study first analyzes the shortcomings of existing image transmission and proposes improvements to the existing database technology, then exploits the data processing strengths of the computing framework. After the cluster environment is set up, a data transmission task topology is designed and divided into three logical components: image acquisition, data compression, and file push. The topology ultimately pushes the medical image data from the source address into the optimized database. For transmission, the study proposes a load balancing algorithm based on the real-time topology and an optimized task scheduling algorithm. A storage hotspot arises when the database writes too much data at the same time; the storage module addresses this with the database's node hash algorithm, then integrates the Thrift IDL network communication protocol and optimizes its data structures and service interfaces. The retrieval module studies the image retrieval needs of different users and scenarios, designs a multi-level retrieval structure, and compares it with the retrieval method based on a metadata table. Once the three functional modules are integrated, the transmission and storage system forms a complete information management system for the daily management, monitoring, and scheduling of medical image big data in hospitals and other medical institutions. Operating results show that the computing framework effectively improves transmission speed and performance. The load balancing algorithm based on the real-time topology optimizes cluster resource allocation and increases the load the cluster can sustain, the optimized task scheduling algorithm reduces communication overhead between cluster processes, and the test data confirm that the transmission capacity of the system is effectively improved. On the storage side, the system alleviates the database's storage hotspot problem, optimizes the Thrift IDL communication model, and improves overall storage efficiency. A comparison with the default retrieval method based on a metadata table shows that the multi-level retrieval module effectively improves the overall retrieval speed of the system. |
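
The storage hotspot fix mentioned above can be illustrated with a minimal sketch. The abstract does not name the database or describe its node hash algorithm, so the following is only an assumption: it supposes a range-partitioned key-value store and uses a common technique, prefixing each row key with a hash-derived bucket so that consecutive image IDs are written to different nodes. The names `salted_key`, `spread`, and `NUM_BUCKETS` are hypothetical and do not come from the study.

```python
import hashlib

# Assumed number of storage nodes/partitions; not taken from the study.
NUM_BUCKETS = 16


def salted_key(image_id: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix the image ID with a hash-derived bucket number.

    Sequential IDs would otherwise sort next to each other and land on
    the same partition; the hashed prefix scatters them across buckets.
    """
    digest = hashlib.md5(image_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % num_buckets
    return f"{bucket:02d}|{image_id}"


def spread(image_ids, num_buckets: int = NUM_BUCKETS):
    """Count how many keys fall into each bucket."""
    counts = [0] * num_buckets
    for image_id in image_ids:
        bucket = int(salted_key(image_id, num_buckets).split("|", 1)[0])
        counts[bucket] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sequential image IDs, as a burst of writes might produce.
    ids = [f"IMG{n:08d}" for n in range(1600)]
    print(salted_key(ids[0]))
    print(spread(ids))
```

The prefix is deterministic, so a reader that knows the image ID can recompute the full key for a point lookup. The trade-off is that a range scan over consecutive IDs must query every bucket, which is one reason a separate retrieval structure may be useful.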