| As an efficient data structure, the hash index is widely used in database management systems. However, as data volumes grow dramatically, the high collision rate and low computational efficiency of traditional hash indexes are becoming increasingly prominent problems. In recent years, as hardware has grown more powerful, more and more researchers have begun to apply machine learning to problems in the database field. It is against this background that the learned index was proposed. Its core idea is to replace a traditional index structure with a machine learning model. Building on a summary of previous work, we study learning-based hash index technology. Our work includes the following. (1) In existing work, traditional hash functions cannot exploit the distribution characteristics of the original data, which leads to a high hash collision rate, and data dependences prevent them from being parallelized, which results in low computational efficiency. To solve these problems, we design a learned hash function based on a shallow autoencoder. Using the autoencoder, this method learns binary hash codes for the data by minimizing the overall loss. Because the proposed model makes full use of both the characteristics of the original data and the parallelism of matrix operations, it produces fewer collisions and computes faster than traditional hash functions. Extensive experiments show that, compared with existing methods, the proposed method has clear advantages in reducing the collision rate, shortening computation time, and improving data retrieval efficiency. (2) To meet the practical requirements of high query efficiency and flexibility in information retrieval systems, we build a hash index system based on the proposed learned hash index technology. The system comprises a feature extraction layer, a model layer, a storage-query layer, and an application layer. The functional layers are isolated from one another and cooperate through data transfer to complete specific query tasks. With this system, machine learning models can be used to improve the efficiency of data queries. In addition, system components can be replaced flexibly. For specific problems, the system provides an efficient and easy-to-use data query solution. |
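
The learned hash function in item (1) can be illustrated with a minimal NumPy sketch. The abstract does not specify the architecture, activation, loss, or training procedure, so everything below is an assumption made for illustration: a single tanh hidden layer, a linear decoder, mean squared reconstruction loss, full-batch gradient descent, and binarization of the hidden activations by sign. The function names `train_autoencoder_hash`, `hash_bits`, and `bucket_ids` are hypothetical and do not come from the paper.

```python
import numpy as np


def train_autoencoder_hash(X, n_bits=8, epochs=300, lr=0.1, seed=0):
    """Train a shallow autoencoder (assumed design) and return encoder weights.

    X is an (n, d) float array. The hidden layer has n_bits units; its sign
    pattern is later used as the binary hash code.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_bits))   # encoder weights
    b1 = np.zeros(n_bits)
    W2 = rng.normal(scale=0.1, size=(n_bits, d))   # decoder weights
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Forward pass: every key is encoded by the same matrix product,
        # so the whole batch is processed in parallel.
        H = np.tanh(X @ W1 + b1)
        X_hat = H @ W2 + b2
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared reconstruction loss.
        g_out = 2.0 * err / err.size
        gW2 = H.T @ g_out
        gb2 = g_out.sum(axis=0)
        gZ = (g_out @ W2.T) * (1.0 - H ** 2)       # tanh derivative
        gW1 = X.T @ gZ
        gb1 = gZ.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1), losses


def hash_bits(X, params):
    """Binary code per row: 1 where the hidden activation is positive."""
    W1, b1 = params
    return (np.tanh(X @ W1 + b1) > 0).astype(np.uint8)


def bucket_ids(bits):
    """Pack each row of bits into an integer bucket index in [0, 2**n_bits)."""
    weights = 1 << np.arange(bits.shape[1], dtype=np.int64)
    return bits.astype(np.int64) @ weights


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    keys = rng.normal(size=(500, 4))               # synthetic stand-in data
    params, losses = train_autoencoder_hash(keys, n_bits=8)
    ids = bucket_ids(hash_bits(keys, params))
    print("distinct buckets used:", np.unique(ids).size)
```

Because the code is derived from weights fitted to the data, bucket assignment reflects the data distribution, and encoding a batch of keys is a single matrix multiplication. Whether this reduces collisions relative to a conventional hash function depends on the data and the training setup, which this sketch does not evaluate.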