| The demand for data analysis has grown to an unprecedented level, which places higher requirements on the performance of columnar database systems: queries must run faster so that the value of data can be extracted more quickly. However, in traditional analytical databases built on SSD and main memory, the read-write performance of external storage often limits overall system performance. The main reason is that external storage is usually several orders of magnitude slower than main memory, so reading and writing data on external storage is expensive. Non-volatile memory (NVM) continues to develop, and its application in databases is still being explored. Like external storage, NVM retains data after power loss; like main memory, it offers low latency and byte addressability. This paper therefore explores the application of NVM in databases and introduces NVM into a columnar database system that uses SSD as its external storage device, in order to improve the performance of that system. The columnar database DuckDB serves as the prototype system for the exploration, design, and implementation. Because cost, performance, and capacity are all important considerations when developing a database system, this paper uses SSD and NVM together as hybrid storage devices, designs a columnar database system on top of this hybrid storage, and explores the use of NVM as a storage device in that system, making full use of its byte addressability, non-volatility, and low latency. Exploiting low latency and non-volatility, the design avoids frequent reads from SSD and improves the efficiency of metadata loading and writing. Exploiting byte addressability, the data storage layout is redesigned, which effectively reduces metadata read-write amplification compared with block-based SSD reads. The main contributions of this paper are as follows: 1. This paper designs a storage engine architecture for a columnar database system on hybrid NVM and SSD storage, proposes a unified access management mechanism for NVM and SSD that provides transparent and consistent access to upper layers, and proposes a restart and recovery scheme for the hybrid storage engine so that the columnar database can continue to provide service after a restart. 2. This paper proposes a data read-write path scheme based on the characteristics of hybrid storage and NVM. The scheme uses the byte addressability and low latency of NVM and optimizes the data layout across SSD and NVM to reduce data redundancy. It also uses byte addressability to reduce metadata read-write amplification. 3. This paper proposes a heat statistics strategy tailored to the characteristics of columnar databases, and designs a data read-write scheme and a replacement strategy for the hybrid storage engine based on how hot or cold the data is. The strategy aims to make full use of NVM capacity by keeping the "hot" data that is frequently updated in the cache in NVM, which avoids frequent SSD reads and writes and mitigates the high read-write latency of SSD. The design and optimizations above are implemented in the open-source columnar analytical database system DuckDB, and extensive experiments verify their correctness and effectiveness. Experiments on the TPC-H and TPC-DS standard benchmarks show that the hybrid storage engine for NVM and SSD achieves a performance improvement of nearly 20% over the previous SSD external storage scheme. |
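
The heat-based placement idea in the third contribution can be illustrated with a small sketch. This is not the paper's actual strategy or DuckDB code: the class `HybridPlacement`, its methods, the use of a plain access count as the "heat" value, and the promote-on-access rule are all assumptions chosen for illustration. It only shows the general pattern of keeping the hottest blocks in non-volatile memory (NVM) and demoting the coldest one to SSD when NVM is full.

```python
from collections import defaultdict


class HybridPlacement:
    """Toy placement policy (hypothetical): hottest blocks live in NVM, the rest on SSD."""

    def __init__(self, nvm_capacity):
        self.nvm_capacity = nvm_capacity  # number of blocks NVM can hold (assumed unit)
        self.heat = defaultdict(int)      # block id -> access count, used as "heat"
        self.nvm = set()                  # block ids currently resident in NVM

    def access(self, block_id):
        """Record one access and return the tier that served it ("nvm" or "ssd")."""
        served_from = "nvm" if block_id in self.nvm else "ssd"
        self.heat[block_id] += 1
        if served_from == "ssd":
            self._maybe_promote(block_id)
        return served_from

    def _maybe_promote(self, block_id):
        """Move a block into NVM if there is room or it is hotter than the coldest resident."""
        if self.nvm_capacity <= 0:
            return
        if len(self.nvm) < self.nvm_capacity:
            self.nvm.add(block_id)
            return
        # Ties on heat are broken by block id so the choice is deterministic.
        coldest = min(self.nvm, key=lambda b: (self.heat[b], b))
        if self.heat[block_id] > self.heat[coldest]:
            self.nvm.remove(coldest)      # demote the coldest block back to SSD
            self.nvm.add(block_id)
```

With an NVM capacity of two blocks, a block that is read once stays on SSD while NVM is full, and replaces the coldest resident only after its access count exceeds that resident's count. A real engine would likely decay heat over time and weigh reads and updates differently; the sketch leaves those choices out.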