| Structured database queries are widely used in business applications because of their high efficiency. Structured queries involve complex restriction expressions, and evaluating these expressions usually produces a large amount of intermediate data. In the era of Big Data, websites and mobile devices generate huge volumes of data, and searching these data for specific information generates large amounts of intermediate data. Traditional von Neumann architectures transfer this intermediate data from main memory to the on-chip cache when processing structured queries, which can significantly increase program execution time. With the gradual failure of Moore’s Law and Dennard scaling, this off-chip transfer problem is becoming increasingly serious. Emerging non-volatile memory with in-memory computation capabilities presents new opportunities to address it. These devices can perform computations inside the memory, so tasks can be offloaded to memory and off-chip data transfers reduced, improving execution efficiency. However, current in-memory accelerators have several problems. First, off-chip transfers remain heavy: current in-memory accelerators act as co-processors of the CPU, so a large amount of data still moves between the CPU and the memory. Second, efficient algorithmic support for the emerging hardware is lacking: current accelerators map classical algorithms directly onto the new devices, which makes it difficult to take full advantage of the devices’ high parallelism. Finally, to support as many structured queries as possible, current accelerators integrate a large number of peripheral circuits into the memory, which improves computational efficiency but adds chip area and energy overhead. To address these issues, this paper presents the following three pieces of work. Cache-based selective query accelerator: To address the problems of off-chip data transfers, a large number of random accesses, and restricted query
types, we propose the cache-based selective query accelerator ReSQM, which greatly reduces off-chip data transfers and random memory accesses through hardware-software co-design. First, a restriction expression parser is designed to compute restriction expressions in memory, which allows ReSQM to process queries independently rather than as a co-processor of the CPU. Second, we propose a new data mapping strategy for intermediate data, which lets ReSQM store and process the intermediate data in memory and greatly reduces off-chip transfers. Finally, a double-tagged array is proposed to remedy the inability of conventional arrays to support relative comparison. Experimental results show that ReSQM achieves significant performance and energy efficiency gains compared to state-of-the-art accelerators. Bit-counting-based sorting query accelerator: To address the problems of numerous random memory accesses, complex peripheral circuits, and the inability to support string data, we design a bit-counting-based sorting query accelerator, ReCSA. First, we design a dedicated count-based sorting algorithm to greatly reduce random memory accesses. Second, a lightweight content-addressable memory is proposed to reduce peripheral circuit complexity. Finally, a new data mapping method is designed for sorting string datasets. Experimental results show that ReCSA achieves significant performance and energy efficiency improvements over state-of-the-art in-memory sorting query accelerators. Column parallelism-based Top-K query accelerator: To address the problem that current Top-K query accelerators cannot efficiently support string datasets, we propose ReSMA, a column parallelism-based string matching accelerator, which converts string similarity into a quantifiable edit distance. First, we design a memory-friendly string filtering algorithm to support highly parallel multi-set intersection operations. Second, we propose a novel data mapping strategy to convert diagonal parallelism into column
parallelism, so that the ReSMA accelerator can handle diagonal parallelism efficiently. Finally, the ReSMA accelerator is coupled with the ReCSA accelerator, which sorts the edit distances to obtain the Top-K query results. Experimental results show that our Top-K query accelerator achieves significant performance and energy-efficiency improvements compared with the state-of-the-art Top-K query accelerator. |
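
The idea of converting diagonal parallelism into column parallelism for edit distance can be illustrated in software. The following is a minimal sketch, not the accelerator's actual mapping: it assumes the standard Levenshtein recurrence and a skewed layout in which cell (i, j) of the dynamic-programming table is stored at column i + j, so that each anti-diagonal of independent cells becomes one column. The names `edit_distance_skewed` and `top_k_similar` are hypothetical, and the final sort is an ordinary software sort standing in for the sorting stage.

```python
from typing import List, Optional, Tuple


def edit_distance_skewed(a: str, b: str) -> int:
    """Levenshtein distance computed one anti-diagonal at a time.

    Cell (i, j) of the classic DP table lies on anti-diagonal d = i + j.
    Storing it at grid[i][d] (a skewed layout, assumed here for
    illustration) turns each anti-diagonal into a single column whose
    cells depend only on the two previous columns, never on each other.
    """
    m, n = len(a), len(b)
    # grid[i][d] holds D[i][d - i]; slots outside the table stay None.
    grid: List[List[Optional[int]]] = [
        [None] * (m + n + 1) for _ in range(m + 1)
    ]
    for d in range(m + n + 1):
        # Every row i in this range could be updated in parallel.
        for i in range(max(0, d - n), min(m, d) + 1):
            j = d - i
            if i == 0:
                grid[i][d] = j  # first row: j insertions
            elif j == 0:
                grid[i][d] = i  # first column: i deletions
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                grid[i][d] = min(
                    grid[i - 1][d - 1] + 1,     # D[i-1][j]   (deletion)
                    grid[i][d - 1] + 1,         # D[i][j-1]   (insertion)
                    grid[i - 1][d - 2] + cost,  # D[i-1][j-1] (substitution)
                )
    return grid[m][m + n]


def top_k_similar(query: str, candidates: List[str], k: int) -> List[Tuple[int, str]]:
    """Return the k candidates with the smallest edit distance to query."""
    scored = [(edit_distance_skewed(query, c), c) for c in candidates]
    scored.sort()  # software stand-in for the sorting stage
    return scored[:k]


if __name__ == "__main__":
    print(top_k_similar("kitten", ["sitting", "kitchen", "mitten", "bitten"], 2))
```

In this layout, the inner loop over `i` touches a single column per step, which is the property a column-parallel memory array would exploit; the sequential loop here only mimics that behavior.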