| With the rapid development of modern agriculture, the massive volume of data generated by intelligent agriculture poses new challenges for data storage and fast retrieval. Although traditional relational databases support complex multi-table join queries, their scalability and data processing performance cannot meet the demands of massive data processing. In contrast, the NoSQL databases that have emerged in recent years scale better when handling massive data; combined with a distributed storage system and a parallel computing framework, they can address both the storage and the computation of massive data. As a product of combining traditional agriculture with the Internet of Things, intelligent agriculture can monitor the growing environment of crops in real time, analyze the collected data, and propose a reasonable regulation scheme. Building on the Hadoop distributed system infrastructure and integrating the distributed storage system HDFS, the parallel computing framework MapReduce, and the NoSQL database HBase, this article proposes an improved scheme for data storage and query and designs a parallel computing algorithm better suited to agricultural sensor data. In addition, in line with the actual business requirements of agricultural greenhouses in different producing areas, it creates a partition function to optimize the efficiency of data queries on a single producing area. Finally, this article builds an experimental system to test the performance of the proposed solution and compares it with a pseudo-distributed query system and MySQL. The experimental data indicate that the retrieval system designed in this article performs well and is suitable for the storage and query of massive data. |