| Data acquisition is an essential part of industrial control and monitoring, and it is the source for data processing, analysis and display. "Industrial Internet" and "Industry 4.0" represent the future direction of industry. Both use industrial big data and the Internet of Things (IoT) to improve industrial productivity. As the IoT has become widely used in the industrial field, large numbers of diverse sensors have been deployed in industrial environments, so traditional industry now faces the problem of mass data acquisition. The traditional SCADA systems used for data acquisition in industrial environments have difficulties with data fusion and extension, and offer low universality and flexibility. A mass data acquisition system that overcomes these problems is therefore needed to meet future industrial data acquisition requirements. In this paper, we build a data acquisition system for industrial big data by drawing on the mature solutions for mass data acquisition in the Internet industry. By comparing the similarities and differences between the Internet industry and the industrial field, and by drawing on ETL technology, streaming data processing technology and message-oriented middleware technology, we adapt these key technologies so that they can be used in industrial data acquisition. Our data acquisition system uses the Kafka messaging system, and the data acquisition steps are as follows. First, the sensors send data to high-performance data acquisition nodes. Then the data acquisition nodes package the acquired data into messages and publish the messages to the Kafka cluster. Finally, data processing nodes process the data retrieved from the Kafka cluster and load the results into a database. The data processing nodes are well suited to distributed deployment, and the data processing module is pluggable so that it can support different protocols and databases. 
As a result, our data acquisition system provides good support for data fusion and has good scalability, versatility and flexibility. In the test section, the system is deployed on a cluster of five Linux servers, and a performance test is carried out in an environment that simulates a real industrial data acquisition environment and data transportation model. The test results show that our data acquisition system can solve the problems that arise in a massive data acquisition environment. |
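
The three acquisition steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Kafka cluster is replaced by an in-memory queue so that the example runs without a broker, and every name in it (`InMemoryTopic`, `AcquisitionNode`, `ProcessingNode`, the `kv` and `json` protocol labels, the message fields) is a hypothetical choice made for illustration. It only shows how acquisition nodes might package readings into messages, and how a processing node with pluggable decoders and a pluggable sink might load the results.

```python
import json
import queue
from typing import Callable, Dict, List


class InMemoryTopic:
    """Stand-in for a Kafka topic (assumption: a simple FIFO queue of bytes)."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[bytes]" = queue.Queue()

    def publish(self, message: bytes) -> None:
        self._queue.put(message)

    def poll(self) -> List[bytes]:
        """Drain and return every message currently waiting in the topic."""
        messages: List[bytes] = []
        while True:
            try:
                messages.append(self._queue.get_nowait())
            except queue.Empty:
                return messages


class AcquisitionNode:
    """Packages raw sensor payloads into messages and publishes them."""

    def __init__(self, node_id: str, topic: InMemoryTopic) -> None:
        self.node_id = node_id
        self.topic = topic

    def collect(self, sensor_id: str, protocol: str, payload: str) -> None:
        # Hypothetical message envelope: the field names are illustrative only.
        message = {
            "node": self.node_id,
            "sensor": sensor_id,
            "protocol": protocol,
            "payload": payload,
        }
        self.topic.publish(json.dumps(message).encode("utf-8"))


class ProcessingNode:
    """Consumes messages, decodes them with pluggable decoders, loads results."""

    def __init__(self, topic: InMemoryTopic, sink: Callable[[dict], None]) -> None:
        self.topic = topic
        self.sink = sink  # pluggable "database" loader
        self.decoders: Dict[str, Callable[[str], dict]] = {}

    def register_decoder(self, protocol: str, decoder: Callable[[str], dict]) -> None:
        self.decoders[protocol] = decoder

    def run_once(self) -> int:
        """Process all pending messages and return how many were loaded."""
        processed = 0
        for raw in self.topic.poll():
            message = json.loads(raw.decode("utf-8"))
            decoder = self.decoders.get(message["protocol"])
            if decoder is None:
                continue  # no plug-in registered for this protocol; skip it
            record = {"sensor": message["sensor"], **decoder(message["payload"])}
            self.sink(record)
            processed += 1
        return processed


def decode_key_value(payload: str) -> dict:
    """Decode 'name=value' text such as 'temperature=21.5'."""
    name, value = payload.split("=", 1)
    return {name: float(value)}


def decode_json(payload: str) -> dict:
    """Decode a payload that is already a JSON object."""
    return json.loads(payload)


def run_demo() -> List[dict]:
    topic = InMemoryTopic()
    database: List[dict] = []  # a plain list plays the role of the database

    node = AcquisitionNode("node-1", topic)
    node.collect("s1", "kv", "temperature=21.5")
    node.collect("s2", "json", '{"pressure": 101.3}')

    worker = ProcessingNode(topic, database.append)
    worker.register_decoder("kv", decode_key_value)
    worker.register_decoder("json", decode_json)
    worker.run_once()
    return database


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

In a real deployment the in-memory topic would be replaced by a Kafka producer and consumer, and several processing nodes could share the load; the decoder registry and the sink callable are one possible way to express the pluggable protocol and database support that the abstract mentions.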