| Energy demand has increased dramatically in recent years, driven by China's rapid economic development. At the same time, the country has faced successive years of energy shortages, an imbalance between supply and demand, and dependence on foreign oil. China has proposed addressing these problems by improving its energy structure and using new and renewable energy reasonably and effectively. Biomass is an important renewable energy source, yet a vast quantity of biomass is discarded each year. Developing and utilizing biomass is therefore expected to contribute significantly to easing the energy shortage. At present, biomass energy utilization focuses mainly on the production of fuel ethanol and biodiesel. However, owing to limitations of the manufacturing process and technology, traditional chemical methods cannot provide real-time, online detection of component content in biomass. Without accurate component ratios, fuel quality and output suffer. This paper takes soybean straw as the research object and uses near-infrared (NIR) spectral analysis to determine its cellulose, hemicellulose, and lignin content. As the technology is gradually improved and matures, it is expected to enable rapid, real-time, online detection. First, 160 soybean straw samples covering different areas and varieties were collected in Heilongjiang Province. The samples were then scanned and calibrated. Finally, various data processing methods and modeling approaches were evaluated on the NIR spectra. The main results are as follows: (1) The cellulose, hemicellulose, and lignin contents were tested for normality to confirm that the samples are representative. From the frequency histograms and normality test plots of the three components, we obtain the Boolean test result (h=0) and the significance index (sig=1). 
If the 95% confidence interval contains the hypothesized mean, the sample set is considered representative. (2) A multiple correlation analysis of the soybean straw spectra shows that the cross-correlation is low; low cross-correlation helps keep the model stable and reduces error. The relationship between the chemical and spectral measurements is stable. Within 4000-12000 cm⁻¹, hemicellulose shows a negative correlation trend, and the correlation coefficients at most wavelengths remain above 0.4, which is suitable for model building. The lignin and cellulose correlation coefficients are stable around 7500 cm⁻¹, which is also suitable for model building. (3) Abnormal samples were screened using the Mahalanobis distance, the Hotelling T² statistic, and X-Y residuals, and a method based on a 3D view of X-Y residuals and leverage is proposed for abnormal sample selection. The results show that all four methods can effectively identify abnormal samples, although the four discriminant algorithms differ in which samples they select. For the quantitative models of cellulose, hemicellulose, and lignin, the 3D view method finds abnormal samples accurately and effectively. After the abnormal samples are removed, model precision improves: the calibration coefficient of determination R² increases by 7.76%, 12.09%, and 9.04%, respectively, and the cross-validation root mean square error (RMSECV) decreases by 0.076, 0.1731, and 0.0942. (4) The wavelet transform is used to denoise the spectra. The dbN (Daubechies), Haar, and Symlet wavelets are applied at different decomposition levels with the penalty global threshold, the Birgé-Massart threshold, and the default threshold to decompose and reconstruct the spectral signal, and the results are compared with traditional denoising methods. Finally, Symlet2 with two-level decomposition (Symlet2-2) is chosen for hemicellulose and lignin, and DB2 with two-level decomposition (DB2-2) is also used. 
After spectral processing, the validation-set coefficients of determination are reported as 0.462524, 0.653223, 0.6314158, and 0.462524, respectively. (5) Characteristic bands for cellulose, hemicellulose, and lignin are selected using the correlation coefficient method, interval partial least squares (iPLS), the successive projections algorithm (SPA), a genetic algorithm (GA), and moving window partial least squares (MWPLS). Among them, iPLS selects characteristic bands for the two cases of 50 and 70 intervals. SPA is applied to the full spectrum and to the bands optimized by iPLS, with the maximum number of effective wavelengths set to m_max = 10, 20, 30, and 40. GA selects characteristic wavelengths from the full spectrum and from the iPLS-optimized bands, for the cases evaluate = 50 and 100. Compared with models built on the original spectra, the validation results improve. (6) Calibration models for hemicellulose and cellulose are built with a back-propagation (BP) neural network and compared with partial least squares (PLS) models. The optimal BP configuration uses 20 hidden-layer nodes, a momentum factor of 0.6, a learning rate of 0.6, and 3500 training iterations. (7) For the lignin calibration model, support vector regression (SVR) is used and compared with PLS. The relationship among the penalty factor C, the kernel parameter R, and the validation-set root mean square error (RMSEP) is examined, and four kernel functions are compared. Improving biomass utilization in this way is significant for providing reliable data and technical support for biofuel production. It also offers guidance for more in-depth theoretical and practical work. |
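
Result (4) describes wavelet denoising as decomposition, thresholding, and reconstruction of the spectral signal. The following is a minimal sketch of that pipeline, not the implementation used in this work. It assumes the Haar wavelet, two decomposition levels, and soft thresholding with the universal threshold sigma * sqrt(2 ln n), which stands in here for the penalty, Birgé-Massart, and default thresholds named above. The function names (`haar_step`, `haar_denoise`, and so on) and the synthetic sine "spectrum" are hypothetical and chosen only for illustration.

```python
import math
import random


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; magnitudes below t become 0."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def haar_denoise(signal, levels=2):
    """Decompose, soft-threshold the detail coefficients, and reconstruct."""
    n = len(signal)
    if n == 0 or n % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")

    # Decomposition: details[0] holds the finest-scale coefficients.
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)

    # Assumption: noise level estimated from the finest details (median / 0.6745),
    # combined with the universal threshold sigma * sqrt(2 ln n).
    mags = sorted(abs(c) for c in details[0])
    mid = len(mags) // 2
    median = mags[mid] if len(mags) % 2 else 0.5 * (mags[mid - 1] + mags[mid])
    threshold = (median / 0.6745) * math.sqrt(2.0 * math.log(n))

    details = [soft_threshold(d, threshold) for d in details]

    # Reconstruction: undo the levels from coarsest to finest.
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical smooth "spectrum" with additive Gaussian noise.
    clean = [math.sin(2 * math.pi * i / 256) for i in range(256)]
    noisy = [c + random.gauss(0.0, 0.2) for c in clean]
    denoised = haar_denoise(noisy, levels=2)
    print("MSE before:", mean_squared_error(noisy, clean))
    print("MSE after: ", mean_squared_error(denoised, clean))
```

In practice a wavelet library would normally replace the hand-written transform, and the wavelet family, decomposition level, and threshold rule would be compared on validation-set statistics, as the abstract describes.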