| In recent years, driven by policy and technology, sales of new energy vehicles have risen year after year, and these vehicles have become the choice of many families buying a car. As the core component of a new energy vehicle, the battery determines the vehicle's performance and safety. Accurately estimating and predicting battery health status plays a vital role in preventing thermal runaway, enabling second-use of electric vehicle batteries, and controlling the costs of power stations, and it has become a hot research topic. At present, research on the health status of new energy vehicle batteries is mostly based on empirical, equivalent-circuit, and electrochemical models. Such methods are mainly applied in the laboratory and suffer from complex models, high data acquisition costs, and limited practical applicability. By comparison, data-driven methods have received less attention, although they offer clear advantages for this problem. First, they avoid modeling the complicated internal mechanisms and electrochemistry of the battery and estimate the health status directly. Second, as battery management systems have improved, rich and massive real-vehicle data can be collected at low cost, which supports data-driven modeling and tuning and makes it possible to mine the information contained in the data. Therefore, under existing conditions, it is worthwhile to study how data-driven methods can turn the data collected by the battery management system into a model that is closer to real operating conditions, is more accurate, and predicts battery health status. Using data from the battery management system (BMS) of new energy vehicles, this paper takes a data-driven approach: it extracts features that affect battery health status, builds multiple models, attempts model fusion, and proposes a method for estimating battery 
health status. In addition, the sequence of battery health indicators is used with several methods to predict future battery health. The main work of this paper is as follows: (1) Describe the research background of new energy vehicle battery health status, and clarify the main problems in current battery use and the significance of the research. Review the current state of research on power battery health as theoretical preparation for the subsequent work. Introduce the data-driven workflow, the machine learning and time series models used in this paper, and the analysis tools Python and R. (2) Complete data pre-processing and battery health indicator calculation for a new energy vehicle battery management system. Pre-processing includes extracting charging data, handling abnormal data, and handling missing data. The battery health indicator is calculated with the ampere-hour (Ah) integration method, the causes of abnormal points in the results are analyzed, and a sliding box plot is used to handle the outliers. (3) Build user profiles for new energy vehicle power batteries, study the factors that influence battery health status, and extract current and historical charging features, battery consistency features, and incremental capacity curve features of the charging segment. Four methods are used to build the battery health assessment model; after parameter tuning, the three tree-based models perform relatively well. Stacking model fusion is also tried to improve the generalization ability of the model. Finally, a method for estimating and updating the battery health indicator is proposed, which yields a relatively smooth health status evaluation curve. (4) Build models to predict future battery health. This paper mainly 
provides two prediction methods. The first decomposes the original battery health status sequence into multiple intrinsic mode functions through empirical mode decomposition (EMD) and builds a time series model for each one. The second uses a multi-step-prediction LSTM model. Compared with fitting a time series model directly, both methods cope better with data jitter and capture the trend of battery health degradation. Finally, the two models are compared from different perspectives. |
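
The health indicator calculation in item (2) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the trapezoidal integration rule, SOC expressed in percent, the definition of the indicator as estimated capacity divided by rated capacity, the box-plot fence factor of 1.5, and the choice to replace outliers with the window median are all assumptions made for illustration.

```python
from statistics import median


def charged_ah(currents_a, timestamps_s):
    """Ampere-hour integration: trapezoidal integral of current over time, in Ah."""
    total_as = 0.0  # accumulated charge in ampere-seconds
    for k in range(1, len(currents_a)):
        dt = timestamps_s[k] - timestamps_s[k - 1]
        total_as += 0.5 * (currents_a[k] + currents_a[k - 1]) * dt
    return total_as / 3600.0


def estimate_soh(currents_a, timestamps_s, soc_start, soc_end, rated_capacity_ah):
    """Health indicator for one charging segment (assumed definition).

    Estimated capacity = charged Ah / SOC change; indicator = capacity / rated capacity.
    SOC values are assumed to be percentages reported by the BMS.
    """
    delta_soc = (soc_end - soc_start) / 100.0
    if delta_soc <= 0:
        raise ValueError("charging segment must increase SOC")
    capacity_ah = charged_ah(currents_a, timestamps_s) / delta_soc
    return capacity_ah / rated_capacity_ah


def _quantile(sorted_vals, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def sliding_boxplot_filter(values, window=5, k=1.5):
    """Replace points outside the local box-plot fences with the window median.

    Each point is judged against a window centred on it (truncated at the ends).
    Windows are always taken from the original values, not the cleaned ones.
    """
    half = window // 2
    cleaned = []
    for i, v in enumerate(values):
        local = sorted(values[max(0, i - half): i + half + 1])
        q1, q3 = _quantile(local, 0.25), _quantile(local, 0.75)
        iqr = q3 - q1
        if v < q1 - k * iqr or v > q3 + k * iqr:
            cleaned.append(median(local))  # assumed handling of an outlier
        else:
            cleaned.append(v)
    return cleaned


if __name__ == "__main__":
    # Hypothetical segment: constant 50 A for one hour, SOC rising from 20 % to 70 %.
    soh = estimate_soh([50, 50, 50], [0, 1800, 3600], 20, 70, rated_capacity_ah=120)
    print(round(soh, 4))
    # Hypothetical indicator sequence containing one abnormal point.
    print(sliding_boxplot_filter([0.95, 0.94, 0.95, 0.60, 0.94, 0.93, 0.94]))
```

In this sketch a short window keeps the fences local, so a slow degradation trend is not flagged while an isolated abnormal point is. The window length and fence factor would need tuning on real BMS data.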