| Bioaccumulation parameter values are essential data for chemical risk assessment and management. However, obtaining them experimentally is time-consuming, costly, and requires large numbers of test animals, so experimental methods alone cannot meet the demands of chemicals management. Quantitative structure-activity relationship (QSAR) modeling is a core component of computational toxicology and is expected to play a significant role in the high-throughput acquisition of bioaccumulation parameter values. In this study, bioaccumulation parameter databases were established, and QSAR models for predicting the bioaccumulation parameters of organic chemicals in fish were constructed using molecular structure descriptors and machine learning algorithms. The main research contents are as follows: (1) Developing ensemble models to predict the bioconcentration factor (BCF) of organic chemicals in fish. Experimental BCF values of 1384 organic chemicals in different fish species were collected from the literature and open-source databases, and a BCF database recording the test fish species, experimental conditions, and data sources was constructed. Five individual models and 11 ensemble models for the BCF of organic compounds in fish were developed using Dragon descriptors and machine learning algorithms such as random forest and support vector machine. The performance of the optimal ensemble model was evaluated and its application domain was characterized. The results show that the ensemble models have better goodness-of-fit, robustness, and predictive ability, as well as a wider application domain, than the individual models. The optimal ensemble model was then used to screen chemicals in the Inventory of Existing Chemical Substances of China (IECSC) for bioaccumulation potential. (2) Developing multi-task neural network models that simultaneously predict the BCF and biomagnification factor (BMF) of organic chemicals in fish. Based on the
backpropagation neural network, single-task neural network models were established to predict the BCF and BMF of organic chemicals in fish using different molecular fingerprints and Dragon descriptors. Two types of multi-task neural network models that predict BCF and BMF simultaneously were then developed: a multi-task learning model with single-input-multiple-output (SIMO-MT) and a multi-task learning model with multi-input-multiple-output (MIMO-MT). The results show that most multi-task models outperform the single-task models in prediction performance, indicating that the BCF and BMF prediction tasks share related information; during training, the two tasks reinforce each other and jointly improve predictive performance. Moreover, the MIMO-MT model has better predictive capacity than the SIMO-MT models, so it may have greater development potential. The application domain of the MIMO-MT model was characterized on the basis of molecular similarity, and setting an appropriate application domain range was found to significantly improve the performance of the QSAR models. All models for predicting fish bioaccumulation parameters in this study were constructed following the OECD guidelines on the development and validation of QSARs and underwent rigorous model evaluation and application domain characterization, addressing the lack of error analysis and application domain characterization in existing models. These models can provide the data needed to evaluate the bioaccumulation capacity of chemicals and support sound chemical risk assessment and management. |
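
As a rough illustration of what a similarity-based application domain can look like, the sketch below flags a query compound as in-domain when its mean Tanimoto similarity to its nearest training compounds reaches a cutoff. The abstract does not state which similarity metric, cutoff, or number of neighbors was used, so the Tanimoto coefficient, the `threshold` and `k` parameters, and the function names here are all assumptions made for illustration, not the study's method. Fingerprints are represented simply as sets of "on" bit indices.

```python
from typing import FrozenSet, Sequence

# A binary fingerprint, stored as the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(a & b) / union


def in_domain(
    query: Fingerprint,
    training: Sequence[Fingerprint],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the study
    k: int = 1,              # hypothetical number of nearest neighbors
) -> bool:
    """Return True if the mean similarity of the query to its k most
    similar training compounds is at least `threshold`."""
    sims = sorted((tanimoto(query, t) for t in training), reverse=True)
    nearest = sims[:k]
    if not nearest:
        return False  # no training compounds, so nothing is in-domain
    return sum(nearest) / len(nearest) >= threshold


# Toy fingerprints (made-up bit indices, for demonstration only).
training_set = [
    frozenset({1, 2, 3, 4}),
    frozenset({2, 3, 5}),
    frozenset({10, 11, 12}),
]
query_close = frozenset({1, 2, 3})
query_far = frozenset({20, 21})

print(in_domain(query_close, training_set))
print(in_domain(query_far, training_set))
```

Raising `threshold` shrinks the domain: fewer compounds receive predictions, but those that do are closer to the training data, which is one way a tighter application domain can improve reported performance.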