| High-entropy alloys have complex multiscale microstructures and highly tunable properties, and therefore have great development potential. However, their development still relies on trial and error, which lacks effective guidance and is inefficient. Machine learning is a data-driven material design technique that has been used for phase composition prediction, mechanical property prediction and optimization, assisting simulation calculations, and related tasks. However, the insufficient and unbalanced distribution of available data and the limitations of the models introduce significant uncertainty into machine learning-based composition optimization strategies. To address this, this paper takes supervised learning methods as the core and applies surrogate model optimization and semi-supervised learning methods to the hardness prediction and optimization of high-entropy alloys, ultimately establishing a framework for this task. In materials research, it is desirable to predict materials whose properties fall within specific intervals, especially those that push the limits of existing properties. Multiple machine learning models were evaluated using the leave-one-cluster-out cross-validation (LOCO CV) method. The models performed worse near the upper hardness limit of the dataset than near the lower limit, owing to the heterogeneous distribution of composition features and hardness in the dataset. The results of an optimization search of the Al-Cr-Fe-Ni-V design space using two intelligent optimization algorithms are consistent with this expectation. The composition range of candidate samples (referred to as the design space) is usually determined from expert experience, which increases the risk of the design process. The PFIC and CMLI metrics are used to select high-quality and low-quality design spaces from existing datasets and to perform
simulated surrogate-based modeling on them. It was found that the maximum number of iteration rounds for the high-quality design spaces is smaller than the minimum number of iteration rounds for the low-quality design spaces, which demonstrates the effectiveness of using the PFIC and CMLI criteria to guide surrogate model optimization. Obtaining sample hardness values through hardness testing is costly and slow. To rapidly increase the amount of data, a low-quality design space is automatically labeled using a semi-supervised learning method, and the resulting data are used to retrain the model. Compared with the baseline model, the retrained model has a higher R² and lower RMSE and MAE, demonstrating the effectiveness of semi-supervised learning for improving model performance. Finally, based on the above methods, a framework for hardness prediction and optimization of 3d transition metal high-entropy alloys was established, with design space evaluation criteria as the core; it enables efficient exploration of the design space while improving the performance of the hardness prediction model. |
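
The leave-one-cluster-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the small k-means routine, the ridge regressor, the synthetic five-element compositions, and the made-up hardness values are all assumptions chosen only to keep the example self-contained. The idea it shows is that samples are first grouped by composition, and each group is then held out in turn, so the model is always scored on a region of composition space it has not seen.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50, seed=0):
    """Tiny k-means (Lloyd's algorithm); returns one cluster label per row.
    Stand-in for whatever clustering the original study used (assumption)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each sample to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression with an intercept column (illustrative model)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)

def predict_ridge(w, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def loco_cv(X, y, labels):
    """Hold out one cluster at a time; return {cluster: RMSE on that cluster}."""
    scores = {}
    for c in np.unique(labels):
        test = labels == c
        # Train only on samples outside the held-out cluster.
        w = fit_ridge(X[~test], y[~test])
        err = predict_ridge(w, X[test]) - y[test]
        scores[int(c)] = float(np.sqrt(np.mean(err ** 2)))
    return scores

# Synthetic data (hypothetical): atomic fractions of 5 elements and a fake hardness.
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(5), size=120)
y = 300 + 400 * X[:, 0] + 200 * X[:, 4] + rng.normal(0, 10, size=120)

labels = kmeans_labels(X, k=4)
scores = loco_cv(X, y, labels)
for c, rmse in scores.items():
    print(f"cluster {c}: held-out RMSE = {rmse:.1f}")
```

Comparing the per-cluster errors, rather than a single pooled score, is what makes it possible to see whether a model degrades in particular regions of the data, such as clusters dominated by the hardest alloys.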