| Whether wide-bandgap semiconductors (such as SiC and GaN) can operate at high temperature for long periods depends largely on the packaging technology and the choice of packaging materials. Nano-silver sintering and copper sintering are common joining techniques for high-temperature applications. Adding certain trace elements during sintering can produce solute segregation, which accelerates sintering, shortens sintering time, and improves sintered density and package quality. The tendency toward solute segregation is determined by the segregation energy. Machine learning, now a widely used tool in materials science, is well suited to studying solute segregation energy in depth. Only a few such studies exist; they assemble every available feature in order to capture as much information about the segregation energy as possible, but are likely to suffer from the curse of dimensionality. We propose an efficient, physics-based approach to feature extraction for machine learning of solute segregation energy, in which a few physics-informed features achieve an excellent balance between accuracy and data dimension, and thus computational cost. The main contents and results are as follows. We constructed polycrystalline matrix models of nine binary alloys, such as AgNi, and used molecular dynamics to calculate the segregation energy of grain boundary atoms and adjacent crystal atoms at more than 2 million sites. From these calculations, solute segregation energy data sets were established for the nine binary alloys. Physics-informed (PI) features (atomic volume, atomic energy, and disorder factors) were extracted from the local atomic environment on the basis of physical understanding. For comparison with the machine learning models built on PI features, Spectral Neighbor Analysis Potential (SNAP) parameters were also extracted. Across three machine learning algorithms and nine alloy systems, the machine learning results showed that PI features were significantly better than SNAP features when used alone or collectively, and that the 
highest accuracy was always obtained by combining PI features with SNAP parameters. Using only a few PI features is far more accurate than using the many features and the SNAP features employed in the literature, and gives the best balance between accuracy and feature dimension, to a degree that depends on the machine learning algorithm and the alloy system. Redundancy analysis shows that PI features contain much less redundant information than SNAP parameters. SNAP features also encode interatomic forces, stress, and other information, whereas PI features are derived solely from energy calculations in the equilibrium state. Pearson correlation analysis between the features and the segregation energy shows that the three PI features are closely correlated with the segregation energy yet independent of one another, because each affects the segregation energy through a different physical mechanism. Solute segregation is a physical problem, and this work demonstrates the advantage of integrating physics into machine learning from the perspective of the features rather than the machine learning algorithms. The PI features perform especially well for the AgNi alloy because its potential function is based on the most accurate first-principles calculations available; machine learning is expected to perform better for the other eight alloys if their potential functions are obtained in the same way. |
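
The Pearson correlation analysis mentioned above can be illustrated with a minimal sketch. It assumes a per-site feature matrix with three columns standing in for atomic volume, atomic energy, and a disorder factor, plus a vector of segregation energies. The function name `pearson_table`, the synthetic data, and the coefficients are hypothetical placeholders, not the data or code of this work.

```python
import numpy as np


def pearson_table(features, target):
    """Pearson correlations for a feature matrix and a target vector.

    Returns a pair:
      - correlation of each feature column with the target, shape (n_features,)
      - feature-feature correlation matrix, shape (n_features, n_features)
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    # Append the target as the last column, then correlate all columns at once.
    stacked = np.column_stack([features, target])
    corr = np.corrcoef(stacked, rowvar=False)
    return corr[:-1, -1], corr[:-1, :-1]


# Synthetic stand-in data (NOT results from this work): three independent
# columns play the role of the three PI features at each candidate site.
rng = np.random.default_rng(0)
n_sites = 5000
X = rng.normal(size=(n_sites, 3))

# Hypothetical segregation energy that depends on all three features plus noise.
e_seg = 0.6 * X[:, 0] - 0.5 * X[:, 1] + 0.4 * X[:, 2] + 0.1 * rng.normal(size=n_sites)

r_target, r_features = pearson_table(X, e_seg)
print("feature vs. segregation energy:", np.round(r_target, 2))
print("feature vs. feature:")
print(np.round(r_features, 2))
```

In this toy setting, large entries in the first output together with near-zero off-diagonal entries in the second correspond to the pattern the abstract describes: features that are each closely related to the segregation energy while remaining largely independent of one another.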