| Protein function prediction has been an active research area in computational biology in recent years. With the widespread application of deep learning across many domains, researchers have begun to integrate deep learning techniques into protein function prediction methods. Gene Ontology (GO) has emerged as one of the primary frameworks for describing protein functions and their relationships, and it provides strong support for deep learning approaches. In traditional protein function prediction methods, GO term representation learning plays a crucial role. The objective is to learn, for each functional label, a low-dimensional dense vector representation (an embedding) that encapsulates rich semantic information. However, existing GO term embedding methods primarily consider ancestor co-occurrence information and do not fully capture the topological information in the GO directed acyclic graph (DAG). To address this issue, we propose a novel GO term representation learning method, PO2Vec, which leverages partial order relations to enhance the representation of GO terms. Extensive evaluations demonstrate that PO2Vec outperforms existing embedding methods in various downstream biological tasks. Building on PO2Vec, we further develop a new protein function prediction method, PO2GO, which exhibits superior performance in terms of multiple benchmark metrics, annotation specificity, and prediction with limited samples. These results underscore the importance of high-quality GO term representations in computational tasks such as protein annotation. In summary, this research introduces an improved deep learning-based protein function prediction method, PO2GO, and demonstrates its superiority across multiple tasks. The study provides a valuable reference for understanding the relationship between protein function and amino acid sequences, and offers a new avenue for related research in computational biology. Key
contributions of this research include: 1. Introduction of a novel model, PO2Vec, for GO term representation learning. 2. Comprehensive capture of GO topological information through within-path and between-path partial order constraints, surpassing existing GO DAG-based approaches. The effectiveness of PO2Vec is demonstrated through experimental analyses from five perspectives. 3. Proposal of a novel protein function annotation prediction model, PO2GO, constructed by integrating PO2Vec with the pre-trained protein language model ESM-1b. The superior performance of PO2GO is validated through benchmark comparisons. |
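
To make the notion of a partial order on a DAG concrete, the following minimal sketch builds a toy ontology and separates term pairs that lie on a common ancestor path (comparable under the order) from pairs that do not (incomparable). It is an illustration only and not the PO2Vec training procedure: the term names, the `PARENTS` mapping, and the helper functions are hypothetical, and reading "within-path" as comparable pairs and "between-path" as incomparable pairs is an assumption made here for the example.

```python
from itertools import combinations

# Hypothetical toy DAG: each term maps to its direct parents.
# These are illustrative names, not real GO identifiers.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "ion_binding": ["binding"],
    "metal_ion_binding": ["ion_binding"],
    "kinase_activity": ["catalysis"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent edges."""
    seen = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def comparable(a, b, parents=PARENTS):
    """True when one term is an ancestor of the other, i.e. they share a path."""
    return a in ancestors(b, parents) or b in ancestors(a, parents)


def split_pairs(parents=PARENTS):
    """Partition all unordered term pairs into same-path and different-path pairs."""
    same_path, different_path = [], []
    for a, b in combinations(sorted(parents), 2):
        if comparable(a, b, parents):
            same_path.append((a, b))
        else:
            different_path.append((a, b))
    return same_path, different_path


same_path, different_path = split_pairs()
```

In this toy graph, `metal_ion_binding` and `binding` are ordered because one is an ancestor of the other, whereas `ion_binding` and `kinase_activity` share only the root and are therefore incomparable. An embedding method that uses only ancestor co-occurrence would not distinguish these two kinds of pairs explicitly.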