Hepatitis B virus (HBV) infection is a global health problem, with 2 billion people infected worldwide and more than 350 million suffering from chronic HBV infection. As the 10th leading cause of death worldwide, HBV infection results in 500,000 to 1.2 million deaths per year from chronic hepatitis, cirrhosis, and hepatocellular carcinoma (HCC). HBV belongs to the genus Orthohepadnavirus of the family Hepadnaviridae. Because the HBV DNA polymerase lacks a proofreading function, mutations arise readily during HBV replication. At present, HBV is classified into eight genotypes, A through H, based on a nucleotide sequence divergence of greater than 8% over the entire viral genome or more than 4% over the S region. Within a genotype, a nucleotide sequence divergence of at least 4% but less than 8% is used to classify strains further into subgenotypes. One of the most characteristic features of the eight currently known HBV genotypes is their distinct geographical distribution and population specificity. It is therefore necessary to study further the spatial genetic structure of HBV heterogeneity across different geographic regions and populations.
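The divergence thresholds above can be illustrated with a minimal Python sketch. It assumes two already aligned complete-genome sequences and uses the simple proportion of differing sites (p-distance) as the divergence measure. The function names `p_distance` and `classify_pair` are hypothetical, and actual genotyping compares a strain against reference sequences in a phylogenetic analysis rather than against a single partner. The text does not say how a divergence of exactly 8% is handled, so the sketch treats it as subgenotype-level.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where both sequences carry an unambiguous base,
    # so gaps ('-') and ambiguity codes are skipped.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def classify_pair(divergence: float) -> str:
    """Apply the whole-genome thresholds quoted in the text to one pairwise divergence."""
    if divergence > 0.08:
        return "different genotypes"
    # Assumption: exactly 8% is not covered by the text; it falls here.
    if divergence >= 0.04:
        return "same genotype, different subgenotypes"
    return "same subgenotype"


# Toy sequences for illustration only (real HBV genomes are about 3,200 bases long).
d = p_distance("ACGTACGTAC", "ACGTACGTAA")
print(d, classify_pair(d))
```

The same thresholds would be applied to whole-genome divergences computed under whichever substitution model a given study uses; the p-distance here is only the simplest choice.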
The study of the spatial genetic structure of HBV variability has theoretical significance in several respects: elucidating the spatial heterogeneity of HBV variability, exploring its spatial tracks, analyzing the relationship between HBV tracks and human migrations, and delineating the geographic boundaries of HBV spatial heterogeneity. It also provides a scientific basis for regionalized HBV prevention measures (including multi-epitope HBV vaccines, tracing of HBV infection sources in molecular epidemiology, and regionalized, personalized HBV treatment protocols).
At present, most researchers study the geographic genetic structure of HBV variability by reconstructing phylogenetic trees with traditional clustering methods, identifying the genetic structure from the topology of the tree, and then analyzing whether HBV sequences with higher homology correspond to specific geographic regions. However, the traditional clustering methods used in these studies do not take into account the spatial coordinates of the locations (populations) from which the HBV strains were isolated, and therefore cannot reveal the spatial genetic structure of HBV heterogeneity.
In this study, within a GIS framework, a 2-D graphic minimal spanning tree model together with an improved Monmonier's algorithm model is used to analyze the spatial genetic structure of HBV heterogeneity from two aspects: the connectivity of the spatial structure and the identification of its boundaries.
The main results:
1. Geographic distribution characteristics of HBV genotypes and subgenotypes
The phylogenetic tree of HBV complete genome sequences is clearly structured. It is divided into eight branches, each corresponding to a particular genotype and its subgenotypes, and most HBV genotypes and subgenotypes have their own dominant geographical regions.
The results indicate that, worldwide, the genetic structure of HBV sequence variability has a relatively independent genotype and subgenotype topology, and that recombinants occur in certain branches under the influence of factors such as host immune pressure.
2. Spatial genetic structure of HBV sequence variability
(1) The 2-D graphic minimal spanning tree of the HBV sequences of all genotypes and subgenotypes collected worldwide shows characteristic spatial genetic structure.
① The whole minimal spanning tree is partitioned into four branches: the America branch, the Africa branch, the Europe branch and the Asia-Oceania branch.
② Each branch shows clear geographic structure, and clear geographic and social isolation is evident between branches. The oceans are the main natural barriers responsible for the geographic isolation of HBV variability. For example, although New Zealand, Australia and the South Pacific islands (sampling locations No. 2, 8, 9, 10 and 15) are adjacent to the American continent, no branches connect them because of the isolation imposed by the ocean. Likewise, although South Africa, Madagascar, Malawi, Uganda and Somalia (sampling locations No. 3, 6, 11, 19 and 24 in Africa) are adjacent to Southeast Asia and Oceania, no branches connect them because of the isolation imposed by the ocean. In contrast, HBV strains within a continent, where no ocean intervenes, are generally connected. For instance, HBV strains from all sampling locations on the African continent are connected, and the whole Africa branch is connected to the Asia-Europe branch through the Middle East. Similarly, HBV strains from all sampling locations in Oceania are connected, and the Oceania branch is connected to the Asia-Europe branch. HBV strains from South and North America also form a connected branch.
This study therefore reveals essential regularities of HBV biogeography. The results are of great significance for a deeper understanding of HBV evolution, geographic isolation, and the spatial relationship between HBV and its hosts. They are also of practical value for regionalized HBV prevention measures (including multi-epitope HBV vaccines, tracing of HBV infection sources in molecular epidemiology, and regionalized, personalized HBV treatment protocols).
(2) Most HBV genotypes and subgenotypes also show significant spatial genetic structure.
① The minimal spanning tree of HBV genotype A is partitioned into three branches: the Africa branch, the Europe branch and the Asia-North America branch. Each branch shows clear geographic structure, and clear geographic and social isolation is evident between branches. For instance, HBV strains from Africa and from Europe each form a comparatively independent branch. However, despite the isolation imposed by the Pacific between the Asian and American continents, many branches still connect the two. This indicates that the high similarity of HBV sequences between Asia and America might be related to frequent population migration and mixing in modern times, and possibly also to the wide host adaptability of this genotype.
② The minimal spanning tree of HBV genotype B shows that, although a few branches connect to HBV strains in Europe and America, the principal part of the tree remains in Asia, and clear isolation is evident between branches.
③ The 2-D minimal spanning tree of HBV genotype C is partitioned into two branches: the Asia branch and the Oceania branch. Each branch shows clear geographic structure, and clear geographic and social isolation is evident between branches.
HBV strains from Asia and from Oceania each form a comparatively independent branch.
④ The 2-D minimal spanning tree of HBV genotype D is partitioned into two relatively independent branches: the Asia-Africa-Europe-America branch and the Oceania branch. The hosts of HBV in Australia, America and Europe are Caucasian, whereas the hosts in Asia are of Asian ancestry. This indicates that HBV genotype D has wide host adaptability.
⑤ The 2-D minimal spanning tree of HBV genotype E forms an independent branch that is restricted to Africa.
⑥ The 2-D minimal spanning tree of HBV genotype F is restricted to Central and South America, although a few branches connect to HBV strains in Europe and America.
These results are similar to the conclusions of traditional studies of HBV geographic distribution. However, they further show that most genotypes and subgenotypes have specific susceptible hosts. They are of great theoretical and practical significance for developing multi-epitope HBV vaccines and for adopting regionalized, personalized HBV treatment protocols for specific populations.
3. The spatial tracks of HBV sequence variability
(1) The spatial tracks of the HBV sequences of all genotypes and subgenotypes mainly follow particular directions along the coastlines.
HBV strains on the African continent travel southward along the west coast of Africa to South Africa, and then northward along the east coast to the Middle East. There the track diverges into two branches: one extends across the Eurasian continent to the European countries, and the other extends into South Asia and Southeast Asia and gradually northward to northern China and even Russia. (This second track itself separates into two routes: the west route extends to western China, runs northward along the Tibeto-Burman corridor, and then extends northwestward along the Silk Road to Siberia; the east route extends northward from Southeast Asia along the east coast to northern China, and at the same time extends through the South Pacific island countries until it reaches Oceania.) The Asia branch runs northward along the east coast of Asia to the Kamchatka Peninsula in Russia, crosses the Bering Strait to North America, and finally reaches South America. The spatial tracks of HBV variability basically coincide with the gene flow tracks of the human migration out of Africa. These results indicate that HBV variability has spread around the world with human migration, forming tracks that coincide with population migration. This is of great significance for further study of the evolutionary history of HBV sequences.
(2) The results also indicate that, although most HBV genotypes and subgenotypes have their own dominant geographical regions, a given genotype still shows particular spatial tracks in different locations.
① The minimal spanning tree of all complete genome sequences of HBV genotype A collected worldwide is connected and shows clear spatial tracks of sequence variability. The tracks mainly follow particular directions along the coastlines. The complete genome sequences of HBV genotype A on the African continent diverge into two branches.
One branch runs along the west coast of Africa to Gambia, turns eastward to Somalia, and then extends northward along the east coast to the Middle East; the other branch extends to the Middle East along the east coast of Africa. The two branches merge in the Middle East and then continue to South Asia and Southeast Asia. HBV strains on the European continent extend southward along the Mediterranean coast to the Asian continent. The many connecting channels between genotype A strains on the Asian and African continents indicate a close relationship between them.
② HBV genotype B is mainly distributed in Asia, and its spatial tracks are not obvious. This indicates that HBV genotype B has particular susceptible populations (Asian populations).
③ HBV genotype C is mainly restricted to Asia, and its spatial tracks are not obvious. This indicates that HBV genotype C has particular susceptible populations: Asian populations and the indigenous populations of Oceania are prone to infection with genotype C.
④ HBV genotype D shows connected tracks in the latitudinal direction, which could be related to population migration and mixing.
⑤ HBV genotype E extends westward and northward from Madagascar, crosses the Mozambique Channel, and reaches Southwest Africa (Angola, Namibia and Zambia) and West Africa (Senegal, Côte d'Ivoire, Ghana, Benin, Nigeria and Cameroon). This indicates the directional expansion and epidemic tracks of HBV genotype E.
⑥ The spatial tracks of HBV genotype F run in a south-north direction through Central and South America.
These results help in further studying the evolutionary history and spatial flow tracks within a single genotype, and in searching for HBV tracing biomarkers for molecular epidemiology.
4. Geographic boundaries of HBV sequence variability
(1) This study indicates that HBV sequences are separated by many statistically significant geographic boundaries at the global scale.
These geographic boundaries partition the world into several relatively independent, genetically homogeneous geographic regions.
① The variability of all HBV sequences collected worldwide divides into two regions separated by the Pacific: the New World region (South and North America) and the Old World region (Asia, Africa and Europe).
② HBV sequences in the New World region show higher homogeneity.
③ In the Old World region, the spatial structure of HBV sequences is complicated, spatial heterogeneity is higher, and the region is further divided into different regions by many sub-boundaries. Among these, Africa forms one comparatively independent homogeneous region, and Europe, Asia and Oceania form another. Siberia and the parts of North America close to the Arctic Circle form a further comparatively independent homogeneous region.
These results basically coincide with the geographic boundaries of human population genetic structure. They indicate that HBV sequence variability agrees with the genetic structure of the hosts: particular genetically homogeneous populations are prone to infection with HBV of a certain genotype and subgenotype, and a given HBV genotype and subgenotype tends to prevail and persist in specific populations.
(2) Geographic boundary analysis is also used to identify the geographic boundaries of HBV sequence variability within each genotype. The results show that there are many statistically significant geographic boundaries within a single genotype, and that these boundaries delimit particular geographic regions of higher genetic homogeneity.
① HBV genotype A and its subgenotypes are divided by the geographic boundaries into several comparatively homogeneous regions. Boundary Ⅰ in Figure 22 divides them into two major regions, a south region and a north region.
The north region contains Central Asia, East Asia, Siberia and North America, and HBV subgenotype A2 is mainly distributed there. The south region, enclosed by boundaries Ⅰ, Ⅱ and Ⅲ, contains Southeast Asia, the Middle East and Eastern and Southern Africa, and HBV subgenotype A1 is mainly distributed there. The homogeneous region in Northwest Africa is enclosed by boundaries Ⅲ and Ⅴ, and HBV subgenotype A3 is mainly distributed there. The homogeneous region in West Europe is enclosed by boundaries Ⅳ, Ⅱ and Ⅴ, and HBV subgenotype A2 is mainly distributed there. In fact, the north region and the homogeneous region in West Europe are geographically very close.
② HBV genotype B and its subgenotypes are divided successively into several comparatively homogeneous regions by approximately parallel southwest-northeast geographic boundaries. The homogeneous region in North America is enclosed by boundary Ⅰ in Figure 23, and HBV subgenotype B6 is mainly distributed there. The homogeneous region enclosed by boundaries Ⅰ and Ⅱ mainly contains Japan, and HBV subgenotype B1 is mainly distributed there. The homogeneous region enclosed by boundaries Ⅱ, Ⅲ and Ⅳ mainly contains China, and HBV subgenotype B2 is mainly distributed there. The homogeneous region enclosed by boundaries Ⅲ and Ⅴ mainly contains Southeast Asia, and HBV subgenotypes B3 and B5 are mainly distributed there. The homogeneous region enclosed by boundaries Ⅳ, Ⅴ, Ⅵ and Ⅶ mainly contains Vietnam, and HBV subgenotype B4 is mainly distributed there. The homogeneous region enclosed by boundaries Ⅵ and Ⅶ mainly contains Thailand and Switzerland, and HBV subgenotype B2 is mainly distributed there.
③ HBV genotype C is restricted to Asia and Oceania, yet is still divided into two major regions, a south region and a north region, by a southeast-northwest boundary.
The north homogeneous region contains Japan, Korea, northern China and Uzbekistan, and HBV subgenotype C2 is mainly distributed there. In this region, the spatial structure of HBV sequences is complicated, spatial heterogeneity is higher, and the region is further divided by many sub-boundaries. Among the resulting sub-regions, Hawaii forms one comparatively independent homogeneous region, and Okinawa, Japan forms another. The south homogeneous region contains Southeast Asia, southern China, Australia and the South Pacific islands, and HBV subgenotypes C1, C3, C4 and C5 are mainly distributed there. In this region too, the spatial structure of HBV sequences is complicated, spatial heterogeneity is higher, and the region is further divided by many sub-boundaries. Among the resulting sub-regions, Southeast Asia forms a comparatively independent homogeneous region in which subgenotype C1 is mainly distributed; the Philippines forms one in which subgenotype C5 is mainly distributed; Australia forms one in which subgenotype C4 is mainly distributed; and the South Pacific islands form one in which subgenotype C3 is mainly distributed.
④ HBV genotype D is widely distributed around the world and is divided by boundaries Ⅰ, Ⅱ, Ⅲ and Ⅳ in Figure 25 into several relatively homogeneous regions: America, Africa, South Asia-Southeast Asia (India), Australia, and Europe-Central Asia-East Asia-Siberia.
In the Europe-Central Asia-East Asia-Siberia homogeneous region, the spatial structure of HBV sequences is complicated, spatial heterogeneity is higher, and the region is further divided by many sub-boundaries.
⑤ HBV genotype E is restricted to Africa and is divided by boundary Ⅰ in Figure 26 into two comparatively independent homogeneous regions, a south region and a north region. In the north homogeneous region, the spatial structure of HBV sequences is complicated, and the region is further divided by many sub-boundaries. Subgenotypes of HBV genotype E have not yet been defined, but genotype E nevertheless shows comparatively independent homogeneous subpopulations.
⑥ HBV genotype F is mainly distributed in Central and South America and is divided by boundary Ⅰ in Figure 27 into two comparatively independent homogeneous regions, a south region and a north region. The north region contains Central America and the USA, and HBV subgenotype F1 is mainly distributed there. The south region contains Central and South America, and HBV subgenotype F2 is mainly distributed there.
These results indicate that, although most genotypes have a limited geographical distribution, geographic boundaries still exist within those limited regions. The boundaries partition the range of a given genotype into regions of higher genetic homogeneity, in each of which a corresponding HBV subgenotype is found. HBV subgenotypes also have particular susceptible populations. The results help in further analyzing HBV variability in different geographic sub-regions, in developing multi-epitope HBV vaccines, and in adopting regionalized, personalized HBV treatment protocols.
The main conclusions:
1.
The genetic structure of HBV sequence variability worldwide has a relatively independent genotype and subgenotype topology, and recombinants occur in certain branches under the influence of factors such as host immune pressure.
2. Both the variability of all genotypes taken together and the variability of each individual genotype show particular spatial genetic structure. Clear geographic and social isolation is evident between branches of the 2-D graphic minimal spanning trees. The oceans are the main natural barriers responsible for the geographic isolation of HBV variability.
3. Both the variability of all genotypes taken together and the variability of each individual genotype show clear spatial tracks. The spatial tracks of all HBV sequence variability mainly follow particular directions along the coastlines and basically coincide with the gene flow tracks of the human migration out of Africa. This indicates that HBV variability has spread around the world with human migration, forming tracks that coincide with population migration. This is of great significance for further study of the evolutionary history and spatial spread of HBV sequences, and for the search for tracing markers for molecular epidemiology.
4. The HBV variability of all genotypes together, and of each genotype, can be partitioned by many statistically significant geographic boundaries into several geographic regions of high genetic homogeneity. These boundaries and regions basically coincide with the geographic boundaries of human migration. This indicates that HBV sequence variability is consistent with the genetic structure of the hosts: particular genetically homogeneous populations are prone to infection with HBV of a certain genotype and subgenotype, and HBV of a given genotype and subgenotype tends to prevail and persist in particular populations.
5.
The spatial genetic structure of HBV heterogeneity contributes to a deeper understanding of HBV evolution, geographic isolation, and the spatial relationship between HBV and its hosts. It is also helpful for developing multi-epitope HBV vaccines, establishing regionalized, personalized HBV treatment protocols, and searching for HBV tracing biomarkers for molecular epidemiology.
The main innovations:
1. This study confirms that the HBV variability of all genotypes together, and of each genotype, has particular spatial genetic structure. Clear geographic structure and clear geographic and social isolation are evident between branches of the 2-D minimal spanning trees. The oceans are the main natural barriers responsible for the geographic isolation of HBV variability.
2. This study depicts the spatial tracks of HBV variability for all genotypes together and for each genotype. The tracks mainly follow particular directions along the coastlines and basically coincide with the gene flow tracks of the human migration out of Africa. This contributes to further study of the evolutionary history and spatial flow tracks within a single HBV genotype, and to the search for HBV tracing biomarkers for molecular epidemiology.
3. The geographic boundaries and homogeneous geographic regions of HBV variability have been drawn for all genotypes together and for each genotype. These boundaries and regions basically coincide with the geographic boundaries of human migration. This indicates that HBV sequence variability is consistent with the genetic structure of the hosts: particular genetically homogeneous populations are prone to infection with HBV of a certain genotype and subgenotype, and HBV of a given genotype and subgenotype tends to prevail and persist in particular populations.
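As a rough illustration of the minimal spanning tree idea used throughout this abstract, the following Python sketch builds a minimum spanning tree over sampling locations with Prim's algorithm, taking pairwise genetic distances as edge weights. This is only one plausible reading: the abstract does not specify how the 2-D graphic minimal spanning tree was constructed, and the function name, location labels and distance values below are invented for illustration.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns a list of (i, j, weight) edges, where i was already in the
    tree when node j was attached to it.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = {0}          # start growing the tree from node 0
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that joins a node inside the tree to one outside it.
        w, i, j = min((dist[i][j], i, j)
                      for i in in_tree
                      for j in range(n) if j not in in_tree)
        edges.append((i, j, w))
        in_tree.add(j)
    return edges


# Hypothetical sampling locations and pairwise genetic distances
# (illustrative values only, not data from the study).
locations = ["site_1", "site_2", "site_3", "site_4"]
distances = [
    [0.00, 0.02, 0.09, 0.10],
    [0.02, 0.00, 0.08, 0.11],
    [0.09, 0.08, 0.00, 0.03],
    [0.10, 0.11, 0.03, 0.00],
]

tree = minimum_spanning_tree(distances)
for i, j, w in tree:
    print(f"{locations[i]} <-> {locations[j]}: {w}")
```

In a tree like this, an unusually long edge (here the one joining site_2 and site_3) marks the weakest genetic link between two groups of locations, which may correspond to the kind of separation between branches described above. Boundary detection with Monmonier's algorithm is a separate step that this sketch does not attempt.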