
Analysis Of Multi-dimensional Characteristics Of Hidden Populations Based On Community Mining

Posted on: 2018-11-30  Degree: Master  Type: Thesis
Country: China  Candidate: C C Liu  Full Text: PDF
GTID: 2416330623450741  Subject: Management Science and Engineering
Abstract/Summary:
At present, most studies of hidden populations rely on field interviews and questionnaires. However, because the target populations are hidden and hard to reach, these methods tend to be inefficient and to yield small samples. Moreover, because of strong privacy concerns, the authenticity of the survey results cannot be guaranteed. There is an urgent need for a more scientific and effective method to study the behavior of hidden populations.

With the development of the Internet, social life has changed tremendously. More and more people, including people living with HIV/AIDS, MSM (men who have sex with men), and other hidden groups, are accustomed to publishing, delivering, and sharing information in a variety of virtual communities. The large volume of user data gathered in online communities provides a new way to study hidden populations and helps overcome the difficulty of contacting them.

In this study, focusing on the HIV/AIDS and MSM populations, the most representative hidden populations in China, we collected data from 36 HIV-related, MSM-related, and news-related sub-communities of Baidu Tieba, the largest Chinese online community, and analyzed the online activity patterns of hidden populations along the time, text, emotion, network, community, and other dimensions. Through cooperation with Peking University, we validated the feasibility and practical value of online data mining for analyzing hidden populations.

This study found that the online activities of the HIV population and the MSM population show distinct characteristics. The online activity of the HIV population is more regular, and the topics discussed are mostly about HIV/AIDS. The online HIV population also pays attention to other HIV/AIDS-themed communities. In comparison, the MSM population is active later in the day, and its online purposes are mostly related to making friends. Moreover, the MSM population's following of MSM-themed bars shows a strong preference, with a few bars having most of the followers.

Further analysis of the HIV/AIDS-related communities found that the average similarity of users in a community is positively correlated with the network efficiency of the community's interaction network; that is, the more the users interact, the more similar the topics between them. Moreover, the proportion of negative users in the HIV communities is mostly about 60%, and negative emotions dominate these communities.

In this study, we visualized the characteristics of the online activities of the HIV population and the MSM population from multiple perspectives by mining the data in HIV-related and MSM-related communities. Furthermore, we demonstrated the feasibility of mining the multidimensional features of hidden populations by analyzing online data in network communities, which is a valuable complement to traditional research methods for hidden populations.
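As a rough illustration of the two quantities compared above, the sketch below computes a network efficiency and an average user similarity for one community. The abstract does not state which definitions the thesis uses, so this assumes the common global efficiency measure (the mean inverse shortest-path length over node pairs) and cosine similarity over per-user term counts. The function names and the input layout are hypothetical.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def global_efficiency(adj):
    """Mean of 1/d(u, v) over ordered node pairs; unreachable pairs add 0.

    `adj` is an assumed layout: an undirected adjacency dict in which every
    node appears as a key mapping to the set of its neighbours.
    """
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for src in adj:
        # Breadth-first search gives shortest hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != src)
    return total / (n * (n - 1))


def cosine(a, b):
    """Cosine similarity of two sparse term-count dicts."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_similarity(user_terms):
    """Mean pairwise cosine similarity over all users in one community."""
    pairs = list(combinations(user_terms.values(), 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Toy community: three users who all reply to one another.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
terms = {
    "a": {"hiv": 2, "test": 1},
    "b": {"hiv": 1, "test": 1},
    "c": {"hiv": 1},
}
print(global_efficiency(triangle), average_similarity(terms))
```

Computing both values for each community and then correlating them across communities would give a relationship of the kind reported above, under these assumed definitions.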
Keywords/Search Tags: hidden population, Baidu Tieba, AIDS, MSM, online features, complex network