| In recent years, online social networks (OSNs) such as Facebook, Twitter, and Sina Weibo have become extremely popular among Internet users. Unfortunately, attackers also use them to conceal malicious attacks. Because detecting malicious URLs in OSNs is important, OSN operators, security companies, and academic researchers have offered multiple solutions. Detection work mainly covers two aspects: malicious URLs and spammers. For URL detection, most of these solutions use machine learning methods to train classification models on different kinds of feature sets. However, most are ineffective because the features they select are conventional. For spam detection, existing anti-spam methods examine only a single message or account per detection, and most algorithms rely on the connection relations among accounts in various social graphs. Such an approach may result in repeated detection of the same account. Further, it is difficult to effectively identify and completely clean hidden spam crowds this way. Therefore, it is necessary to merge the spam messages sent from the same hidden crowd and then clean the accounts in that crowd at one time rather than individually. In this paper, for malicious URL detection, we focus on forwarding-based features because of the special connections between forwarding behavior and the propagation of malicious URLs. We evaluate the system using about 100,000 original messages collected from Sina Weibo, the largest OSN website in China. The accuracy is about 83.21%, and the false positive rate is about 10.3%. A comparison of the results shows the effectiveness of forwarding-based features in detecting malicious URLs. For spammer detection, we introduce a forwarding message tree to combine accounts based on the relations among the messages they send. 
This approach is intended to clearly expose the inner relations among hidden spam crowds and makes it convenient to delete spammers crowd by crowd. First, we analyze the forwarding tree and discover six effective features based on the forwarding layer relation, propagation scope, repeated forwarding behavior, propagation speed, and the average weight of the tree. Next, we illustrate the effectiveness of these features by incorporating them into machine learning algorithms. In an evaluation on a real dataset collected from Sina Weibo, the accuracy is approximately 95.3% and the false positive rate is approximately 0.5%. Finally, we also produce a gain ranking of the features, in which most of the discovered features appear at the top. To the best of our knowledge, this work is the first to analyze forwarding-based features and to introduce a forwarding message tree in OSNs, and it offers a valuable contribution to this area of research. |
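
To make the forwarding message tree more concrete, the following is a minimal Python sketch of how such a tree might be built from forwarding records and summarized into a few features. The abstract does not define the six features precisely, so everything here is an assumption made for illustration: the `Message` record, the `tree_features` function, and the specific definitions of depth, scope, repeated forwarding, and speed are hypothetical, and the average tree weight is omitted because its definition is not given.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical forwarding record; field names are assumptions."""
    msg_id: str
    parent_id: Optional[str]  # None marks the original (root) message
    account: str
    timestamp: float          # seconds since an arbitrary epoch


def tree_features(messages):
    """Summarize one forwarding tree with illustrative (assumed) features."""
    roots = [m for m in messages if m.parent_id is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one original message")
    root = roots[0]

    # Map each message to the messages that forwarded it.
    children = defaultdict(list)
    for m in messages:
        if m.parent_id is not None:
            children[m.parent_id].append(m.msg_id)

    # Iterative depth-first walk to find the deepest forwarding layer.
    depth = 0
    stack = [(root.msg_id, 0)]
    while stack:
        node, level = stack.pop()
        depth = max(depth, level)
        for child in children.get(node, []):
            stack.append((child, level + 1))

    forwards = [m for m in messages if m.parent_id is not None]
    per_account = Counter(m.account for m in forwards)

    # Elapsed time from the original message to the last forward.
    span = max(m.timestamp for m in messages) - root.timestamp

    return {
        "depth": depth,                               # forwarding layers
        "scope": len(per_account),                    # distinct forwarders
        "repeat_accounts": sum(1 for n in per_account.values() if n > 1),
        "forwards_per_hour": len(forwards) / (span / 3600) if span > 0 else 0.0,
    }
```

A feature dictionary of this kind, computed once per tree, could then be passed to a standard classifier so that all accounts in a tree are judged together rather than one at a time.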