Research On Text-based CAPTCHAs

Posted on:2020-11-19

Degree:Master

Type:Thesis

Country:China

Candidate:C X Tian

Full Text:PDF

GTID:2428330590982226

Subject:Computer technology

Abstract/Summary:

CAPTCHAs are widely used in website login and registration to strengthen authentication and prevent automated attacks by computer programs. Text-based CAPTCHAs are used by most mainstream websites because of their large password space and simple interaction mode. To make automatic recognition by computer programs harder, current text-based CAPTCHAs generally use a random combination of security features, such as complex interfering backgrounds and warped, rotated, and overlapping characters. Because multiple security features are combined, traditional CAPTCHA identification methods achieve very low recognition rates or fail entirely. To address this challenge, we propose a de-interference method based on Generative Adversarial Networks (GAN) that generates interference-free CAPTCHAs, and then design three identification schemes according to the characteristics of the CAPTCHAs. The three identification schemes are summarized as follows:

(1) For hollow-character CAPTCHAs, the de-interference method turns the hollow characters into solid characters and stretches the character spacing. Based on this observation, we propose a GAN-based segmentation identification method that segments the stretched CAPTCHAs effectively; each segmented character is then identified by a Convolutional Neural Network (CNN).

(2) For solid-character CAPTCHAs, after applying the de-interference method, we propose a transfer-learning-based identification method. First, we generate a large number of synthetic CAPTCHAs based on the text distribution features and use them as training samples to train a CNN model. Next, we use a small number of real CAPTCHAs to fine-tune the pre-trained model. During transfer, the parameters of the first two layers of the pre-trained model remain unchanged, and the parameters of the other layers are updated by backpropagation. Finally, the transferred model is used to predict the real CAPTCHAs.

(3) For CAPTCHAs whose text consists of solid characters spliced together from common word fragments, after applying the de-interference method, we propose an identification method based on a correction model. First, we train a CNN model on synthetic CAPTCHAs and use this recognition model to predict the text of a small number of real CAPTCHAs. Then, the predicted results and the true results are used to train a correction model with a spelling correction method from the Natural Language Processing (NLP) domain. Finally, we use the correction model to correct the results predicted by the identification model.

In addition, because it is difficult to obtain a large number of real CAPTCHAs at low cost, this paper designs a program that simulates these real CAPTCHAs for network training; the training cost is far lower than that of other existing methods, while the training effect is the same. Extensive experimental results demonstrate that the proposed method can successfully identify CAPTCHAs collected from well-known websites such as Microsoft, Wikipedia, Baidu, Alipay, and Sina. In the best case, the recognition accuracy of our method is 63.7% higher than that of traditional methods.
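
The spelling correction step in scheme (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the details of the correction method, so this sketch assumes a simple substitution-only corrector that learns character confusions from (predicted, real) pairs and checks candidates against a vocabulary of known word fragments. The function names `learn_confusions` and `correct`, the sample pairs, and the vocabulary are all hypothetical.

```python
from collections import Counter


def learn_confusions(pairs):
    """Count how often the recognizer outputs character p when the truth is t.

    Assumption: only same-length pairs are used, so only substitutions
    (not insertions or deletions) are modeled in this sketch.
    """
    confusions = Counter()
    for predicted, truth in pairs:
        if len(predicted) != len(truth):
            continue
        for p, t in zip(predicted, truth):
            if p != t:
                confusions[(p, t)] += 1
    return confusions


def correct(predicted, vocabulary, confusions):
    """Return the best single-substitution correction found in the vocabulary.

    A candidate is scored by (confusion count) * (vocabulary frequency).
    If the prediction is already a known fragment, or no candidate is a
    known fragment, the prediction is returned unchanged.
    """
    if predicted in vocabulary:
        return predicted
    best, best_score = predicted, 0
    for i, ch in enumerate(predicted):
        for (p, t), count in confusions.items():
            if p != ch:
                continue
            candidate = predicted[:i] + t + predicted[i + 1:]
            score = count * vocabulary.get(candidate, 0)
            if score > best_score:
                best, best_score = candidate, score
    return best


# Hypothetical (predicted, real) pairs from a small set of real CAPTCHAs.
pairs = [("c0de", "code"), ("w0rd", "word"), ("1ock", "lock")]
# Hypothetical vocabulary of common word fragments with frequencies.
vocabulary = Counter({"code": 5, "word": 3, "lock": 2, "mode": 1})

confusions = learn_confusions(pairs)
print(correct("m0de", vocabulary, confusions))  # prints "mode"
```

The sketch corrects at most one character per prediction; a fuller corrector would also handle insertions, deletions, and multiple errors.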

Keywords/Search Tags:

Text-based CAPTCHAs, CAPTCHAs identification, Generative adversarial networks, Transfer Learning

Related items

1	Research On Text CAPTCHAs’ Security Based On Adversarial Analysis
2	Design And Security Analysis Of New CAPTCHAs Based On Image Style Transfer
3	Research And Implementation Of The Key Technology On Complex Text-based CAPTCHAs Automatic Recognition
4	Research On Recognition Of Clicked Chinese Captchas Based On YOLO V2
5	Enhancing cyber security through the use of synthetic handwritten CAPTCHAs
6	Research Of Text-Based Captcha Attack Model On Skeleton Points Segmentation Algorithm
7	An End-to-end Attack On Text-based CAPTCHAs
8	Research On Captchas Recognition Based On Convolutional Neural Network
9	Improvement And Application Of Generative Adversarial Networks Algorithm Based On Transfer Learning
10	Cross Domain Person Re-identification Based On Generative Adversarial Network