
Research On Humor Text Recognition And Generation Based On Deep Learning

Posted on: 2023-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: H Cao
GTID: 2568306842468764
Subject: Computer application technology
Abstract/Summary:
Humor is an indispensable part of human life. As a tool of social communication, humor helps people break down barriers, resolve embarrassment in conversation, and build good interpersonal relationships. With the development of machine learning and deep learning, natural language processing research has made great progress in academia and has also brought emotionally aware products such as "Xiao Ai" and "Microsoft Xiaobing" into people's daily lives. Giving computers the ability to understand humor would be a further step toward artificial intelligence, so humor computing has become a promising direction in the field. In this paper, Chinese humor corpora for humor recognition and humor generation tasks are constructed. The main contributions of this paper are as follows:

(1) A data annotation process based on Chinese humor theory is proposed to address the lack of Chinese humor datasets. Through corpus collection, screening and labeling, analysis, and application, a Chinese humor corpus, Humordata, is constructed. The dataset contains 35,667 humor texts with rich humor types and obvious laugh points, and can be used for general Chinese humor generation tasks. Two further datasets, Aindata and AMQ-GANdata, are constructed for different types of humor recognition tasks. Aindata is derived from user dialogues on social media and can be used for humor recognition in daily dialogue, with a total of 43,922 texts, including 25,605 humor texts. In AMQ-GANdata, humor and non-humor texts are closely related semantically, which makes them hard to tell apart; it can be used to test the generalization ability of a humor recognition model, with a total of 6145 texts, including 30727 humor texts. The humor texts in these datasets are short and easy to understand, and can provide data support for different types of Chinese humor computing tasks.

(2) A multi-feature fusion humor text recognition model based on BERT-TextCNN (MFF) is proposed for the humor recognition task of judging whether a text is humorous. Word vector representations that are closer to the attributes of humor texts are obtained by fine-tuning different BERT pre-trained models. A TextCNN network improves the model's ability to extract humor features. Recognition is further improved by fusing the global and local features of humor and learning contextual information. The experimental results show that the model outperforms the current best method by 2.54% on the self-constructed Chinese humor dataset and by 2.67% on the public Chinese humor text dataset.

(3) A new humor text generation task, the automatic generation of Anti-Motivational Quotes, is proposed to address the problem of open-ended short humor text generation. For this task, a humor text generation model based on a generative adversarial network is proposed (Anti-Motivational Quotes GAN, AMQ-GAN). Introducing humor templates and contrastive learning lets the model make fuller use of the pre-trained model and improves its ability to learn the characteristics of humor on its own. Adding a semantic feature discriminator and a humor feature discriminator to the generative adversarial network allows the model to better satisfy the topic constraints of the text, the fluency of the writing, and the requirement of humor. In addition, according to the characteristics of the humor generation task, a manual evaluation index covering semantic fluency, theme consistency, and humor level is proposed. The experimental results show that the model is better than the current best methods, with 30.4% on the BLEU metric and 20.6% on the manually evaluated humorous text ratio.
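The two generation metrics named above can be illustrated with a minimal sketch. The abstract does not state how BLEU was configured, so everything here is an assumption made for illustration: a single reference, character-level tokens (a common choice for Chinese), n-grams up to order 2, uniform weights, and no smoothing. The function names `bleu` and `humorous_ratio` are hypothetical, and `humorous_ratio` simply treats each manual judgment as a 0/1 label.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=2):
    """Single-reference, unsmoothed BLEU with uniform n-gram weights.
    max_n=2 is an illustrative assumption, not the thesis setting."""
    if not candidate:
        return 0.0
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1.0 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)


def humorous_ratio(labels):
    """Share of generated texts that annotators marked as humorous (label 1)."""
    return sum(labels) / len(labels) if labels else 0.0
```

For Chinese text, `bleu(list(generated), list(reference))` scores at the character level; a word segmenter could be substituted if word-level n-grams are preferred.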
Keywords/Search Tags: Chinese Humor Corpus, Humor Generation, Humor Recognition