| Personalized dialogue generation is an important branch of open-domain dialogue generation. It has high research value for improving the quality of responses generated by open-domain dialogue systems, and it has a wide range of application scenarios in fields such as intelligent assistants and intelligent customer service. Although significant progress has been made in research on open-domain dialogue generation, some important issues remain. First, existing datasets are small in scale and costly to construct, so they neither reflect the full range of personalized phenomena nor allow models to be trained sufficiently. Second, existing personalized dialogue models often fail to maintain their personalized information throughout a conversation, producing inconsistent responses that greatly degrade the user experience. This thesis focuses on these issues and, based on a review of existing research, conducts work in three areas: data collection, model design, and system implementation. The main contributions are as follows. First, a data collection platform based on social games is constructed, which collects data by recruiting volunteers to join the game and create characters for dialogue. On the one hand, natural dialogue data can be obtained under different character personality settings; on the other hand, the cost of data collection is greatly reduced. The collected data are then annotated with personalized information: each utterance in the dataset is annotated with the personalized information it directly reflects, which previous datasets do not provide. This provides more data resources for training personalized dialogue models. Second, a Two Stage Response generating model with Persona (TSRP), based on a pretrained language model, is proposed. The model consists of a Speaker Persona Predictor (SPP), a personalized information compressor, and a response generator. The speaker persona predictor predicts the personalized information that needs to be presented when generating a response. The personalized information compressor compresses all of the character’s personalized information. The response generator processes the context and the outputs of the first two modules to generate fluent, natural, and personalized responses that match the character’s personalized information. Automatic and manual evaluations on multiple datasets show that the proposed model outperforms the strong baseline model in terms of perplexity, consistency, and relevance. Finally, a virtual character social game system based on personalized dialogue is implemented. Users can converse with a large number of virtual characters in the system. The virtual characters use the TSRP model proposed in this thesis to generate personalized responses, giving users a richly diverse virtual social experience. |