Research On Natural Language Generation Technology For Electronic Commerce

Posted on:2024-09-27

Degree:Master

Type:Thesis

Country:China

Candidate:L Q Jing

Full Text:PDF

GTID:2568306923952419

Subject:Computer technology

Abstract/Summary:

This thesis studies two key natural language generation tasks in e-commerce: multimodal product summary generation and stylized product data-to-text generation. The former aims to generate a short summary from the given multimodal product information (a long text description and an image) so that customers can grasp the characteristics of the product quickly. The latter aims to generate coherent text in a specified style from the given non-linguistic data, so as to suit different application scenarios. Both tasks can be applied in different scenarios on e-commerce platforms to provide high-quality natural language generation services.

For multimodal product summary generation, existing methods have achieved great success but still suffer from three key limitations: 1) they follow the conventional train-from-scratch paradigm and fail to take advantage of pre-training; 2) they focus mainly on overall output-level supervision and lack representation-level supervision; and 3) they ignore the disturbance to the model caused by the varied seller-generated product descriptions. To address these limitations, this thesis proposes a Vision-to-Prompt based multimodal product summary generation framework, dubbed V2P, which adopts a generative pre-trained language model (GPLM) as the backbone. We design V2P with two key components: vision-based prominent attribute prediction and attribute prompt-guided summary generation. The first obtains the vital semantic attributes of the product from its image using the Swin Transformer, while the second uses a GPLM to generate the summary from the product’s long text description and the attribute prompts yielded by the first component. For comprehensive supervision, we introduce representation-level regularization alongside the conventional output-level supervision. We also design a data augmentation-based robustness regularization to handle diverse inputs and improve model robustness. Extensive experiments on a large-scale Chinese dataset verify the superiority of our model over cutting-edge methods.

Considering that different application scenarios may require different styles of text, this thesis also proposes a new stylized data-to-text generation task, in contrast to the traditional general data-to-text generation task. This task is non-trivial because of three challenges: 1) how to ensure the logic of the generated text; 2) how to accurately obtain style information, free of content information, from unstructured style reference text; and 3) how to achieve unbiased stylized generation. To address these challenges, we propose a novel stylized data-to-text generation model, named StyleD2T, comprising three components: logic planning-enhanced data embedding, mask-based style embedding, and unbiased stylized text generation. First, we introduce a graph-guided logic planner for attribute organization to ensure the logic of the generated text. Second, we devise mask-based style embedding to extract the essential style signal from the given unstructured style reference. Third, pseudo triplet augmentation is used to achieve unbiased text generation, and a multi-condition-based confidence assignment function is designed to ensure the quality of the pseudo samples. A dataset of 31,728 samples was collected from Taobao for this thesis, and extensive experiments on it show the superiority of the proposed model over existing methods.
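To make the two-stage idea behind V2P more concrete, the following is a minimal sketch of how attributes predicted from the image might be turned into a text prompt and joined with the long description before being passed to a generative language model. It is not the thesis's implementation: the function names, the `top_k` and `threshold` parameters, the `[SEP]` separator, and the example scores are all assumptions made for illustration, and the score dictionary merely stands in for the output of a vision model such as the Swin Transformer.

```python
from typing import Dict, List


def select_prominent_attributes(
    scores: Dict[str, float], top_k: int = 3, threshold: float = 0.5
) -> List[str]:
    """Keep at most top_k attributes whose score meets the threshold, highest first.

    `scores` is a hypothetical stand-in for per-attribute probabilities
    produced by an image classifier.
    """
    kept = [(name, score) for name, score in scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept[:top_k]]


def build_prompted_input(
    attributes: List[str], description: str, sep: str = " [SEP] "
) -> str:
    """Prepend the attribute prompt to the description (separator is an assumption)."""
    if not attributes:
        return description
    return ", ".join(attributes) + sep + description


# Hypothetical attribute scores for one product image.
scores = {"cotton": 0.91, "long sleeve": 0.74, "striped": 0.32, "round neck": 0.66}
attrs = select_prominent_attributes(scores)
model_input = build_prompted_input(attrs, "A soft shirt for daily wear.")
print(model_input)
```

The resulting string would then serve as the input sequence of the summary generator; how the thesis actually encodes the prompts is not stated in the abstract.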

Keywords/Search Tags:

Electronic commerce, Natural Language Generation, Multimodal Product Summarization Generation, Stylized Data-to-Text Generation

Related items

1	Text Summarization Generation Research Based On Multimodal Data
2	Automatic Generation Of Agricultural Product Advertising Copy Based On Controlled Text Generation Technology
3	Research On Key Technologies Of Text Generation In Social Media
4	Research And Application Of End-to-end Chinese Text Generation
5	Research On Sequence-to-Text Inference And Generation Based On Matching And Transformation
6	Automatic Commentary Generation For Snooker Game Videos Based On Deep Learning
7	Research On Cross-Modal Natural Language Generation
8	The Research On Multi-document Summarization Generation Method Based On Text Relation Graph
9	Research On Semantic Text Exchange Method Based On Pre-trained BART Language Model
10	Research On Automatic Generation Of A-Share Review Text Based On Trend Word Discovery