Traditional Chinese Medicine (TCM) classics are the essence of TCM, and text vectorization is fundamental to TCM text-processing tasks. High-quality vector representations rich in feature information guarantee, at the source, the quality of entity recognition and other downstream tasks in the TCM field, which is important for the intelligent learning and application of TCM texts. The deep pre-trained representation model BERT generates vector representations rich in semantic and syntactic information by stacking multiple layers of feature extractors, but its large parameter size in turn demands a large amount of training data. Meanwhile, although the shallow neural network model CBOW has a simple structure, it treats all words in a sentence equally, ignoring both the semantic information carried by different sentence components and the word-order information inherent to the sentence. To construct a lightweight word representation model with low computational complexity whose word vectors retain the rich feature information of TCM text, the main work is as follows:

1) Based on the "verb-core structure" theory, the linguistic logic of TCM texts is studied at both the sentence-pattern and word levels. Exploiting the fixed sentence structures centered on verbs and the differing grammatical logic within sentences, nine verb-centered sentence-meaning representation rules and classification criteria for words with different grammatical logic are formulated, enhancing the word representation model's ability to extract semantic features from TCM texts.

2) To address the weakness of the shallow word representation model CBOW in extracting semantic and syntactic features of text, an enhanced word representation model is proposed that builds on the semantic logic rules and incorporates syntactic logic such as part of speech and word order. For verb-centered words, sentence-level semantic information is extracted by matching sentence structures against the syntactic representation rules; for non-verb-centered words, the differing semantic contributions of words to the sentence meaning are used to strengthen the role of strongly syntactic-logical words during word vector generation. Word-order features are then extracted by convolution operations. Synonym, antonym, and analogy word lists are introduced at the word vector generation stage to further improve how well the word vectors capture the relevant semantic information.

3) Several groups of experiments were conducted, covering both intrinsic similarity analysis and extrinsic quantitative comparison. The results show that the proposed logic-rule-enhanced model achieves better performance in both semantic similarity analysis and entity recognition. In the entity recognition task, the F1 score improves by 4.66 percentage points over the traditional CBOW model. The model is a lightweight word representation model: it reduces training time by 51% compared with BERT and has clear advantages in resource usage.

26 figures; 16 tables; 63 references.
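The enhanced model described in point 2) could be sketched roughly as follows: instead of averaging context embeddings equally as in standard CBOW, each context word gets a learnable "syntactic logic" weight (so strongly logical words such as verbs contribute more), and a 1-D convolution over the ordered context embeddings captures word-order features. This is a minimal illustrative sketch under assumed dimensions and module names (`LogicWeightedCBOW`, `logic_weight`, etc. are hypothetical), not the thesis's actual implementation:

```python
import torch
import torch.nn as nn

class LogicWeightedCBOW(nn.Module):
    """Sketch of a CBOW variant that (a) weights context words by a
    per-word syntactic-logic score rather than averaging them equally,
    and (b) convolves over the ordered context to extract word-order
    features. All hyperparameters here are illustrative assumptions."""

    def __init__(self, vocab_size, emb_dim=64, conv_channels=64, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Hypothetical learnable per-word weight (e.g. larger for verbs
        # and other strongly syntactic-logical words).
        self.logic_weight = nn.Embedding(vocab_size, 1)
        # 1-D convolution over the embedding sequence for word order.
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel, padding=kernel // 2)
        self.out = nn.Linear(emb_dim + conv_channels, vocab_size)

    def forward(self, context_ids):
        # context_ids: (batch, context_len), in sentence order.
        e = self.emb(context_ids)                                 # (B, L, D)
        # Softmax-normalized logic weights replace CBOW's uniform average.
        w = torch.softmax(self.logic_weight(context_ids).squeeze(-1), dim=-1)
        weighted = (w.unsqueeze(-1) * e).sum(dim=1)               # (B, D)
        # Max-pooled convolution output as a word-order feature vector.
        order = self.conv(e.transpose(1, 2)).max(dim=-1).values   # (B, C)
        # Predict the center word from both feature vectors.
        return self.out(torch.cat([weighted, order], dim=-1))     # (B, vocab)

model = LogicWeightedCBOW(vocab_size=100)
logits = model(torch.tensor([[3, 17, 42, 9]]))  # one 4-word context window
print(logits.shape)  # torch.Size([1, 100])
```

Training would then proceed as in ordinary CBOW (cross-entropy against the center word), so the model stays shallow and cheap relative to BERT while the logic weights and convolution inject the syntactic information that plain CBOW discards.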