People's average bedtime is shifting later and later, average effective sleep time falls short of sleep needs, and the proportion of people with sleep disorders keeps rising, so sleep health is attracting the attention of many researchers. Sleep staging based on physiological signals recorded during sleep is the foundation of sleep quality assessment and sleep disorder diagnosis. Manual scoring of sleep epochs is laborious, and the effectiveness of traditional machine learning methods depends too heavily on hand-crafted features, so building end-to-end automatic sleep staging methods with deep learning has become the research trend. In this paper, we design classification models for single-channel EEG signals and for multimodal physiological signals, respectively: the single-channel model helps advance portable sleep monitoring devices, while the multimodal model is valuable for clinical application. The main research content is as follows:

(1) A Dual-Branch Multi-Scale Feature Sequence Network (DMFSN) with feature channel attention is proposed for single-channel EEG signals. DMFSN first uses a dual-branch structure to extract time-frequency features within a single sleep epoch: in each branch, a multi-scale convolution group alleviates feature differences across signal samples, multi-layer convolutions with residual connections capture higher-order feature representations of the signal, and a feature channel attention mechanism further optimizes the fusion of feature channels. A bidirectional LSTM then extracts sequence-transition features across multiple sleep epochs. In the classification stage, the intra-epoch time-frequency features and the inter-epoch sequence-transition features are concatenated and fed to the classifier to obtain the final result (a structural sketch follows this abstract). Compared with the baseline model, DMFSN improves overall classification metrics on the Sleep-EDF-39 and Sleep-EDF-153 datasets by 3.2% and 4.7% in ACC and by 2.9 and 2.5 points in MF1, respectively.

(2) A Bi-Modal Waveform Segmentation Network (BMWSN) is proposed for multimodal EEG and EOG signals. BMWSN uses two "inverted V" encoder-decoder networks with separate parameters to extract higher-order features from the EEG and EOG signals, respectively. In the encoder-decoder network for each modality, the encoder stacks four blocks mixing convolutional and downsampling layers to progressively mine higher-order features of the signal, and the decoder stacks four blocks mixing convolutional and upsampling layers to restore the features to the original length step by step, yielding sample-level features for each modality. A modal attention mechanism then fuses the two feature streams with per-sample weights, and the fused sample-level waveform segmentation results are finally converted into epoch-level classification results (a second sketch follows below). Compared with other baseline methods, BMWSN improves overall ACC by 2.1% and 1.4% on the MASS-SS3 and ISRUC-S3 datasets, respectively, and MF1 by 1.8 and 1.5 points.
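
The following is a minimal PyTorch sketch of the DMFSN structure described in (1). The kernel sizes, channel widths, the squeeze-and-excitation form of the channel attention, and the 100 Hz, 30 s epoch shape are illustrative assumptions rather than the thesis's exact configuration; only the overall layout (dual multi-scale branches with residual convolutions and channel attention, a bidirectional LSTM over per-epoch features, and a classifier over the concatenated intra- and inter-epoch features) follows the description above.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Feature channel attention (assumed squeeze-and-excitation form)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, T)
        return x * self.fc(x).unsqueeze(-1)     # reweight feature channels


class MultiScaleBranch(nn.Module):
    """One branch: multi-scale convolution group, residual convs, channel attention."""
    def __init__(self, kernel_sizes, channels=32):
        super().__init__()
        # Parallel convolutions with different (odd) kernel sizes alleviate
        # feature differences between signal samples; sizes are assumptions.
        self.scales = nn.ModuleList(
            [nn.Conv1d(1, channels, k, stride=4, padding=k // 2)
             for k in kernel_sizes])
        width = channels * len(kernel_sizes)
        self.res_conv = nn.Sequential(
            nn.Conv1d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv1d(width, width, 3, padding=1))
        self.attn = ChannelAttention(width)

    def forward(self, x):                       # x: (B*L, 1, T)
        f = torch.cat([conv(x) for conv in self.scales], dim=1)
        f = torch.relu(f + self.res_conv(f))    # residual higher-order features
        return self.attn(f).mean(dim=-1)        # global average pool -> (B*L, width)


class DMFSN(nn.Module):
    def __init__(self, n_classes=5, hidden=128):
        super().__init__()
        # Small vs. large kernels target high- vs. low-frequency epoch content.
        self.branch_small = MultiScaleBranch((7, 11, 15))
        self.branch_large = MultiScaleBranch((51, 101, 151))
        feat_dim = 32 * 3 * 2                   # channels * scales * branches
        # BiLSTM models stage-transition context across neighbouring epochs.
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        # Classifier sees intra-epoch features concatenated with sequence features.
        self.classifier = nn.Linear(feat_dim + 2 * hidden, n_classes)

    def forward(self, x):                       # x: (B, L epochs, T samples)
        B, L, T = x.shape
        e = x.reshape(B * L, 1, T)
        f = torch.cat([self.branch_small(e), self.branch_large(e)], dim=1)
        f = f.view(B, L, -1)                    # per-epoch time-frequency features
        seq, _ = self.bilstm(f)                 # inter-epoch transition features
        return self.classifier(torch.cat([f, seq], dim=-1))  # (B, L, n_classes)


logits = DMFSN()(torch.randn(2, 10, 3000))      # 10 epochs of 30 s at assumed 100 Hz
print(logits.shape)                             # torch.Size([2, 10, 5])
```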
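
Similarly, the following PyTorch sketch illustrates the BMWSN layout in (2). The four downsampling and four upsampling stages match the description above; the channel widths, the softmax form of the modal attention, the logit-averaging rule for converting sample-level segmentation into epoch-level classes, and the assumed 128 Hz sampling rate (so a 30 s epoch of 3840 samples divides evenly by 2^4) are illustrative assumptions.

```python
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv1d(c_in, c_out, 5, padding=2),
                         nn.BatchNorm1d(c_out), nn.ReLU())


class EncoderDecoder(nn.Module):
    """'Inverted V' per-modality network: four (conv + downsample) stages
    followed by four (conv + upsample) stages restoring the input length."""
    def __init__(self, width=16):
        super().__init__()
        w = width
        enc = [(1, w), (w, 2 * w), (2 * w, 4 * w), (4 * w, 8 * w)]
        dec = [(8 * w, 4 * w), (4 * w, 2 * w), (2 * w, w), (w, w)]
        self.down = nn.ModuleList(
            [nn.Sequential(conv_block(i, o), nn.MaxPool1d(2)) for i, o in enc])
        self.up = nn.ModuleList(
            [nn.Sequential(nn.Upsample(scale_factor=2), conv_block(i, o))
             for i, o in dec])

    def forward(self, x):                       # x: (B, 1, T), T divisible by 16
        for blk in self.down:
            x = blk(x)                          # progressively mine higher-order features
        for blk in self.up:
            x = blk(x)                          # restore the original temporal length
        return x                                # sample-level features (B, width, T)


class BMWSN(nn.Module):
    def __init__(self, n_classes=5, width=16, epoch_len=3840):
        super().__init__()
        self.epoch_len = epoch_len
        # Separate encoder-decoder parameters for each modality.
        self.eeg_net = EncoderDecoder(width)
        self.eog_net = EncoderDecoder(width)
        # Modal attention: per-sample softmax weights over the two modalities.
        self.modal_attn = nn.Conv1d(2 * width, 2, kernel_size=1)
        self.seg_head = nn.Conv1d(width, n_classes, kernel_size=1)

    def forward(self, eeg, eog):                # each: (B, 1, T), T % epoch_len == 0
        f_eeg, f_eog = self.eeg_net(eeg), self.eog_net(eog)    # (B, W, T)
        w = torch.softmax(self.modal_attn(torch.cat([f_eeg, f_eog], 1)), dim=1)
        fused = w[:, :1] * f_eeg + w[:, 1:] * f_eog            # weighted fusion
        sample_logits = self.seg_head(fused)                   # (B, C, T) per-sample labels
        B, C, T = sample_logits.shape
        # Convert the sample-level segmentation into epoch-level classes by
        # averaging logits over each 30 s epoch (one assumed conversion rule).
        return sample_logits.view(B, C, T // self.epoch_len,
                                  self.epoch_len).mean(-1).transpose(1, 2)


eeg = torch.randn(2, 1, 2 * 3840)               # two 30 s epochs at an assumed 128 Hz
eog = torch.randn(2, 1, 2 * 3840)
print(BMWSN()(eeg, eog).shape)                  # torch.Size([2, 2, 5])
```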