| With the rapid growth of economic globalization and civil aviation in China, the significant increase in air traffic volume has led to problems such as congested airspace and mismatched capacity, posing great challenges to air traffic control (ATC) safety. The traditional "human-centric" ATC model is becoming inadequate for the rapidly growing demand for air transportation. There is an urgent need to introduce new technologies that raise the level of ATC automation and promote the shift from "human control" to "automated control" in the ATC domain, thus enhancing both safety and efficiency. Accurate situational awareness of the airspace is the fundamental prerequisite for safe and efficient ATC work. Hence, research into intelligent situational awareness methods is a key component in advancing "automated control". However, due to the dynamic and complex nature of the ATC system, situational awareness involves the processing and fusion of multi-modal, heterogeneous data. Existing ATC automation systems face challenges in processing, integrating, and fusing multi-modal situational data, leaving a substantial gap in intelligent situational awareness research tailored to ATC scenarios. Based on an analysis of the critical elements of situational awareness in the ATC domain, this work focuses on the intelligent processing of two main determinants of ATC situations: radiotelephony communications and flight trajectories. It systematically studies radiotelephony communication-based control decision perception and flight trajectory perception by addressing three key technological challenges: mining and leveraging ATC prior knowledge, extracting information from unstructured ATC data, and data representation with multi-modal fusion. This research aims to provide technical support for intelligent applications such as situational awareness enhancement for air traffic controllers, decision-making
assistance systems, and safety monitoring. In summary, the main contributions of this work are as follows: 1. A contextual automatic speech recognition (ASR) model for the ATC domain, called CATCNet, is proposed to improve the recognition accuracy of critical named entities. Named entities commonly used in radiotelephony communications, derived from context, are introduced as prior knowledge and incorporated into CATCNet. A contextual attention mechanism is proposed to integrate contextual knowledge and acoustic features in a deep fusion manner, enabling the model to bias its output probabilities according to the contextual knowledge. Additionally, to cope with domain specificities such as high speech rates and unstable noise, a dynamic convolution-based acoustic feature extraction module is proposed to enhance the robustness and generalization of the model. Experimental results on real-world ATC speech corpora demonstrate that CATCNet accurately recognizes key elements and significantly improves the recognition accuracy of named entities in ATC speech compared with traditional methods, achieving a 3.90% character error rate and 86.54% instruction accuracy. 2. A radiotelephony dialogue tracking framework based on speaker role identification (SRI) is proposed to address the challenges of dialogue management in "multi-person and multi-turn" communication environments. Specifically, we formulate the SRI task as a classification problem and systematically investigate it using text, speech, and combined speech-text modalities. Most importantly, a multi-modal SRI model, called MMSRINet, is introduced to enhance model robustness and scene adaptation by leveraging complementary features between speech and text data. Experimental results on real-world datasets demonstrate that MMSRINet achieves 98.56% SRI accuracy and strong robustness, even on previously unseen communication channels. Furthermore, the proposed dialogue tracking
method is experimentally verified on real data, achieving a dialogue tracking accuracy of 91.13%. 3. To overcome the limitations of conventional flight trajectory prediction (FTP) models caused by low-dimensional representations and normalization-dependent processing of trajectory data, a binary encoding representation-based FTP framework, called FlightBERT, is proposed. By encoding decimal-valued trajectory attributes into binary vectors, FlightBERT expands the input dimensionality while removing the model's reliance on data normalization of its inputs. Additionally, the FTP task is recast as a multi-binary classification problem instead of the traditional regression problem. An attribute correlation attention module, based on kinematic prior knowledge, is introduced to enhance the performance of FlightBERT by mining the inherent correlations among different attribute sequences. Comparative validation against traditional FTP methods on a large-scale real-world dataset demonstrates the significant performance improvements achieved by FlightBERT. At a time resolution of 20 seconds, the mean absolute errors in latitude, longitude, and altitude are reduced to 0.0039°, 0.0033°, and 13.6 meters, respectively. 4. An enhanced non-autoregressive multi-horizon FTP framework, FlightBERT++, is proposed to address error accumulation and inadequate inference efficiency in multi-horizon FTP tasks while overcoming the outlier problem in FlightBERT. A generalized encoder-decoder architecture is introduced to implement FlightBERT++. In addition, a horizon-aware context generator (HACG) is designed to generate multi-horizon contexts in a single inference pass, leveraging prior knowledge about prediction horizons to mitigate the large error accumulation of iterative/autoregressive multi-horizon FTP. Furthermore, with the design of the HACG and the incorporation of Transformer blocks for temporal
modeling, the model is able to perform multi-horizon FTP in a non-autoregressive manner. Experimental results indicate that FlightBERT++ achieves significantly higher accuracy and inference efficiency in multi-horizon FTP tasks than competitive baseline methods. In the multi-time-step inference scenario over a 5-minute duration, the mean distance error is reduced to 1.97 kilometers, with a mean time cost of only 6.81 milliseconds. 5. An FTP framework, called SIA-FTP, is proposed to address the decreased prediction accuracy and delayed response of traditional methods in high-maneuver flight scenarios driven by ATC instructions. SIA-FTP integrates text instructions into the FTP model, enabling the model to perceive flight intentions from ATC instructions and achieve high-precision FTP. To bridge the heterogeneity gap between trajectories and instruction text, a three-stage multi-modal fusion learning paradigm is designed, constructing a multi-modal FTP model through pre-training and joint training. To extract intent embeddings from textual instructions, a BERT pre-training mechanism is introduced to obtain universal text representations in an unsupervised manner. Additionally, a multi-label classification-based instruction intent identification method is designed to further guide the model in extracting high-dimensional, task-specific features. Experimental results show that SIA-FTP accurately predicts flight trajectories from ATC instructions in high-maneuver flight scenarios, reaching a mean distance error of 2.26 kilometers and providing strong support for applications such as conflict detection and decision-making. In conclusion, this work constructs intelligent processing methods for radiotelephony communications, flight trajectories, and other data. The effectiveness of the proposed methods is validated through experiments, providing robust technical support and a practical basis for automatic real-time
situational awareness tasks in the ATC domain. |
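
The binary encoding idea in contribution 3 can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names `encode_attribute` and `decode_attribute`, the fixed resolution, the bit width, and the offset are all hypothetical choices made for illustration. The sketch quantizes a decimal-valued attribute to an integer level, writes that level as a fixed-width bit vector, and decodes a vector of per-bit probabilities (as a multi-binary classifier might output) by thresholding each bit.

```python
def encode_attribute(value, resolution, n_bits, offset=0.0):
    """Quantize a decimal value and return its bits, most significant first.

    resolution, n_bits, and offset are illustrative assumptions, not values
    taken from the source text.
    """
    level = round((value + offset) / resolution)
    if not 0 <= level < 2 ** n_bits:
        raise ValueError("value outside the representable range")
    return [(level >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def decode_attribute(bit_probs, resolution, offset=0.0, threshold=0.5):
    """Threshold per-bit probabilities and rebuild the decimal value."""
    level = 0
    for prob in bit_probs:
        level = (level << 1) | (1 if prob >= threshold else 0)
    return level * resolution - offset


# Latitude in degrees, shifted by 90 so the quantized level is non-negative.
bits = encode_attribute(30.123, resolution=0.001, n_bits=18, offset=90.0)
restored = decode_attribute(bits, resolution=0.001, offset=90.0)
```

Because each attribute becomes a vector of 0/1 targets, no min-max or z-score normalization of the raw value is needed in this sketch. Under this toy encoding, a single wrong high-order bit shifts the decoded value by a large amount, which may be related to the outlier problem mentioned for FlightBERT.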