| In recent years, with the rapid development of Earth observation technologies, the launch of (very) high-resolution optical remote sensing satellites has produced tremendous growth in data volume, raising new challenges for the fast processing required by downstream time-critical applications such as moving target detection and natural disaster rescue. Moreover, the limited space and resources available in many military and disaster response scenarios create an urgent need for a compact, portable, high-performance processing system with tightly coupled hardware and software. Current satellite data processing systems and algorithms emphasize the quality and accuracy of the processing results without considering computational efficiency, space, or power consumption, so they cannot meet these requirements. How to use new high-performance processing techniques effectively to solve the fast processing problem for massive optical satellite data has therefore become a major scientific issue. At present, general-purpose computing hardware such as the Graphics Processing Unit (GPU) has become a major solution for real-time big data computing. The GPU offers high performance, small size, low power consumption, and tight hardware-software coupling, which makes it capable of providing high-performance computing in resource-constrained environments; it is therefore widely applied in surveying and mapping, remote sensing, and earth science. This dissertation systematically studies the theory and methods of high-performance optical satellite data processing in a CPU/GPU heterogeneous environment, including a highly efficient GPU mapping method for the bottleneck algorithms, a CPU/GPU dynamic cooperation model, and an allocation and dispatch architecture for CPU/GPU computing resources. The ultimate objective is to provide a solution for the fast processing of massive satellite remote sensing data.
The contents and innovations are as follows: 1) Highly efficient GPU mapping of optical satellite data processing algorithms. The computational load and degree of parallelism of the optical satellite data processing algorithms are first profiled. The algorithms with large computational loads and high degrees of parallelism, i.e., image restoration (with MTF compensation as an example), sensor correction (including band registration and CCD stitching), and geocorrection, are then mapped to GPUs for execution. The mapping follows the principle of gradual improvement. First, the kernel arrangement and initial settings are determined to obtain a basic GPU implementation. Then three optimization measures are applied to further improve performance: maximizing memory throughput (including assigning multiple elements to one thread, optimizing block size and shape, and hierarchical memory access), optimizing flow control instructions, and overlapping data transfer with kernel execution. 2) High-performance CPU/GPU dynamic cooperation for optical satellite data. To fully utilize the CPU cores in a CPU/GPU heterogeneous system, we summarize three cooperative approaches for CPU cores and the GPU and then, according to the characteristics of the optical satellite data processing chain, divide the CPU cores into three categories: control cores, cooperative computing cores (CCCs), and standalone computing cores (SCCs). The CCCs compute part of the workload in cooperation with the GPU. The workload should therefore be distributed appropriately between the CCCs and the GPU to achieve peak performance. We derive an equation for the workload distribution such that the CCCs and the GPU complete their respective shares in the same amount of time. The SCCs execute RPC generation in parallel with MTF compensation and band registration to hide its runtime.
Furthermore, a dynamic reallocation strategy is proposed so that the optimal performance of the processing chain can be achieved. 3) High-performance processing architecture and optimal resource dispatch for optical satellite data. To take full advantage of the various types of processing resources in a multi-CPU/GPU heterogeneous system, we propose the concept of "Atomic Computing Resources" (ACRs) and present two data processing patterns for multiple data processing missions, namely the cooperative pattern and the independent pattern, and discuss their advantages and disadvantages. An MOC (Multi-computing resource / Open Multi-Processing (OpenMP) / Compute Unified Device Architecture (CUDA))-based data processing architecture is then proposed for multi-ACR environments. In the MOC architecture, GPU mapping and CPU/GPU dynamic cooperation within an ACR are implemented with CUDA and OpenMP, while dispatch and scheduling among multiple ACRs are realized with the Portable Batch System (PBS). A PBS dispatch method for ACRs is designed, and several dispatch strategies are compared to identify the optimal resource dispatch. In addition, based on the I/O characteristics of optical satellite data processing, a three-layer data Input/Output (I/O) structure consisting of Ramdisk, Solid State Disk (SSD), and Redundant Array of Independent Disks (RAID) is designed to provide high-performance I/O when multiple images or datasets are processed. The final experimental results show that the methods proposed in this dissertation can significantly improve the efficiency of the optical satellite data processing chain and may provide near-real-time response for the time-critical applications that follow. Based on the above work, an experimental optical satellite data processing system is designed to meet the demands of practical applications.
Its design principles, hardware/software architecture, and implementation details are introduced, followed by a presentation of its use in the ZY-3 satellite emergency processing system. Future research directions include: (1) the design of an on-board high-performance processing architecture and platform for minute- or second-level fast processing of satellite processing algorithms, and the validation of the adaptability of the hierarchical high-performance architecture and resource allocation/dispatch strategy to on-board satellite processing; (2) the investigation of new features of Kepler-architecture GPUs, such as dynamic parallelism and Hyper-Q, to fully utilize their computing performance and achieve optimal mapping of various satellite data processing algorithms; (3) the application of new high-performance computing hardware, such as many-core CPUs and APUs, together with multi-core CPUs and GPUs, to the real-time processing of massive satellite remote sensing data. |
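As a minimal illustration of the workload distribution idea in contribution 2), the sketch below assumes a simple constant-throughput model in which the GPU and the CCCs each process image rows at a fixed rate. Under that assumption, both sides finish at the same time when each receives a share proportional to its rate. The function name `split_workload`, its parameters, and the model itself are assumptions made for illustration; the abstract does not give the equation actually derived in the dissertation.

```python
def split_workload(total_rows, gpu_rate, cpu_rate):
    """Split rows between the GPU and the cooperative CPU cores (CCCs).

    Assumed model (not taken from the dissertation): each side processes
    rows at a constant rate, so equal finishing times require
        gpu_rows / gpu_rate == cpu_rows / cpu_rate
    with gpu_rows + cpu_rows == total_rows.
    """
    if total_rows < 0 or gpu_rate <= 0 or cpu_rate < 0:
        raise ValueError("need total_rows >= 0, gpu_rate > 0, cpu_rate >= 0")
    # Share proportional to throughput, rounded to a whole number of rows.
    gpu_rows = round(total_rows * gpu_rate / (gpu_rate + cpu_rate))
    # The CCCs take whatever remains, so no row is lost to rounding.
    cpu_rows = total_rows - gpu_rows
    return gpu_rows, cpu_rows


if __name__ == "__main__":
    # Hypothetical rates: the GPU is nine times faster than the CCCs.
    print(split_workload(1000, gpu_rate=9.0, cpu_rate=1.0))
```

With these hypothetical rates the GPU receives 900 rows and the CCCs receive 100. In practice the rates would have to be measured, and a dynamic reallocation strategy such as the one mentioned in the abstract could update them between images.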