Research And Implementation Of A Local Checkpoint Mechanism In On-board Computing

Posted on: 2012-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Zhang
GTID: 2212330362960510
Subject: Computer Science and Technology
Abstract/Summary:
As space applications develop, on-board computing faces increasingly strict requirements on real-time performance and reliability. An on-board computer operates in an environment of strong cosmic radiation, where it is vulnerable to high-energy charged particles and exposed to Single Event Effects (SEE), of which the Single Event Upset (SEU) is the most common and widespread. An SEU flips the logic state of an electronic unit, which can corrupt the computation or execution of a task and even lead to a system crash. However, an SEU is a transient fault, and it can be eliminated by rewriting the correct value into the faulty bit.
Conventionally, an on-board system handles an SEU by reloading the task or rebooting. Both methods amount to re-executing the application task, which not only consumes system resources but also violates the task's real-time constraints. A fault processing mechanism that ensures both real-time performance and high reliability is therefore a necessity for future on-board computers.
Based on an in-depth analysis of the need for reliability in on-board computing, and on a comprehensive study of current checkpoint and fault tolerance techniques, especially software hardening techniques implemented through software-based fault tolerance, this thesis proposes a local checkpoint model (LCM), which places greater emphasis on real-time demands, to handle transient faults in on-board tasks. Compared with traditional checkpoint techniques, LCM has two advantages: first, segmental rollback optimizes the content and storage of checkpoints; second, checkpoints are placed under the user's guidance, which increases flexibility and ensures that the segmentation is reasonable. Based on LCM, a local checkpoint mechanism (LCMech) is then implemented on the VxWorks operating system.
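The idea of segmental rollback with user-placed checkpoints can be illustrated with a minimal sketch. It is not the thesis's VxWorks implementation: the names `TransientFault` and `run_segments`, the retry limit, and the convention that each segment declares the state keys it modifies are all assumptions made for illustration. The point it shows is that a checkpoint only needs to hold the data the current segment touches, and that a fault re-executes one segment rather than the whole task.

```python
import copy


class TransientFault(Exception):
    """Hypothetical signal that a transient (SEU-like) fault was detected."""


def run_segments(segments, state, max_retries=3):
    """Run user-defined segments in order with a local checkpoint per segment.

    `segments` is a list of (work, keys) pairs: `work(state)` mutates the
    shared state dict, and `keys` names the entries it may modify. Both the
    pairing and the dict-based state are assumptions of this sketch.
    """
    for work, keys in segments:
        # Local checkpoint: save only the entries this segment may change.
        saved = {k: copy.deepcopy(state[k]) for k in keys if k in state}
        for _ in range(max_retries + 1):
            try:
                work(state)
                break  # segment finished; its checkpoint can be discarded
            except TransientFault:
                # Segmental rollback: restore this segment's entries only.
                for k in keys:
                    if k in saved:
                        state[k] = copy.deepcopy(saved[k])
                    else:
                        state.pop(k, None)
        else:
            # Retries exhausted: the fault is probably not transient.
            raise RuntimeError("segment kept failing after rollback")
    return state
```

Because segment boundaries are supplied by the caller, the user decides where a rollback may restart, which mirrors the user-guided checkpoint placement described above.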
The results of fault injection experiments show that LCMech performs considerably better in execution efficiency. For a task that processes a large amount of data, the execution time with LCMech is 30% to 50% shorter than with other mechanisms, and the space consumption is more than one order of magnitude smaller than that of a traditional checkpoint mechanism.
Finally, a hierarchical fault processing mechanism on VxWorks is designed with LCMech as its core. This mechanism overcomes the limitation that LCMech can only handle transient faults, greatly improving the fault tolerance of on-board systems. Simulation experiments verify that the hierarchical fault processing mechanism based on LCMech effectively handles most kinds of SEE and greatly increases the reliability of on-board computers.
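The abstract does not spell out the levels of the hierarchy, so the following is only a sketch of one plausible shape: try the cheapest recovery first and escalate when it does not clear the fault. The level names (`segment_rollback`, `task_reload`), the `Fault` exception, and the function `run_with_escalation` are hypothetical and are not taken from the thesis.

```python
class Fault(Exception):
    """Hypothetical signal that the task detected a fault."""


def run_with_escalation(task, levels):
    """Run `task`; on a fault, apply recovery levels from cheapest to costliest.

    `levels` is a list of (name, attempts, recover) tuples, an assumed
    structure for this sketch. Returns the task result and the list of
    recovery actions that were applied.
    """
    applied = []
    try:
        return task(), applied
    except Fault:
        pass  # fall through to the recovery hierarchy
    for name, attempts, recover in levels:
        for _ in range(attempts):
            recover()             # e.g. roll back a segment, reload the task
            applied.append(name)
            try:
                return task(), applied
            except Fault:
                continue          # same level again, or escalate when exhausted
    raise RuntimeError("all recovery levels exhausted")
```

With local rollback as the first level, a transient fault is cleared cheaply, while a fault that survives rollback is passed on to a heavier action instead of being retried indefinitely.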
Keywords/Search Tags: On-board Computing, Transient Fault, Fault Processing, Local Checkpoint, Segmental Rollback, VxWorks