
Reliability Evaluation And Fault Mitigation For Soft Errors In SRAM-based FPGAs

Posted on: 2013-05-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N F Jing
Full Text: PDF
GTID: 1228330392951894
Subject: Electronic Science and Technology
Abstract/Summary:
Computing reliability has long been one of the key problems in VLSI design and applications. With the continuous advance of VLSI manufacturing technology and the ever-increasing integration of VLSI designs, soft errors caused by radiation from high-energy particles and by internal circuit noise have begun to emerge in FPGA-based applications, where they can cause significant reliability degradation. In the pursuit of more reliable computing, making FPGA devices and systems robust against soft errors is already a challenge for both FPGA designers and developers.

A soft error is a random change of the stored state in a static memory cell. Reliability evaluation and fault mitigation for soft errors are generally acknowledged as two key problems in FPGA design. However, traditional evaluation of FPGA reliability relies on field testing of fabricated devices with accelerated particles, by which point it is too late to change the design if the reliability requirements are not met. At the same time, current methods for design-time reliability evaluation do not cover all structures in an FPGA and lack a unified metric for quantitative evaluation. To address this problem, after a detailed study of the mechanism of soft errors and a model of their behavior from the perspective of FPGA micro-architecture in chapter III, this dissertation first proposes a unified metric, criticality, to evaluate how likely a soft error is to lead to a failure. It then presents an evaluation framework dedicated to FPGA soft errors, which can analyze the effect of soft errors as early as possible in the design phase. The framework has demonstrated its practical use, for example in identifying sensitive circuit elements and in providing guidance for FPGA designs and applications against soft errors.

Based on this framework, the dissertation further investigates techniques for soft error mitigation on FPGA devices. Aiming to mitigate, at reasonable overhead, the circuit elements that cause the greatest reliability degradation, it starts from a study of how soft errors occur, propagate, and manifest. Motivated by the micro-architecture and inherent redundancy of FPGAs, this dissertation for the first time proposes two mitigation algorithms that target soft errors in routing resources and in logic resources, respectively. Unlike traditional mitigation techniques, which incur higher overhead in area, power, and performance, the proposed methods apply in-place reassignment techniques that preserve the detailed FPGA placement and routing and thus incur minimal overhead. The experimental results demonstrate significant reductions in the rate of circuit failures induced by soft errors. The methods are highly compatible with other mitigation methods and help to guarantee tight design closure in FPGA design.

In addition, the dissertation studies the widely used Triple Modular Redundancy (TMR) technique for soft error mitigation. To estimate the reliability of a TMR system, it applies combinational probabilistic analysis together with the criticality metric, and for the first time provides a TMR reliability estimate that takes the circuit's inherent logic masking capability into account. This study is useful for FPGA designs that need a moderate level of protection against soft errors. The dissertation also integrates the unified evaluation metric and the mitigation methods into a comprehensive, automatic optimization framework, which greatly assists soft error evaluation and mitigation in FPGAs.

The evaluation and mitigation of soft errors in FPGA-based systems is becoming a concern for both academia and industry. The proposed soft error evaluation framework and mitigation techniques have demonstrated their effectiveness, and they offer both theoretical significance and practical applicability to the critical reliability issue in FPGAs.
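
As a rough illustration of why logic masking matters when estimating TMR reliability, the sketch below uses the textbook majority-voter formula R_TMR = 3R^2 - 2R^3 (three identical modules, a perfect voter, independent failures). It is not the dissertation's estimation method, which the abstract does not spell out. The function names, the scalar `criticality` (treated here as the probability that an upset actually propagates to a module output), and the numeric values are all assumptions made for illustration.

```python
def module_reliability(upset_prob: float, criticality: float) -> float:
    """Probability that one module produces a correct output.

    Assumption (hypothetical model): an upset occurs with probability
    `upset_prob` and corrupts the output only with probability
    `criticality`; otherwise it is logically masked.
    """
    return 1.0 - upset_prob * criticality


def tmr_reliability(r: float) -> float:
    """Textbook TMR reliability with a perfect voter.

    The system is correct when at least two of three independent
    modules are correct: r^3 + 3*r^2*(1 - r) = 3*r^2 - 2*r^3.
    """
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    upset_prob = 0.1  # illustrative value, not taken from the source
    for criticality in (1.0, 0.5, 0.1):
        r = module_reliability(upset_prob, criticality)
        print(f"criticality={criticality:.1f}  "
              f"module={r:.4f}  tmr={tmr_reliability(r):.6f}")
```

Under these assumptions, ignoring masking (criticality fixed at 1.0) gives the most pessimistic estimate; a lower criticality raises both the single-module and the TMR reliability.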
Keywords/Search Tags: FPGA, soft error, configuration, Single Event Upset, soft error evaluation, soft error mitigation, in-place mitigation, TMR