A Fault Tolerance Technique for Combinational Circuits Based on Selective-Transistor Redundancy

Abstract:

With fabrication technology reaching the nanometer scale, systems are becoming more prone to manufacturing defects and more susceptible to soft errors. This paper focuses on designing combinational circuits for soft error tolerance with minimal area overhead. The idea is based on analyzing the random pattern testability of faults in a circuit and protecting sensitive transistors, whose soft error detection probability is relatively high, until the desired circuit reliability is achieved or a given area overhead constraint is met. Transistors are protected by duplicating and sizing the subset of transistors necessary to provide the protection. In addition, a novel gate-level reliability evaluation technique is proposed that provides results similar to reliability evaluation at the transistor level (using SPICE) with orders of magnitude reduction in CPU time. LGSynth’91 benchmark circuits are used to evaluate the proposed algorithm. Simulation results show that the proposed algorithm achieves better reliability than other transistor sizing-based techniques and the triple modular redundancy technique, with significantly lower area overhead, for 130-nm process technology at ground level. The proposed architecture is analyzed for logic size, area, and power consumption using the Tanner tool.

Existing System:

Reliability in systems can be achieved by redundancy. Redundancy can be added at the module level, gate level, transistor level, or even at the software level. The design of reliable systems using redundant unreliable components was proposed. Since then, a plethora of research has been done to rectify soft errors in combinational and sequential circuits by applying hardware redundancy. Triple modular redundancy (TMR), a popular and widely used technique, creates three identical copies of the system and combines their outputs using a majority voter. The generalized modular redundancy scheme considers the probability of occurrence of each combination at the output of a circuit. Redundancy is then added to protect only those combinations that have a high probability of occurrence, while the remaining combinations are left unprotected to save area. El-Maleh and Al-Qahtani proposed a fault tolerance technique that enhances the reliability of sequential circuits by introducing redundant equivalent states for states with a high probability of occurrence. Mohanram and Touba proposed a partial error masking scheme based on TMR, which targets the nodes with the highest soft error susceptibility. Two reduction heuristics are used to reduce the soft error failure rate, namely, cluster sharing reduction and dominant value reduction. Instead of triplicating the whole logic as in TMR, only those nodes with high soft error susceptibility are triplicated; the rest of the nodes are clustered and shared among the triplicated logic. Sensitive gates are duplicated and their outputs are connected together.

Physically placing the two gates sufficiently far apart reduces the probability that both gates are hit by a particle strike simultaneously and, therefore, reduces the soft error rate (SER). Another technique based on TMR maintains a history index of the correctly computing module to select the correct result. Teifel proposed a double/dual modular redundancy (DMR) scheme that utilizes voting and self-voting circuits to mitigate the effects of SETs in digital integrated circuits. The Bayesian detection technique from communication theory has been applied to the voter in DMR, called soft NMR. In most cases, it is able to identify the correct output even if all redundant modules are in error, but at the expense of a very high area overhead for the voters.
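The TMR scheme described above can be illustrated with a minimal behavioral sketch. The function names (`majority`, `tmr`, `parity4`) and the fault-injection mechanism are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three module outputs."""
    return (a & b) | (a & c) | (b & c)


def tmr(module, x, fault=None):
    """Evaluate three copies of `module` on input x and vote on the result.

    `fault` is an optional dict mapping a copy index (0, 1, 2) to an XOR
    mask that corrupts that copy's output, modeling a soft error.
    """
    outputs = []
    for copy in range(3):
        y = module(x)
        if fault and copy in fault:
            y ^= fault[copy]  # inject the error into this copy only
        outputs.append(y)
    return majority(*outputs)


def parity4(x: int) -> int:
    """Example combinational module: parity of the low four input bits."""
    return bin(x & 0xF).count("1") & 1


golden = parity4(0b1011)
single = tmr(parity4, 0b1011, fault={2: 1})        # one copy in error
double = tmr(parity4, 0b1011, fault={0: 1, 1: 1})  # two copies in error
print(golden, single, double)
```

In this sketch a single faulty copy is masked by the voter, while two simultaneous faulty copies are not, which is the basic limitation of TMR; the threefold area cost is what the selective techniques above try to avoid.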

Another class of techniques enhances fault tolerance by increasing soft error masking, either by modifying the structure of the circuit through the addition and/or removal of redundant wires or by resynthesizing parts of the circuit. In one approach, SER is reduced through redundancy addition and removal of wires, where redundant wires are added based on the existing implications between a pair of nodes in the circuit. In another, two-level circuits are synthesized by assigning don't-care conditions to improve input error resilience, which minimizes the propagation of fault effects. An algorithm has also been proposed to synthesize two-level circuits that maximize logical masking by utilizing input pattern probabilities.

Disadvantages:

  • Area overhead is high

Proposed System:

Consider the transistor arrangement shown in Fig. 2(a), where duplicate pMOS transistors are connected in parallel. The width of the redundant transistors must also be increased to allow dissipation (sinking) of the deposited charge as quickly as it is deposited, so that the transient does not achieve sufficient magnitude and duration to propagate to the output. If the output is currently high and an energetic particle hits the drain N1 of the nMOS transistor (with the same current source used in the simulations shown in Fig. 1), this should lower the voltage observed at the output. However, with this transistor configuration, the net negative voltage effect is compensated, as evident from Fig. 2(b), resulting in a spike of smaller magnitude than the one shown in Fig. 1(b). The spike magnitude is reduced because of the increased output capacitance and the reduced resistance between Vdd and the output.

Figure 1: Effect of energetic particle strike on CMOS inverter at t = 5 ns. (a) Particle strike model. (b) Effect of particle strike at nMOS drain. (c) Effect of particle strike at pMOS drain.

Consider another arrangement of transistors, shown in Fig. 2(c), where redundant nMOS transistors are connected in parallel. If the output is low and the incident energetic particle strikes the drain P1 of the pMOS transistor, then the raised voltage effect at the output shown in Fig. 1(c) will be reduced, as shown in Fig. 2(d). This reduction in the spike magnitude is due to the same reasons mentioned for the nMOS transistor. Similarly, to protect against both stuck-at-0 (sa0) and stuck-at-1 (sa1) faults, the transistor structures in Fig. 2(a) and (c) can be combined to fully protect the NOT gate. A fully protected NOT gate offers the best hardening by design, but at the cost of higher area overhead and power. Note that the optimal size of the transistor for SEU immunity depends on the charge Q of the incident energetic particle.

Figure 2: Proposed protection schemes and their effect. (a) Particle hit at nMOS drain, OUT=HIGH. (b) Reduced effect of particle strike at nMOS drain. (c) Particle hit at pMOS drain, OUT=LOW. (d) Reduced effect of particle strike at pMOS drain.
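The qualitative argument above (lower pull-up resistance and higher output capacitance give a smaller spike) can be checked with a rough first-order sketch. Everything here is an assumption made for illustration: the output node is modeled as a single capacitor held at Vdd through a linear pull-up resistance, the strike is modeled as a double-exponential current pulse carrying charge Q, and all component values are illustrative rather than taken from the paper or from a 130-nm device model. It is not a substitute for the SPICE simulations behind Fig. 1 and Fig. 2.

```python
import math


def strike_current(t, q, tau_a=200e-12, tau_b=50e-12):
    """Assumed double-exponential strike current; integrates to total charge q."""
    if t < 0:
        return 0.0
    return q / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def peak_dip(r_pullup, c_node, q, vdd=1.2, dt=1e-12, t_end=2e-9):
    """Forward-Euler solution of C dV/dt = (Vdd - V)/R - I(t).

    Returns the largest drop of the output below Vdd during the transient.
    """
    v, worst = vdd, 0.0
    for n in range(int(t_end / dt)):
        i_hit = strike_current(n * dt, q)
        v += dt * ((vdd - v) / r_pullup - i_hit) / c_node
        worst = max(worst, vdd - v)
    return worst


Q = 20e-15  # illustrative deposited charge, in coulombs

# Unprotected inverter: one pull-up device (illustrative R and C values).
dip_base = peak_dip(r_pullup=10e3, c_node=2e-15, q=Q)

# Duplicated pMOS in parallel: assumed half the resistance, twice the capacitance.
dip_protected = peak_dip(r_pullup=5e3, c_node=4e-15, q=Q)

print(f"unprotected dip: {dip_base:.3f} V, protected dip: {dip_protected:.3f} V")
```

Under these assumptions the protected configuration shows a smaller dip for the same deposited charge. The actual reduction depends on the real device characteristics and on Q, which is why the transistor sizing has to be chosen for the charge being targeted.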

PROPOSED ALGORITHM:

This section presents the proposed selective-transistor redundancy (STR) algorithm. The algorithm protects sensitive transistors whose probability of failure (POF) is relatively high. It can be used in two ways: 1) apply protection until the circuit POF reaches a certain threshold, or 2) apply protection until a given area overhead constraint is met. The relations that determine the circuit POF are discussed first and are then used in the proposed algorithm.
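The two stopping criteria can be sketched as a simple greedy loop. This is only a schematic illustration, not the paper's algorithm: the `Transistor` record, the assumption that failure contributions are independent, the `residual` factor describing how much protection reduces a transistor's contribution, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transistor:
    name: str
    pof: float         # hypothetical contribution to circuit failure probability
    area_cost: float   # hypothetical extra area if this transistor is protected
    protected: bool = False


def circuit_pof(transistors, residual=0.05):
    """Assumed relation: independent contributions, scaled by `residual` when protected."""
    p_ok = 1.0
    for t in transistors:
        p = t.pof * residual if t.protected else t.pof
        p_ok *= 1.0 - p
    return 1.0 - p_ok


def selective_protect(transistors, pof_target=None, area_budget=None):
    """Protect the most failure-prone transistors first until a criterion stops the loop."""
    used_area = 0.0
    for t in sorted(transistors, key=lambda t: t.pof, reverse=True):
        if pof_target is not None and circuit_pof(transistors) <= pof_target:
            break  # criterion 1: circuit POF threshold reached
        if area_budget is not None and used_area + t.area_cost > area_budget:
            break  # criterion 2: area overhead constraint reached
        t.protected = True
        used_area += t.area_cost
    return circuit_pof(transistors), used_area


def example():
    return [Transistor("M1", 0.10, 2.0),
            Transistor("M2", 0.02, 2.0),
            Transistor("M3", 0.30, 3.0)]


by_area = example()
pof_a, area_a = selective_protect(by_area, area_budget=3.0)

by_pof = example()
pof_b, area_b = selective_protect(by_pof, pof_target=0.05)

print([t.name for t in by_area if t.protected], round(pof_a, 4), area_a)
print([t.name for t in by_pof if t.protected], round(pof_b, 4), area_b)
```

With the area budget, only the most sensitive transistor is protected; with the POF target, protection continues down the ranking until the assumed circuit POF drops below the threshold, leaving the least sensitive transistor untouched.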

Advantages:

  • Area overhead is low

Software implementation:

  • Tanner tool