Results

This document presents the contributions of each WP of the RELIASIC project. The last section gives the conclusions and the perspectives for future work.

I. WP1: Error model

Objective: In this work package we characterize the impact of lowering the power on logic gates in a 28 nm CMOS technology.

1. Results

The study began with a review of the state of the art on component failures in deep-submicron technologies. This review showed that size reduction and ageing lead to variations in the threshold voltage of MOS transistors. These variations are due to the reduction in insulator thickness, dopant concentration variations, or Negative Bias Temperature Instability (NBTI). In addition to these technological factors, there are other sources of noise such as Random Telegraph Noise (RTN), which comes from the capture or emission of charge carriers in potential traps at the channel/oxide interface. Electrical simulations were carried out to model these effects on the MOS transistors of STMicroelectronics' 65 nm and 28 nm technologies. The methodology consists of performing Monte-Carlo simulations on minimum-size transistors. Ten thousand cases were simulated to show the variation of the threshold voltage (VTH) of each single transistor (NMOS and PMOS). Next, to model the variation of VTH in a circuit (an inverter, for instance), voltage sources are added in series with the gate pin of the transistors. Further DC sweep simulations are then performed to find the threshold voltage shift (DVTH) that leads to a malfunction. A malfunction can be, for instance, an inverter output that stays low even when the input changes from the high to the low state. The threshold voltage value that keeps the output in the low state is then compared to its occurrence in the Monte-Carlo simulations. It has been shown that it takes a large variation of the threshold voltage to obtain a faulty behavior of the gate. For a supply voltage of 500 mV, a DVTH of about -400 mV is required on the PMOS for the output of the inverter to be stuck at 0. In that case the probability of failure is 10^-14.

Simulations performed by modifying the bulk voltage of the transistors also failed to show a significant degradation of this probability. However, for supply voltages below 300mV the gates no longer work. Simulations of NAND, NOR and 1-bit adder gates led to the same conclusions.
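As a rough illustration of how a required threshold shift maps to a failure probability, the sketch below assumes that the threshold-voltage variation is zero-mean and Gaussian. This distribution and the standard deviation passed as `sigma_mv` are assumptions made for illustration, not values taken from the study. A 10,000-sample Monte-Carlo run cannot observe events as rare as 10^-14 directly, so some form of fitted distribution is needed to quote such a figure.

```python
import math

def tail_probability(shift_mv: float, sigma_mv: float) -> float:
    """One-sided Gaussian tail: P(X <= -|shift|) for X ~ N(0, sigma^2).

    Assumption: the threshold-voltage shift is zero-mean Gaussian.
    """
    z = abs(shift_mv) / sigma_mv
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def sigmas_for_probability(p: float) -> float:
    """Number of standard deviations whose one-sided tail equals p (bisection)."""
    lo, hi = 0.0, 40.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > p:
            lo = mid   # tail still too large: move further out
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # How many sigmas away must a shift be for a 1e-14 tail probability?
    k = sigmas_for_probability(1e-14)
    print(f"1e-14 corresponds to about {k:.2f} sigma")
    # Hypothetical sigma (illustrative only) for a 400 mV shift.
    print(tail_probability(400.0, sigma_mv=52.0))
```

Under this Gaussian assumption, a shift with probability 10^-14 lies between 7.5 and 7.8 standard deviations from the mean, which is consistent with the observation that only a very large threshold variation makes the gate fail.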

2. Hiring employees (IMT-Atlantique)

Trainee student: Kevin Caradec (4 months in 2015)

Post-doc: Benoît Larras (2 months in 2016)

II. WP2: ROBUSTIFICATION OF THE DESIGN

Objective: In this work package, we propose non-conventional methods at the architecture level to protect vital signal processing functions of a GPS receiver against internal errors.

The work done in this WP was centered on the PhD of Mourad Hafidhi [1]. The contributions are of four types: methodology, proposal of original error mitigation methods, hardware development and, finally, dissemination.

1) Methodology:

UBS has defined a methodology to assess the robustness to errors of each component of a GPS tracking loop. This methodology is based on a simple model of noise injection at the output of the component or in its internal state, followed by the evaluation of the mean square error between the positions given by a noiseless GPS receiver and the positions given by the noisy GPS receiver.
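The two ingredients of this methodology can be sketched as follows. The text does not specify the noise model, so the independent bit-flip model used here is an assumption, and the function names `inject_bit_flips` and `mean_square_error` are hypothetical.

```python
import random

def inject_bit_flips(value: int, width: int, p_flip: float,
                     rng: random.Random) -> int:
    """Flip each of the `width` bits of `value` independently with
    probability `p_flip` (assumed noise model, for illustration)."""
    for bit in range(width):
        if rng.random() < p_flip:
            value ^= 1 << bit
    return value

def mean_square_error(ref_positions, test_positions) -> float:
    """Mean squared Euclidean distance between paired positions,
    e.g. noiseless receiver output versus noisy receiver output."""
    total = 0.0
    for ref, test in zip(ref_positions, test_positions):
        total += sum((r - t) ** 2 for r, t in zip(ref, test))
    return total / len(ref_positions)

if __name__ == "__main__":
    rng = random.Random(0)
    # Corrupt the 16-bit output of some component with 1% flips per bit.
    noisy = [inject_bit_flips(1234, 16, 0.01, rng) for _ in range(5)]
    print(noisy)
    # Compare two hypothetical position tracks (x, y, z in meters).
    print(mean_square_error([(0, 0, 0), (1, 1, 1)], [(0, 0, 1), (1, 2, 1)]))
```

In a full evaluation, the corrupted values would be fed back into the receiver model and the position tracks would come from the complete positioning chain.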

2) New error mitigation methods:

Several ideas have been proposed during the RELIASIC project to make a DSP circuit more robust to errors. These proposals can be classified by level of abstraction: algorithm level, reaction in case of error detection, and direct hardware detection/correction architecture.

Algorithm level: We have shown that the impact of errors that appear when computing the estimate of the Doppler offsets in a GPS application can be greatly reduced by appropriately tuning the carrier filter bandwidth. The effectiveness of the proposed solution was demonstrated by comparing the performance of a faulty (noisy) GPS receiver to that of a non-faulty (noiseless) GPS receiver at two levels: the standard deviation of the tracking error variance (by theory and by simulation) and the standard deviation between the positions given by the two receivers. Properly modifying the optimal filter bandwidth values gives a large gain in robustness against errors, while introducing almost no position degradation in a noise-free GPS receiver. With the PLL bandwidth increased, the standard deviation of the position error does not exceed two meters with 40% of errors, whereas the tracking loop otherwise does not tolerate more than 6% of errors [2].

Reaction in case of error detection: We propose a method to mitigate the impact of errors in the first stage of a tracking loop. In linear control theory, the tracking of evolving parameters (the time and Doppler frequency of a received GPS signal, for example) starts by evaluating the error between the estimated parameters and the real parameters. This error is then used in the feedback loop to update the estimated parameters. If an upset error occurs in this estimator, the error is propagated in the feedback loop, further increasing the mismatch between estimated and real parameters, which can create an unrecoverable divergence. For the case where an error is detected, we proposed two methods, the Feedback Freezing Loop (FFL) and the Last Correct Value (LCV) methods, to prevent the divergence of the tracking loop [3].
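The idea can be illustrated on a toy first-order tracking loop. The sketch below is one plausible reading of the two method names, not the implementation from [3]: FFL is modeled as skipping the update when an error is flagged, and LCV as reusing the last error value computed without a flagged fault. Perfect error detection, the loop gain and the fault magnitude are all assumptions.

```python
def track(targets, gain, faulty_steps, policy="none", fault_value=1000.0):
    """Toy first-order tracking loop with optional fault mitigation.

    policy: "none" (upset propagates), "ffl" (freeze feedback),
            "lcv" (reuse last correct error value).
    Assumption: faults are detected perfectly at the steps in `faulty_steps`.
    """
    estimate = 0.0
    last_good = 0.0          # last error value computed without a fault
    history = []
    for n, target in enumerate(targets):
        error = target - estimate            # discriminator output
        if n in faulty_steps:
            if policy == "ffl":
                correction = 0.0             # freeze: no update this step
            elif policy == "lcv":
                correction = last_good       # reuse last correct value
            else:
                correction = fault_value     # corrupted value enters the loop
        else:
            correction = error
            last_good = error
        estimate += gain * correction
        history.append(estimate)
    return history

if __name__ == "__main__":
    targets = [10.0] * 20
    for policy in ("none", "ffl", "lcv"):
        h = track(targets, gain=0.5, faulty_steps={10}, policy=policy)
        print(policy, max(abs(x - 10.0) for x in h[10:]))
```

Without mitigation the corrupted value pushes the estimate far from the target, while both mitigation policies keep the estimate close to it in this toy setting.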

Direct hardware detection/correction architecture: When a circuit uses several identical linear functions (as is the case in a GPS receiver), we propose to replace the classical Triple Modular Redundancy (TMR) with Duplication with Syndrome-based Correction (DSC). DSC is able to correct any type of single event at a significantly lower cost than TMR (a factor of 2 in complexity instead of a factor of 3) [4].

A Numerically Controlled Oscillator (NCO) that generates the phase of an estimated frequency is highly redundant in time. This redundancy can be used to periodically verify the consistency of the output phase. In case of error, the exact value can be recovered and, more importantly, the impact of this error can be mitigated by appropriate actions [5].
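A minimal sketch of this temporal redundancy, assuming a plain phase-accumulator NCO with a constant frequency control word: after k steps from a checkpoint, the phase must equal the checkpoint plus k times the control word, modulo the accumulator size. The class name `CheckedNCO`, the check period and the assumption that the checkpoint register itself is fault-free are illustrative choices, not the architecture of [5].

```python
class CheckedNCO:
    """Phase accumulator with a periodic consistency check.

    Assumptions (illustrative): constant frequency control word `fcw`,
    and a checkpoint register `ref_phase` that is itself fault-free.
    """

    def __init__(self, fcw: int, width: int = 32, check_period: int = 16):
        self.mask = (1 << width) - 1
        self.fcw = fcw & self.mask
        self.check_period = check_period
        self.phase = 0
        self.ref_phase = 0      # phase at the last checkpoint
        self.count = 0          # steps since the last checkpoint
        self.corrections = 0

    def step(self) -> int:
        self.phase = (self.phase + self.fcw) & self.mask
        self.count += 1
        if self.count == self.check_period:
            # The phase is fully predictable from the checkpoint.
            expected = (self.ref_phase + self.check_period * self.fcw) & self.mask
            if self.phase != expected:
                self.phase = expected       # recover the exact value
                self.corrections += 1
            self.ref_phase = self.phase
            self.count = 0
        return self.phase

if __name__ == "__main__":
    nco = CheckedNCO(fcw=12345)
    for _ in range(5):
        nco.step()
    nco.phase ^= 1 << 20        # simulated upset in the accumulator
    for _ in range(11):
        nco.step()
    print(nco.phase == (16 * 12345) & nco.mask, nco.corrections)
```

The check period trades detection latency against the cost of the extra multiply-add; between two checks a corrupted phase still reaches the output.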

A GPS receiver locally synthesizes a replica of the spreading sequence of a given satellite with Linear Feedback Shift Register (LFSR) modules. If the state of the LFSR is affected by an upset error, the corresponding satellite signal must be re-acquired, resulting in high energy expenditure and time delay. We evaluate the performance and complexity of methods based on error correction and modular redundancy to efficiently protect the GPS receiver [6].
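To show why a single upset in the LFSR state is so damaging, the sketch below generates the GPS C/A Gold code from the two standard 10-stage registers (G1 and G2, here with the G2 phase taps 2 and 6 of PRN 1) and optionally flips one bit of the G1 state at a chosen chip. The upset-injection parameters `upset_at` and `upset_bit` are hypothetical additions for illustration; the protection schemes evaluated in [6] are not reproduced here.

```python
def ca_code(tap_a: int = 2, tap_b: int = 6, length: int = 1023,
            upset_at=None, upset_bit: int = 0):
    """GPS C/A code chips from the G1/G2 LFSR pair.

    G1 feedback: stages 3 and 10. G2 feedback: stages 2, 3, 6, 8, 9, 10.
    `upset_at`/`upset_bit` (illustrative) flip one G1 state bit at that chip.
    """
    g1 = [1] * 10
    g2 = [1] * 10
    chips = []
    for n in range(length):
        if n == upset_at:
            g1[upset_bit] ^= 1                      # simulated upset
        chips.append(g1[9] ^ g2[tap_a - 1] ^ g2[tap_b - 1])
        fb1 = g1[2] ^ g1[9]
        fb2 = g2[1] ^ g2[2] ^ g2[5] ^ g2[7] ^ g2[8] ^ g2[9]
        g1 = [fb1] + g1[:9]                         # shift by one stage
        g2 = [fb2] + g2[:9]
    return chips

if __name__ == "__main__":
    clean = ca_code()
    faulty = ca_code(upset_at=100)
    print("first chips:", clean[:10])
    print("chips differing after upset:",
          sum(a != b for a, b in zip(clean, faulty)))
```

Because the register is linear and never resynchronizes on its own, the corrupted state keeps producing a wrong sequence until it is reloaded, which is why the satellite signal has to be re-acquired.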

3) Hardware design

One part of the work has been to design the hardware. Starting from a previous design by Arnaud Dion (from ISAE), several tasks have been carried out. The first one was to separate the tracking loop from the rest of the receiver and to change the communication protocol to make it compatible with an ASIC (mainly, to change parallel access to serial access in order to reduce the number of pins of the ASIC). Then, tests on a hardware implementation have been performed. The test setup is composed of a computer that manages the recorded files of sampled GPS signals and performs the high-level operations (mainly, the computation of the position from low-level data), a mother FPGA board that performs the detection of the GPS signal and generates the control signals for the tracking loop, and finally, a daughter FPGA board that implements the tracking loop. In this way, the whole testing environment of the future ASIC is ready.

Moreover, significant work has been done to implement in VHDL (i.e. in hardware) the different noise mitigation techniques developed during the RELIASIC project. This VHDL code is the reference code for the ASIC design and has been transferred to IMT-Atlantique to carry out the ASIC design.

4) Dissemination

Several dissemination actions have been carried out during the project: a demo night at the DASIP conference [13] and a poster at the annual meeting of the GDR SOC-SIP [14]. Moreover, we have made a short film (in French) to explain to a general audience the research activities performed at UBS through the RELIASIC CominLabs project [15].

5) Hiring employees (UBS)

PhD student: Mourad Hafidhi (Un Circuit Reception GPS Resistant Aux Erreurs Internes De L’electronique), December 2014 - November 2017.

Post-doc: Imran Wali (2 months in 2017)

III. WP3: Study on error sensitivity.

Objective: This task is devoted to methods and tools for evaluating the effects of errors on hardware computing units. We focus on simple and advanced arithmetic operators (including their control resources). We proposed to work on 2 complementary sub-tasks: 1) theoretical analysis for modeling the fault impact on arithmetic circuits, 2) FPGA emulation techniques for an empirical but accurate evaluation of the fault impact in arithmetic circuits.

1. Results

Fault-simulation and error analysis framework for design and assessment of arithmetic operators with reduced-precision redundancy

For arithmetic circuits, reduced-precision redundancy (RPR) is considered a viable alternative to triple modular redundancy, as it offers a significant power reduction. However, the efficient implementation and assessment of hardware arithmetic operators with RPR is still a challenge. We have proposed a lightweight RPR design methodology that exploits the capabilities of modern synthesis and simulation tools to simplify the design and verification of robust arithmetic operators. Applying the proposed methodology to a case study of a 12×6-bit multiplier shows that leveraging modern synthesis tools to implement reduced-precision operators from their behavioral descriptions not only simplifies their verification, but also gives an intrinsic-error vs. cost outcome comparable to that of handcrafted or structurally pruned reduced-precision multipliers previously proposed in the state of the art [7].
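The general RPR principle can be sketched in a few lines for an unsigned 12×6-bit multiplier: a reduced-precision replica runs next to the full-precision operator, and when the two outputs differ by more than the worst-case intrinsic error of the replica, the full-precision result is assumed faulty and the replica output is used instead. The truncation widths `KA` and `KB` and the decision rule below are assumptions chosen for illustration; they are not the design from [7].

```python
KA, KB = 4, 2   # hypothetical number of LSBs dropped from each operand

def full_mult(a: int, b: int) -> int:
    """Full-precision 12x6-bit unsigned product."""
    return a * b

def rp_mult(a: int, b: int) -> int:
    """Reduced-precision replica: truncate operands, rescale the product."""
    return ((a >> KA) * (b >> KB)) << (KA + KB)

# Worst-case intrinsic error of the replica, reached at full-scale operands.
THRESHOLD = full_mult(4095, 63) - rp_mult(4095, 63)

def rpr_select(full_out: int, rp_out: int, threshold: int = THRESHOLD) -> int:
    """Keep the full-precision output unless it deviates from the replica
    by more than the replica's own worst-case error."""
    return full_out if abs(full_out - rp_out) <= threshold else rp_out

if __name__ == "__main__":
    a, b = 3000, 50
    good = full_mult(a, b)
    faulty = good ^ (1 << 17)           # simulated upset on a high-order bit
    print(rpr_select(good, rp_mult(a, b)))    # fault-free: exact product
    print(rpr_select(faulty, rp_mult(a, b)))  # fault detected: replica output
```

Errors smaller than the threshold pass undetected, so the scheme bounds the output error rather than eliminating it; that is the trade-off against the lower cost compared with TMR.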

Fault-tolerant task scheduling algorithm analysis for embedded systems

Many methods have been introduced to design fault-tolerant systems and improve reliability. Nevertheless, several of them are not suitable for real-time embedded systems since they incur significant overheads. Other methods may be less intrusive, but at the cost of being too specific to a dedicated system. We investigated the fault-tolerant design of real-time multi-processor systems, in particular the dynamic mapping and scheduling of tasks on embedded systems. We developed several run-time algorithms based on the primary/backup approach, which is commonly used for its minimal resource utilization and high reliability. The aim of our work was to reduce the complexity of the algorithm in order to target real-time embedded systems without sacrificing reliability [8, 9]. This work has been done in collaboration with Oliver Sinnen, PARC Lab., the University of Auckland, New Zealand.
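A deliberately simplified sketch of the primary/backup idea is given below: each accepted task gets a primary copy on one processor and a backup copy on a different processor, scheduled after the primary, and a task is rejected if its backup cannot finish before the deadline. This greedy placement is an illustration only; the function name and task format are hypothetical, and refinements used in the literature (such as freeing the backup slot when the primary succeeds) are left out. It is not one of the algorithms of [8, 9].

```python
def schedule_primary_backup(tasks, n_procs):
    """Greedy primary/backup placement (illustrative, needs n_procs >= 2).

    tasks: list of (name, arrival, exec_time, deadline) in arrival order.
    Returns (accepted, rejected); accepted maps a task name to
    ((proc, start, end) of the primary, (proc, start, end) of the backup).
    """
    free_at = [0] * n_procs            # time at which each processor is free
    accepted, rejected = {}, []
    for name, arrival, exec_time, deadline in tasks:
        # Primary copy: processor offering the earliest start time.
        p_proc = min(range(n_procs), key=lambda p: max(free_at[p], arrival))
        p_start = max(free_at[p_proc], arrival)
        p_end = p_start + exec_time
        # Backup copy: a different processor, not before the primary ends.
        others = [p for p in range(n_procs) if p != p_proc]
        b_proc = min(others, key=lambda p: max(free_at[p], p_end))
        b_start = max(free_at[b_proc], p_end)
        b_end = b_start + exec_time
        if b_end > deadline:
            rejected.append(name)      # no fault-tolerant slot before deadline
            continue
        free_at[p_proc] = p_end
        free_at[b_proc] = b_end
        accepted[name] = ((p_proc, p_start, p_end), (b_proc, b_start, b_end))
    return accepted, rejected

if __name__ == "__main__":
    tasks = [("t1", 0, 2, 10), ("t2", 0, 2, 3)]
    print(schedule_primary_backup(tasks, n_procs=2))
```

Reserving the backup slot unconditionally, as done here, wastes processor time when no fault occurs, which is one reason more elaborate run-time algorithms are of interest.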

2. Hiring employees (IRISA only)

Master 1 student: Ronan Mauguen (2 months in 2015)

Master 2 students: Petr Dobias (6 months in 2017), Adrien Gaonac’h (6 months in 2018)

PhD student: Van Huu Long Nguyen (PhD started in October 2014. V.H.L Nguyen decided to stop his PhD after 11 months)

Post-doc: Imran Wali (9 months in 2016-2017)

IV. WP4: Integration of reliability in the design process.

Objective: This WP aims at defining a methodology and a model at a high level of abstraction (close to the final component) supporting the design of reliable applications on unreliable silicon. In this project we focus on ways to take reliability into account in models and methodology, so as to provide the cost of reliability for the considered solution.

  1. High-level reliability evaluation

Analyzing the failure risks in the early design phase is important. From the point of view of functional design, the architectural models require software and hardware components to fully implement the functional behaviors. For that reason, both software and hardware components are considered in order to accurately evaluate the safety of a complete system.

The model developed in this project makes it possible to analyze the impact of each functional component on the system reliability during the design process. Firstly, from the initial functional description of the system, an initial implementation is proposed, and from the corresponding Fault Tree Analysis (FTA) we find out which components are critical by determining their Criticality Importance (IC) and Birnbaum's Importance (IB) factors. Secondly, the identified critical components are protected using a safety approach. Finally, a comparison between the unprotected system and the protected one is conducted. In this case, the assessment is expressed by the reliability improvement factors. Defining the reliability assessment by two criteria (importance factors and reliability improvement factors) helps the designer at a very early stage of the design phase, and makes it possible to identify which critical function has to be protected to meet the system safety requirement.
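For readers unfamiliar with these two factors, the sketch below computes them with their usual textbook definitions on a small fault tree: Birnbaum's importance of a component is the difference in top-event probability between that component being failed and being working, and the criticality importance weights it by the component's failure probability relative to the system's. The example tree, the failure probabilities and the assumption of independent failures are illustrative and not taken from the project.

```python
from itertools import product

def top_event_probability(q, structure):
    """Exact top-event probability by enumerating all component states.

    q: dict name -> failure probability (independent failures assumed).
    structure: function taking {name: failed?} and returning True if
    the system fails. Only practical for small fault trees.
    """
    names = list(q)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        failed = dict(zip(names, states))
        if structure(failed):
            prob = 1.0
            for name in names:
                prob *= q[name] if failed[name] else 1.0 - q[name]
            total += prob
    return total

def birnbaum_importance(q, structure, comp):
    """IB = P(top | comp failed) - P(top | comp working)."""
    return (top_event_probability({**q, comp: 1.0}, structure)
            - top_event_probability({**q, comp: 0.0}, structure))

def criticality_importance(q, structure, comp):
    """IC = IB * q_comp / P(top)."""
    return (birnbaum_importance(q, structure, comp) * q[comp]
            / top_event_probability(q, structure))

if __name__ == "__main__":
    # Hypothetical tree: the system fails if A fails, or if both B and C fail.
    tree = lambda f: f["A"] or (f["B"] and f["C"])
    q = {"A": 0.01, "B": 0.1, "C": 0.1}
    for comp in q:
        print(comp, birnbaum_importance(q, tree, comp),
              criticality_importance(q, tree, comp))
```

In this toy tree, component A dominates both rankings, so it would be the first candidate for protection; recomputing the top-event probability after protecting it gives the reliability improvement factor.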

This analysis then enables the designers to make trade-offs between the level of reliability and the hardware cost overhead. In the framework of high-level functional design, each alternative design has to be evaluated: the reliability must be quantified for each solution and the cost of each reliability solution has to be considered. A journal paper [a] has been submitted on this topic. It is still under review.

2. Power Modeling

New power models of simple digital operators have been developed and are intended for inclusion in the global methodology. These high-level models of components are based on neural networks and are designed to be simulated very quickly during design exploration. The training patterns of these networks have been obtained from real measurements performed on FPGA prototyping boards [10].

Bayesian model for the analysis of the GPS reliability: Starting from the GPS software model shared in the project, we investigated Bayesian Networks in order to explicitly describe the impact of the environment (weather, satellite position, urban zone, …) on the reliability of the GPS. The model proposed for the reliability analysis of the GPS relies on the referenced errors found in the literature. We then developed an embedded diagnosis that can be implemented on FPGA [11, 12].

3. Hiring employees

Master 2 student (INSA-Rennes): Romain Bernier (6 months in 2018)

Master 2 student (UBO): Chabha Hireche (6 months in 2015)

Post-doc (Univ. Nantes): Khanh Le-Son (12 months in 2016-2017)

V. WP5: ASIC DESIGN

Objective: This task involves the on-chip implementation of a circuit implementing two versions of the chosen application. The first one is a straightforward implementation without any fault compensation technique. Therefore, place-and-route tools will automatically generate the layout of the circuit. The second one will be compensated for the identified faults using the techniques developed in WP2.

1. Results

The ASIC implements 5 GPS trackers, each able to track four satellites. The first block is a non-hardened version of the tracker. The others implement the hardened versions developed in WP2. The chosen integration allows faults to be injected in several ways. For instance, one of them consists in modifying the noise margins of the gates by changing the biasing of the MOS substrate; another consists in injecting noise into the substrate while decreasing the supply voltage so as to bias the trackers into the subthreshold region. The effect of the latter is to exponentially amplify the impact of the chip process and temperature variations. We will then be able to validate the strategies used to harden digital circuits against faults.

Besides the ASIC design, a test board for the ASIC has also been designed and is currently being assembled. It will allow testing the ASIC on its own and can easily be interfaced to the FPGA board that the other partners used to emulate the rest of the GPS system.

The tape-out was completed in March [16]. The chip is expected back in July 2019.

2. Hiring employees (IMT-Atlantique)

Research engineer: Rémi Pallas (25 months in 2017-2019)

Post-doc: Benoît Larras (6 months in 2016)

Conclusion

The project has now ended, but its scientific exploitation and dissemination have not, quite the contrary. With the arrival of the RELIASIC chip in July 2019, a new chapter of the project will start: real measurements will allow us to assess the actual effectiveness of the proposed methods and, along with journal publications, will trigger new scientific questions.

Finally, it is worth mentioning that the topic of reliability of hardware architectures was almost new for all partners. It was a great opportunity for all of us to build expertise in hardware reliability. So far, the work on fault tolerance has been extended with 3 new PhDs in Brest, Nantes and Lannion. New opportunities for synergy to share this expertise should appear in the future.

References

Submitted

[a] Submitted to IEEE Transactions on Emerging Topics in Computing: Khanh Le-Son, Olivier Pasquier, Sébastien Pillement, “Reliability aware design and analysis method for digital systems: from gate to system level”.

Published

[1] Mourad HAFIDHI, PhD Thesis, “GPS on stochastic architecture”, Université de Bretagne Sud, November 2018. Available online:

http://www-labsticc.univ-ubs.fr/~boutillon/articles/these/Thesis_M_Hafidhi_2017.pdf

[2] M. M. Hafidhi and E. Boutillon, “Improving the Performance of the Carrier Tracking Loop for GPS Receivers in Presence of Transient Errors due to PVT Variations,” 2016 IEEE International Workshop on Signal Processing Systems (SiPS), Dallas, TX, 2016, pp. 80-85. doi: 10.1109/SiPS.2016.22 https://hal.archives-ouvertes.fr/hal-01391201v1

[3] M. M. Hafidhi, E. Boutillon and C. Winstead, “Reducing the impact of internal upsets inside the correlation process in GPS Receivers,” 2015 Conference on Design and Architectures for Signal and Image Processing (DASIP), Krakow, 2015, pp. 1-5. doi: 10.1109/DASIP.2015.7367264 https://hal.archives-ouvertes.fr/hal-01211180v1

[4] M. M. Hafidhi and E. Boutillon, “Hardware error correction using local syndromes,” 2017 IEEE International Workshop on Signal Processing Systems (SiPS), Lorient, 2017, pp. 1-6. doi: 10.1109/SiPS.2017.8109995 https://hal.archives-ouvertes.fr/hal-01611117v1

[5] M. Dridi, M. M. Hafidhi, C. Winstead and E. Boutillon, “Reliable NCO carrier generators for GPS receivers,” 2015 Conference on Design and Architectures for Signal and Image Processing (DASIP), Krakow, 2015, pp. 1-5. doi: 10.1109/DASIP.2015.7367266 https://hal.archives-ouvertes.fr/hal-01211192v1

[6] M. M. Hafidhi, E. Boutillon and C. Winstead, “Reliable gold code generators for GPS receivers,” 2015 IEEE 58th International Midwest Symposium on Circuits and Systems (MWSCAS), Fort Collins, CO, 2015, pp. 1-4. doi: 10.1109/MWSCAS.2015.7282164 https://hal.archives-ouvertes.fr/hal-01151895v1

[7] I. Wali, E. Casseau, A. Tisserand, “An Efficient Framework for Design and Assessment of Arithmetic Operators with Reduced-Precision Redundancy”, Conference on Design and Architectures for Signal and Image Processing (DASIP), Dresden, Germany, pp. 1-6, September 27-29, 2017. https://ieeexplore.ieee.org/document/8122117

[8] P. Dobias, E. Casseau, O. Sinnen, “Comparison of Different Methods Making Use of Backup Copies for Fault-Tolerant Scheduling on Embedded Multiprocessor Systems”, Conference on Design and Architectures for Signal and Image Processing (DASIP), Porto, Portugal, pp.1-6, October 10-12, 2018. https://ieeexplore.ieee.org/document/8597044

[9] P. Dobias, E. Casseau, O. Sinnen, “Restricted Scheduling Windows for Dynamic Fault-Tolerant Primary/Backup Approach-Based Scheduling on Embedded Systems”, SCOPES ’18: 21st International Workshop on Software and Compilers for Embedded Systems, Sankt Goar, Germany, May 28-30, 2018. https://dl.acm.org/citation.cfm?doid=3207719.3207724

[10] Nasser, Y., Prevotet, J. C., & Hélard, M. (2018, May). Power modeling on FPGA: a neural model for RT-level power estimation. In Proceedings of the 15th ACM International Conference on Computing Frontiers (pp. 309-313). ACM.

[11] Sara Zermani, Catherine Dezan, Chabha Hireche, Reinhardt Euler, and Jean-Philippe Diguet. Embedded and Probabilistic Health Management for the GPS of Autonomous Vehicles. In 5th Mediterranean Conference on Embedded Computing, Bar, Montenegro, June 2016. (MECO Award of “Gratitude for contribution in scientific and research work”)

[12] Sara Zermani, Catherine Dezan, Chabha Hireche, Reinhardt Euler, and Jean-Philippe Diguet. Embedded Context aware Diagnosis for a UAV SoC platform. Microprocessors and Microsystems: Embedded Hardware Design (MICPRO), 51:185–197, June 2017.

Dissemination

[13] M. M. Hafidhi, E. Boutillon and A. Dion, “Demo: Localisation in a faulty digital GPS receiver,” 2016 Conference on Design and Architectures for Signal and Image Processing (DASIP), Rennes, 2016, pp. 223-224. doi: 10.1109/DASIP.2016.7853824

[14] Mohamed Hafidhi, Emmanuel Boutillon, “Reliable GPS position on an unreliable hardware”, poster, GDR SOCSIP, June 2016, Nantes, France. https://colloque-socsip.ietr.fr/#page=home

[15] General-audience presentation video of the CominLabs RELIASIC project, 2016 (in French):
http://www-labsticc.univ-ubs.fr/~boutillon/un_arch/reliasic_film.mp4

[16] Lab-STICC news about the tape-out of the ASIC circuit: https://www.labsticc.fr/en/news/1187-cominlabs-reliasic-a-new-circuit-sent-to-foundry.htm.
