Database of Case Studies Achieved Using CADP

POLYKID CC-NUMA Multiprocessor Architecture.

Organisation: INRIA / DYADE (Bull-INRIA Joint Venture for Advanced Research)

Method: LOTOS

Tools used: CADP (Construction and Analysis of Distributed Processes)

Domain: Hardware Design.

Period: 1997-2000

Size: about 4000 lines of LOTOS (the whole architecture)
about 2000 lines of LOTOS (the cache coherency protocol)
about 3400 lines of LOTOS and 7000 lines of C (the RCC emulator)

Description: POLYKID was an experimental CC-NUMA (Cache Coherent - Non Uniform Memory Access) multiprocessor architecture developed by the Unix Servers division of Bull in Pregnana (Italy). A working prototype of a POLYKID machine running the AIX operating system was built in 1998 and found to perform correctly. The machine consisted of a scalable collection of modules interconnected by a double ring network implementing the lower layers of SCI (Scalable Coherent Interface) standard. Each module consists of four nodes connected by a data bus and an address bus. The data bus is implemented as a crossbar (a 4x4 matrix of switches allowing every processor to communicate directly with every other processor). As the main memory is distributed among the modules, to achieve cache coherency, each module in the architecture implements internally a MESI (Modified, Exclusive, Shared, and Invalid) cache coherency protocol, which performs a refined form of mutual readers-writers exclusion algorithm.

The first part of the experiment concerned the formal specification and verification of the POLYKID architecture (and of the cache coherency protocol in particular). The formal descriptions in LOTOS have been produced while the POLYKID architecture was under design, and therefore they had to evolve regularly in order to keep track of the changes brought by the Bull designers. These LOTOS specifications, modelling several configurations of the architecture (two and three modules, two distinct memory addresses, distinct input and output queues, etc.) have been verified using the CADP toolset. This allowed to discover and correct about 20 problems in the protocol and its documentation (uncovered cases, undocumented features, deadlocks, data consistency violations).

The second part of the experiment dealt with the software emulation of hardware components. The POLYKID architecture contained an ASIC (Application-Specific Integrated Circuit) named RCC (Remote Cache Controller), specially designed by Bull to implement the cache coherency protocol. For testing purposes of the POLYKID prototype, since the RCC was not available at that time, the goal was to produce a software program that would emulate the behaviour of an RCC circuit. This program has been developed partly in LOTOS (the logic of the cache coherency protocol, the data types defining the states and transitions of the cache, and the behaviour part) and partly in C (the interface between the RCC emulator and the transponder circuit implementing the protocols for the system bus and the SCI network). Using the EXEC/CAESAR environment of CADP, the LOTOS code was automatically translated in C and interconnected with the simulation environment used by Bull. The RCC emulator obtained in this way has been used to perform various test suites (unit testing of the transponder, system test at the monitor level, then by booting the firmware, and finally the whole operating system).

The last part of the experiment consisted in generating tests to be applied on the real implementation of POLYKID. The challenge was to demonstrate that the TGV tool, originally developed for conformance testing of communication protocols, could also be used to generate tests for hardware architectures. A formal description of the system under test was derived from a LOTOS specification previously developed during the verification phase. Starting from an existing test plan containing 7 groups of test purposes developed by Bull, the groups 3 and 4 were considered, which dealt with the cache coherency protocol implemented in the RCC circuit. For each test purpose in these groups, the corresponding test cases were generated using TGV (a total of 75 tests, most of which had more than 400 states). Moreover, a tester package has been developed, which allows the automatic execution of tests (both in batch mode and in reactive mode) in the simulation environment used by Bull.

Conclusions: This case-study demonstrated the feasibility and benefits of using formal methods for hardware design at system level. In particular, it illustrates how the LOTOS language, originally intended to the formal specification of communication protocols, together with tools developed for verifying and testing communication protocols, can enhance significantly the industrial approach commonly used to system-level design. The experiment focused on the most complex part of Bull's CC-NUMA POLYKID architecture, namely the cache coherency protocol. From the industrial side, it was acknowledged that the proposed approach was suitable for gaining a better understanding of the system, clarifying issues, detecting design mistakes, improving the overall quality of the product, and reducing the risk of delays.

Publications: [Garavel-Viho-Zendri-01] H. Garavel, C. Viho, and M. Zendri. "System Design of a CC-NUMA Multiprocessor Architecture Using Formal Specification, Model-Checking, Co-Simulation, and Test Generation". Springer International Journal on Software Tools for Technology Transfer (STTT), Volume 3(3), pp. 314-331.

Short version available on-line as INRIA Research Report RR-4041, INRIA, November 2000, from: http://cadp.inria.fr/publications/Garavel-Viho-Zendri-00.html
Contact:
Hubert Garavel
INRIA Rhône-Alpes / VASY
ZIRST - 655, avenue de l'Europe
Montbonnot
38334 Saint Ismier Cedex
France
Tel: +33 (0)4 76 61 52 24
Fax: +33 (0)4 76 61 52 52
E-mail: [email protected]



Further remarks: This case study, amongst others, is described on the CADP Web site: http://cadp.inria.fr/case-studies


Last modified: Fri Jul 20 18:02:10 2018.


Back to the CADP case studies page