A file in the GCF (Grid Configuration File) format has a .gcf extension and specifies a list of "instances" (i.e., distributed processes) to be launched by the CADP tools providing distributed verification, such as distributor , bcg_merge , and the PBG tools as well: pbg_cp , pbg_info , etc. Such a tool will be called the "parent program" in the sequel. The GCF format also specifies the various variables used for launching, connecting and parameterizing instances on remote machines.
Each instance corresponds to a pair (machine, directory). For each instance, a distributed process executing on machine will be launched by the parent program. This distributed process will store its files in directory located on some filesystem of machine. directory is called the "working directory" of the instance.
Instances may or may not be launched on the local machine, depending on the constraints on grid usage. For instance, clusters often distinguish between one frontal node (i.e., the local machine) used to submit jobs to the cluster, and many computing nodes (i.e., the remote machines) that perform distributed computations.
Each instance is given a unique number greater or equal to 1. Numbers are assigned in the same order as instances appear in the grid configuration file. Number 0 is reserved for the process corresponding to the execution of the parent program on the local machine.
If the grid configuration file contains duplicated instances (i.e., same machine and same directory), only the last one is considered, the previous ones being entirely discarded.
There should be at least two different instances. Several instances may execute on the same machine, provided that their working directories are different. Also, a working directory may be either local to its machine or shared between several machines (using NFS, Samba, etc.).
The grid configuration file also specifies the value of (predefined) variables used to control the way instances are launched and communicate with each other.
The grid configuration file has the following syntax:
<gcf> ::= <global_opt> <instance_opt>* <global_opt> ::= <directive>* <instance_opt> ::= <machine> <machine>* <directive>* <machine> ::= <address-or-name> "\n" <directive> ::= <variable> "=" <value> "\n" <variable> ::= "buffer_size" | "cadp" | "connect_timeout" | "directory" | "files" | "hash" | "memory" | "port" | "rcp" | "rsh" | "time" | "user" <value> ::= <character-string>
where:
<global_opt>
is a (possibly empty) list of directives applicable to
every instance. <instance_opt>
defines one or several instances together
with a (possibly empty) list of directives applicable to these instances
exclusively. <machine>
is the name of a (local or remote) machine, either
given as a numerical IP address (e.g., "138.96.146.2") or as an Internet machine
name (e.g., "vasy.inria.fr"), followed by a newline character. <directive>
is
an assignment of a <value>
to a <variable>
, followed by a newline character.
Any line starting with the "#" character is considered as a comment and ignored. Spaces and tabulations can be inserted before, between, or after terminal symbols.
Variables can be modified by directives as
follows. First, some variables have a default value, which will be used
unless overriden by some directive. A <directive>
occurring in the <global_opt>
list assigns its <variable>
for all machines. A directive occurring in a
<instance_opt>
list assigns its <variable>
only for the machines mentioned
in this <instance_opt>
, possibly overriding the value specified for this
<variable>
in the <global_opt>
list.
The meaning of the different variables is the following:
buffer_size
:<global_opt>
list, meaning that all communication buffers must have the same size. Maximal
value is 2,147,483,648. Default value is 65536.
cadp
:
The pathname may be the same for all machines (e.g., if the
CADP toolbox is installed on a common filesystem shared between all machines)
or may be different on each machine. In any case, the CADP toolbox should
be properly installed on the local and remote machines. Default value is
given by the $CADP
environment variable.
connect_timeout
:<global_opt>
list,
meaning that each connection must succeed within the specified timeout.
Default value is 20 seconds.
directory
:This pathname can be either an absolute one (starting with a "/" character) or a relative one, in which case it is interpreted with respect to the current directory on the local machine and to the user's home directory on remote machines. It may contain spaces provided that the entire pathname is enclosed between double quotes. It should not contain the tilde character "~". Default value is the user's home directory (which is not recommended unless when using a cluster of machines).
files
:For instance, when running the parent program distributor using bcg_open , the spec.bcg file must be present on the remote machines. Similarly, when running distributor using exp.open , all the BCG files referenced in spec.exp must also be present on the remote machines.
This variable may specify either a single filename or a set of filenames separated by spaces. Filenames may contain the wildcard characters ("*", "?", "[", "]", ...) according to the Bourne shell conventions. Default value is empty.
hash
:CAESAR_STATE_
n_HASH()
of the
caesar_hash
library of OPEN/CAESAR. This variable can only be set
in the <global_opt>
list, meaning that all instances use the same hash function.
Default value is 3.
memory
:
port
:
rcp
:$PATH
environment variable. Its
command line syntax should be the same as for the rcp(1) command, i.e.,
Default value is rcp
. Other
possible values are scp
, kcp
, or any similar variant of rcp(1). Remote machines
should be configured in such a way that remote copy can be done automatically
without asking for a password or a passphrase; this is usually done by
setting an appropriate ".rhosts" or ".ssh/authorized_keys" file and/or launching
an SSH agent before running the parent program.
There is also a special
value nfs
, which indicates that all instances have the same working directory
on a filesystem shared between all machines (e.g., using NFS), so that no
remote copy is necessary. Setting variable rcp
to the value nfs
can only
be done in the <global_opt>
list.
rsh
:$PATH
environment
variable. Its command line syntax should be the same as for the rsh(1) command,
i.e.,
Default value is rsh
. Other possible
values are ssh
, krsh
, or any similar variant of rsh(1). Remote machines
should be configured in such a way that remote authentication can be done
automatically without asking for a password or a passphrase; this is usually
done by setting an appropriate ".rhosts" or ".ssh/authorized_keys" file and/or
launching an SSH agent before running the parent program.
It is possible
to supply options to the command assigned to the rsh variable (such as
rsh=ssh -6
) but certain options (e.g., ssh -v
) program.
time
:
Alternatively, one can obtain
the same effect by setting the (undocumented) $CADP_TIME
environment variable;
this is equivalent to setting in the ".gcf" file the "time" variable for
all remote instances.
user
:
In its simplest form, a grid configuration file may simply list the remote machines to be used:
machine1 machine2 machine3 machine4
It is also possible to write a grid configuration file with several instances on the same machine:
machine1 directory=/tmp/instance1 machine1 directory=/tmp/instance2 machine1 directory=/tmp/instance3
This is a more complex example involving all the features described above:
buffer_size = 32768 cadp = /usr/local/cadp connect_timeout = 10 directory = /home/vasy/distributor files = graph-*.bcg hash = 4 port = 8016 rcp = scp rsh = ssh user = inria machine1.domain.org machine2.domain.org user = vasy machine3.domain.org directory = /users/inria/distributor
Notice that the directive "user = vasy
" applies to both machine1
and machine2
.
[GMS01] Hubert Garavel, Radu Mateescu, and Irina Smarandache. Parallel State Space Construction for Model-Checking. In Matthew B. Dwyer, ed, Proceedings of the 8th International SPIN Workshop on Model Checking of Software (SPIN'01), Toronto, Canada, LNCS 2057, pp. 217-234, May 2001. Revised version available as INRIA Research Report RR-4341, December 2001. Available from http://cadp.inria.fr/publications/Garavel-Mateescu-Smarandache-01.html
[GMB+06] Hubert Garavel, Radu Mateescu, Damien Bergamini, Adrian Curic, Nicolas Descoubes, Christophe Joubert, Irina Smarandache-Sturm, and Gilles Stragier. DISTRIBUTOR and BCG_MERGE: Tools for Distributed Explicit State Space Generation. In Holger Hermanns and Jens Palberg, eds., Proceedings of the 12th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'06), Vienna, Austria, LNCS 3920, pp. 445-449, March-April 2006. Available from http://cadp.inria.fr/publications/Garavel-Mateescu-Bergamini-et-al-06.html
[GMS12] Hubert Garavel, Radu Mateescu, and Wendelin Serwe. Large-scale Distributed Verification using CADP: Beyond Clusters to Grids. Electronic Notes in Theoretical Computer Science, vol. 296, pp. 145-161, August 2012. Available from http://cadp.inria.fr/publications/Garavel-Mateescu-Serwe-12.html
Additional information is available from the CADP Web page located at http://cadp.inria.fr
Directives for installation are given in files $CADP/INSTALLATION_*.
Recent changes and improvements to this software are reported and commented in file $CADP/HISTORY.