Data Description

In previous pages, you learnt how to instrument a simulation to benefit from Damaris. Now, it is time to describe and then write simulation data to Damaris. This page explains how to describe data in the Damaris XML file and how to access the values at run time. All the data items must be described in the XML configuration within the <data> section.

Parameters

Parameters are simple values associated with a name and a type. They make it easy to change important values of the simulation at run time: when a parameter changes, Damaris automatically updates all the variables and objects that depend on it. The listing below shows how to define a parameter with a default value. Parameters must be defined at the root of the <data> section, not within nested groups (explained later).

<data>
    <parameter name="w" type="int" value="2" />
</data>

Parameters must have a valid name: it must not contain spaces or special symbols and must not start with a digit. Basically, any name that is valid as a variable name in C or Fortran is a valid parameter name in Damaris. The type attribute must be one of the types listed in Damaris data types. The provided value must be acceptable for the specified type. For instance, value="xyz" is not valid for an integer parameter.
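
The naming rule above can be checked before a name is put into the XML file. The following is a minimal sketch, assuming the C identifier rule (letters, digits and underscores, not starting with a digit); the function name is_valid_parameter_name is hypothetical and is not part of the Damaris API.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if name follows the C identifier rule, 0 otherwise.
 * Illustration of the naming rule only; this is not Damaris code. */
int is_valid_parameter_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return 0;
    /* First character: a letter or an underscore, never a digit. */
    if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
        return 0;
    /* Remaining characters: letters, digits or underscores. */
    for (size_t i = 1; name[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)name[i]) || name[i] == '_'))
            return 0;
    }
    return 1;
}
```

With this sketch, "w" and "grid_size2" are accepted, while "2w" and "my param" are rejected. Reserved words of C or Fortran are not checked.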

The listings below show how to get and set parameters at run time from the simulation. Note that when a process modifies a parameter, the modification is local: it is not visible to other processes or to the dedicated cores/nodes. The user is responsible for synchronizing all processes when a parameter has to be changed globally.

// read the current value of parameter "w"
int w_in;
int err = damaris_parameter_get("w", &w_in, sizeof(int));

// set a new value (local to this process)
int w_out = 4;
err = damaris_parameter_set("w", &w_out, sizeof(int));

And for Fortran:

integer :: w_in, w_out
integer :: ierr
...

call damaris_parameter_get_f("w", w_in, SIZEOF(w_in), ierr)

w_out = 4
call damaris_parameter_set_f("w", w_out, SIZEOF(w_out), ierr)

Parameters are especially useful in the definition of layouts, which are described in the following part of this document.

Layouts

Layouts describe the shape of the data. Damaris separates the description of layouts from that of the variables themselves, since several variables can share the same layout. A layout is characterized by a name, a base type and a list of dimensions. This description allows Damaris to allocate the correct amount of shared memory for a variable. Two examples are given in the following listing.

<data>
    <layout name="a 3D layout" type="int" dimensions="1,4,32" />
    <layout name="a parameterized layout" type="float" dimensions="w*4,h/2" global="100*w,8*h" />
</data>

The name of a layout can be any string not including the "/" character (it can include special characters, white spaces, numbers, etc., even though we advise keeping it simple and using classic C identifiers). The name cannot be the name of a basic type (such as int). The type should be one of the basic types listed in Damaris data types. The list of dimensions is a comma-separated list of arithmetic expressions featuring +, -, *, /, %, parentheses, numbers and any defined parameters of type int or integer (as of today, short and long parameters produce an error). The global attribute gives the global size of the data, so that output functionality that needs it (such as HDF5 collective output) can allocate it efficiently. In the example above, the first layout has three dimensions and the second one has two.
When a layout depends on parameters, any modification of the parameters will automatically modify the layout. Like parameters, the layout is only locally affected and the modification is not propagated to other processes. Making a layout depend on a parameter and changing the parameter at run time can be very useful for particle-based simulations, where the number of particles is different at each process, or simply to make the XML file independent of the simulation size and avoid changing the parameters every time the size changes.
The description of a layout will be used when writing a variable. Note that a modification in a layout at run time (through a modification of a parameter) will not affect previously-written iterations of a variable. It only affects the upcoming ones.
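
To make the link between a layout and the memory it requires concrete, the sketch below computes the number of elements and bytes in one block from a list of dimensions. It only illustrates the arithmetic; the function names are hypothetical and this is not how Damaris is implemented internally.

```c
#include <stddef.h>

/* Number of elements in one block: the product of the layout dimensions. */
size_t layout_element_count(const size_t *dims, size_t ndims)
{
    size_t count = 1;
    for (size_t i = 0; i < ndims; i++)
        count *= dims[i];
    return count;
}

/* Bytes needed for one block, given the size of the base type. */
size_t layout_size_bytes(const size_t *dims, size_t ndims, size_t elem_size)
{
    return layout_element_count(dims, ndims) * elem_size;
}
```

For "a 3D layout" with dimensions 1,4,32, this gives 128 elements, i.e. 512 bytes with a 4-byte int. When a dimension is an expression over parameters, the same computation applies once the expression has been evaluated with the current parameter values.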

Note: string and label data types are variable-length types, thus the dimension of the layout corresponds to the number of characters that the variable can store.

Variables and groups

XML description

Actual data is described through the <variable> and <group> nodes, which make it possible to build a hierarchy of variables. An example is given below. Each variable must be given a name and be associated with a defined layout; these are the two mandatory attributes. The time-varying attribute indicates whether a variable is expected to be written at every iteration, or just once at the beginning of the simulation (i.e., before the first call to damaris_end_iteration). The visualizable attribute indicates whether a variable can be visualized by visualization backends (for instance, coordinate arrays are not visualizable). In the following example, we expect the x_coordinates variable to hold the coordinates of points in a rectilinear grid. This data is not itself visualizable; however, a rectilinear grid (described later) will be a visualizable object. Note that one can use a basic type instead of a layout, as for the "simple var" variable: basic types are already interpreted as layouts.

<data>
    <variable name="x_coordinates" layout="a layout" visualizable="false"
              time-varying="false" />
    <group name="my group">
        <variable name="temperature" layout="some other layout" />
    </group>
    <variable name="simple var" layout="int" />
</data>

Relative and absolute names

Variables and layouts can be defined within groups. In the listing above, the relative name of the temperature variable is "temperature", while its absolute name is "my group/temperature". The same goes for the names of layouts.
When Damaris searches for a layout associated with a variable, it first looks inside the group where the variable is defined, then in the parent group, and so on. It is thus possible to refer to a layout either by its absolute name (if the layout is located in a different group) or by a relative name if the layout can be found in the same group hierarchy.
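
The lookup order described above (own group first, then each parent up to the root) can be sketched as a function that builds the candidate absolute names one level at a time. This is an illustration of the search order only, under the assumption that group paths use "/" as separator; layout_candidate is a hypothetical helper, not a Damaris function.

```c
#include <stdio.h>
#include <string.h>

/* Writes into out the absolute name obtained by looking for `layout` in the
 * group `level` steps above `group` (level 0 is the group itself).
 * `group` is "" for the root of the <data> section.
 * Returns 1 on success, 0 if there is no such ancestor or out is too small. */
int layout_candidate(const char *group, const char *layout, int level,
                     char *out, size_t out_size)
{
    size_t len = strlen(group);
    int n;

    /* Strip one trailing path component per level. */
    for (int i = 0; i < level; i++) {
        if (len == 0)
            return 0;                 /* already at the root */
        while (len > 0 && group[len - 1] != '/')
            len--;
        if (len > 0)
            len--;                    /* drop the '/' separator */
    }
    if (len == 0)
        n = snprintf(out, out_size, "%s", layout);
    else
        n = snprintf(out, out_size, "%.*s/%s", (int)len, group, layout);
    return n >= 0 && (size_t)n < out_size;
}
```

For a variable defined in a hypothetical group "a/b" that refers to a layout named "lay", levels 0, 1 and 2 yield "a/b/lay", "a/lay" and "lay", which is the order in which the names would be tried.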

Writing full variables

Now that the data is described, we can write it from the simulation. The two code samples below show how to write a variable. The full (absolute) name of the variable must be provided.

float* data;   // local data, sized as described by the variable's layout
...
int err = damaris_write("my group/temperature", data);

And for Fortran:

real, dimension(:) :: mydata
integer :: ierr
...
call damaris_write_f("my group/temperature", mydata, ierr)

Writing multiple domains

By default, Damaris expects one block of data per client and per variable. A client may, however, need to write multiple blocks of a single variable. To do so, the number of domains has to be specified in the domains section of the XML file, as shown here.

<domains count="4"/>

This number is a maximum: a client may write fewer domains, but not more. Besides, if a client writes N domains, it must identify them from 0 to N-1. The sample code below shows how to write multiple blocks. Each block is expected to have the size and shape defined in the layout associated with the variable.

float* data[4];   // one buffer per block, each sized as described by the layout
int err;
...
for (int i = 0; i < 4; i++) {
    err = damaris_write_block("my group/temperature", i, data[i]);
}

And for Fortran:

real, dimension(0:3,:) :: mydata
integer :: i, ierr
...
do i = 0, 3
    call damaris_write_block_f("my group/temperature", i, mydata(i,:), ierr)
end do

A Visual Example

A more complete example, a simple decomposition of a 2D matrix into a distributed variable, is given below. It is similar to what is present in the Damaris examples directory (provided Damaris was installed with HDF5 support):

cd <damaris install path>/examples/damaris/storage

# 4 Damaris clients and 1 Damaris server
mpirun -np 5 ./2dmesh 2dmesh.xml

An excerpt of the 2dmesh.xml file is as follows:

<data>
    <parameter name="HEIGHT" type="int" value="8" />
    <parameter name="WIDTH" type="int" value="4" />
    <parameter name="size" type="int" value="2" />
    <layout name="var_layout" type="float"
            dimensions="HEIGHT/size, WIDTH/size"
            global="HEIGHT, WIDTH" />
    <variable name="var_array" layout="var_layout"
              time-varying="true" />
</data>

The full dataset can be visualised like this:

Next we show the decomposition into MPI ranks and the use of damaris_set_position() to specify the offsets of the global decomposition:

// 2D domain
int dims = 2;
// array for offset values
int64_t pos[dims];
// values dependent on MPI process
// This is for Damaris client process 0
pos[0] = 0;
pos[1] = 0;
damaris_set_position("var_array", pos);
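
The offsets for the other clients follow from the decomposition. The sketch below computes them under the assumption that the clients form a size x size grid numbered in row-major order, with the first offset along the HEIGHT dimension and the second along WIDTH; the actual 2dmesh example may number its clients differently, and block_position is a hypothetical helper.

```c
#include <stdint.h>

/* Offsets of a client's block inside the global array.
 * Assumption: clients form a size x size grid in row-major order. */
void block_position(int rank, int size, int height, int width, int64_t pos[2])
{
    int local_h = height / size;   /* first layout dimension  */
    int local_w = width / size;    /* second layout dimension */

    pos[0] = (int64_t)(rank / size) * local_h;
    pos[1] = (int64_t)(rank % size) * local_w;
}
```

With HEIGHT = 8, WIDTH = 4 and size = 2, clients 0 to 3 get the offsets (0,0), (0,2), (4,0) and (4,2). The resulting pos array is what would be passed to damaris_set_position().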

The following shows the final result after the damaris_write() call, which places a copy of the data into shared memory, where it is available for the Damaris server rank to process.

Layouts may also specify a ghosts attribute that describes a constant padding on the boundary edges of a data array. The ghosts attribute may also use parameters to allow for a changing amount of padding. Each dimension of an array specifies two ghost values, one for each edge of the data along that axis.

An example of the layout XML with the ghosts attribute added, and a diagram that visualizes the layout for gc0 = gc1 = 2 and gr0 = gr1 = 1, are presented below:

<parameter name="gc0" type="int" value="2" />
<parameter name="gc1" type="int" value="2" />
<parameter name="gr0" type="int" value="1" />
<parameter name="gr1" type="int" value="1" />
...
<layout name="var_layout_with_ghosts" type="float"
        dimensions="HEIGHT/size+gc0+gc1, WIDTH/size+gr0+gr1"
        global="HEIGHT, WIDTH" ghosts="gc0:gc1, gr0:gr1" />

The variable that uses var_layout_with_ghosts above includes the extra padding of the ghost zones; however, the HDF5 and VisIt plugins automatically remove these extra areas so that only the unique data is saved (ghost zones usually hold data shared between ranks with a common boundary).
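
To illustrate what removing the ghost zones amounts to, the sketch below copies only the interior of a padded 2D block. It assumes a row-major array and uses hypothetical function names; it is not the code used by the Damaris plugins.

```c
#include <stddef.h>

/* Extent of one axis as stored in memory: interior plus both ghost layers. */
size_t padded_extent(size_t interior, size_t ghost_lo, size_t ghost_hi)
{
    return interior + ghost_lo + ghost_hi;
}

/* Copies the interior of a rows x cols row-major array into dst, skipping
 * lo0/hi0 ghost layers along the first axis and lo1/hi1 along the second.
 * rows and cols include the ghost layers. Returns the number of values copied. */
size_t strip_ghosts(const float *src, size_t rows, size_t cols,
                    size_t lo0, size_t hi0, size_t lo1, size_t hi1,
                    float *dst)
{
    size_t n = 0;
    for (size_t r = lo0; r + hi0 < rows; r++)
        for (size_t c = lo1; c + hi1 < cols; c++)
            dst[n++] = src[r * cols + c];
    return n;
}
```

With HEIGHT = 8, WIDTH = 4 and size = 2, the padded block is (4+2+2) x (2+1+1) = 8 x 4 values, of which the 4 x 2 = 8 interior values are kept.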

Meshes

Please see the documentation on in situ visualization for descriptions and examples of the mesh XML.

Direct access to the shared memory

The API presented above has the disadvantage of copying local data into the shared memory. A more efficient approach is to get direct access to the shared memory. The two code samples below present this capability, available through damaris_alloc(). If damaris_alloc fails, it produces a NULL pointer; otherwise, it produces a valid pointer to an allocated region of shared memory. After the data has been written to the returned buffer, a call to damaris_commit notifies the server of the presence of new data. After this call, the user is not expected to write to the data anymore. Finally, damaris_clear indicates that the client delegates full responsibility for the data to the servers: the client must neither read nor write it anymore. Equivalent functions (damaris_alloc_block and damaris_alloc_block_f) exist to allocate blocks when each client handles multiple domains.

Please note that the Damaris variable must have a valid size before the call to damaris_alloc, so that the requested pointer has the required space in the allocated shared memory region. This size is specified by the variable's XML layout element, which can require setting Damaris parameter values (using damaris_parameter_set()) to allow for dynamic sizing of the requested memory area.

float* data;
int err;
// on success, data points into the shared memory
err = damaris_alloc("my group/temperature", (void**)&data);
...
// done writing data for this iteration
err = damaris_commit("my group/temperature");
...
// done accessing data for this iteration
err = damaris_clear("my group/temperature");

And for Fortran:

use Damaris
use, intrinsic :: iso_c_binding   ! provides c_ptr and c_f_pointer
type(c_ptr) :: cptr
integer :: ierr
real, pointer :: mydata(:,:,:)
...
cptr = damaris_alloc_f("my group/temperature", ierr)
call c_f_pointer(cptr, mydata, [64, 16, 4])
...
call damaris_commit_f("my group/temperature", ierr)
...
call damaris_clear_f("my group/temperature", ierr)

Due to the C/Fortran interface, the use of damaris_alloc_f is more complex in Fortran than in C. The Damaris module should be used (this module is located where Damaris is installed). An additional call to c_f_pointer is mandatory to convert the returned C pointer into a valid Fortran array. This function takes a shape array to provide the extents along each dimension.
Note that, in order to implement efficient double-buffering, it can be desirable for a client to wait some iterations before actually committing a variable. Different versions of the commit function are available:

  • damaris_commit(const char* var):
    commits all the blocks of the current iteration;
  • damaris_commit_block(const char* var, int32_t block):
    commits one specific block of the current iteration;
  • damaris_commit_iteration(const char* var, int32_t iteration):
    commits all the blocks of a specific iteration;
  • damaris_commit_block_iteration(const char* var, int32_t block, int32_t iteration):
    commits one specific block of a specific iteration.

Equivalent functions are available in Fortran: damaris_commit_f, damaris_commit_block_f, damaris_commit_iteration_f and damaris_commit_block_iteration_f. All take an ierr integer in addition to the same parameters as the C functions.

Error handling

Since damaris_write and damaris_alloc (and the corresponding functions for writing or allocating blocks) both need to allocate a portion of shared memory, a call may fail if the shared memory is full. When this happens, all subsequent calls to damaris_write or damaris_alloc up to the next call to damaris_end_iteration return an error without attempting to allocate memory. A special signal is sent to all the servers when damaris_end_iteration is called, informing them that some data is missing for this iteration. By default, the dedicated cores do not update potentially connected visualization backends, and delete from memory the data that was written successfully for this iteration. Overriding this default behavior is covered in the documentation of the Damaris plug-ins.
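
The failure behaviour described above can be pictured with a toy model: once one allocation fails, every later request in the same iteration fails immediately, and the flag is reset at the end of the iteration. The types and functions below are hypothetical and exist only for illustration; they are not part of the Damaris API, and for simplicity the model releases all memory at the end of each iteration.

```c
#include <stddef.h>

/* Toy model of a client's view of the shared memory; not Damaris code. */
typedef struct {
    size_t capacity;   /* bytes of shared memory available         */
    size_t used;       /* bytes allocated during this iteration    */
    int    failed;     /* set once an allocation has failed        */
} shm_model;

/* Returns 0 on success, -1 on failure. After a first failure, every call
 * fails without attempting to allocate, until model_end_iteration(). */
int model_alloc(shm_model *m, size_t bytes)
{
    if (m->failed)
        return -1;
    if (bytes > m->capacity - m->used) {
        m->failed = 1;
        return -1;
    }
    m->used += bytes;
    return 0;
}

/* Ends the iteration. Returns 1 if all data was written, 0 if some is
 * missing (the case in which the servers drop the iteration by default). */
int model_end_iteration(shm_model *m)
{
    int complete = !m->failed;
    m->used = 0;       /* simplification: everything is released */
    m->failed = 0;
    return complete;
}
```

In this model, with a capacity of 100 bytes, a request for 60 bytes succeeds, a second request for 60 bytes fails, and a following request for 10 bytes also fails even though it would fit, until the iteration ends.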
