This section of the documentation presents the different modes in which Damaris can be configured: dedicated cores, dedicated nodes, and synchronous mode.
Enabling Different Modes
The different modes can be enabled simply by setting the number of dedicated cores or dedicated nodes in the XML file, as exemplified in the listing below.
<architecture>
    <dedicated cores="2" nodes="0" />
    ...
</architecture>
The synchronous mode, or "time-partitioning", is applied when cores="0" and nodes="0". In this configuration, all actions are executed by the clients themselves. In other words, a client is its own server. There is thus no distinction between the "core" and the "group" scope, and "bcast" events are not executed (because there is no application logic to asynchronously receive the event in other clients; they can be replaced by simple MPI_Bcast logic from within a normal, "core" event). Note that in this mode, the shared-memory buffer is not used and thus not created.
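For reference, a minimal sketch of a configuration enabling the synchronous mode is shown below (attribute names taken from the earlier listing; the rest of the file is elided):

```xml
<architecture>
    <dedicated cores="0" nodes="0" />
    ...
</architecture>
```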
When the cores field has a value N > 0, dedicated cores are used. This number of dedicated cores must divide the number of cores in a node. When N = 1, all clients in a node are connected to a server running on a dedicated core of the same node. When N > 1, the client cores in a node are gathered into N groups, and each group is connected to a server on a dedicated core. Note that whatever the number of servers, only one shared-memory buffer is created, with a size specified in the XML file.
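To make the grouping concrete, the sketch below (not Damaris code) illustrates how clients in a node could map to dedicated-core servers when cores="N". It assumes the remaining client cores are split into N equally sized, contiguous groups, which is one plausible reading of the text above; the actual assignment policy is internal to Damaris.

```python
# Illustrative sketch: mapping client cores in one node to N dedicated-core
# servers. Assumes contiguous, equally sized groups (an assumption, not the
# documented Damaris policy).
def map_clients_to_servers(cores_per_node, dedicated):
    # The number of dedicated cores must divide the number of cores in a node.
    assert cores_per_node % dedicated == 0, "dedicated must divide cores_per_node"
    clients = cores_per_node - dedicated
    group_size = clients // dedicated
    # Server s handles client ranks [s*group_size, (s+1)*group_size).
    return {s: list(range(s * group_size, (s + 1) * group_size))
            for s in range(dedicated)}

# Example: 16 cores per node, 2 dedicated -> 14 clients in 2 groups of 7.
print(map_clients_to_servers(16, 2))
```

With cores="2" on a 16-core node, 14 client cores remain and each of the two servers handles a group of 7 clients.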
Disabling Shared Memory
By default, the dedicated-cores mode leverages a shared-memory buffer of the size specified in the configuration file. Yet it is possible to replace the communication through shared memory with communication through MPI. To do so, add enabled="false" as an attribute in the buffer's definition. In general, the shared-memory communication layer reduces the number of copies of data that are performed. A dedicated core can indeed work from variables that have been placed in shared memory by the clients. It is thus preferable to use shared memory if the simulation already requires a large amount of memory, and in particular if the simulation has been optimized to use damaris_alloc() instead of damaris_write(). It is also preferable to use shared memory to properly bound the amount of data transferred from clients to servers, in particular if servers are executing plugins that take a long time to complete. With shared memory disabled, clients will keep issuing send requests even when the server is too busy to answer them, which can lead to buffer overflows within MPI.
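As a sketch, disabling the shared-memory buffer might look as follows. Only the enabled="false" attribute is confirmed by the text above; the buffer element's name and size attributes are assumptions modeled on typical Damaris configuration files:

```xml
<architecture>
    ...
    <buffer name="buffer" size="67108864" enabled="false" />
</architecture>
```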
When the cores field is set to 0 or unspecified, and the nodes field is set to N > 0, then N nodes will be dedicated to running servers. N must divide the total number of nodes of the application. On each of the dedicated nodes, Damaris will spawn as many server instances as there are cores. Each dedicated node holds a single buffer of the specified size for all its servers. The clients will be gathered into equally sized groups, and each group will be connected to a server.
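As a sketch, dedicating two full nodes to servers would be configured as follows (same <architecture> element as in the earlier listing; the rest of the file is elided):

```xml
<architecture>
    <dedicated cores="0" nodes="2" />
    ...
</architecture>
```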