Usage Guide
Using DPU-MPI is relatively straightforward.
At a minimum, you will need to include the following:
#include "dpulib.h" // Always required
#include "get_ip.h" // IP utilities
After initializing MPI as usual, you’ll need to initialize the DPU connection.
If you know the DPU IP address offset, you can simply specify it in the second
argument to offset_addr rather than using a command-line argument.
You’ll also need to select your interface name, like ib0_mlx5.
The port is currently fixed at 9999. The third argument to DPU_Init, 32, is the
maximum number of simultaneous pending operations to support.
char *ip = offset_addr("ib0_mlx5", atoi(argv[1]));
char *port = "9999";
DPUContext *ctx = DPU_Init(ip, port, 32);
if (!ctx)
{
    // Some error occurred
    return 1;
}
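The snippet above passes atoi(argv[1]) straight to offset_addr, and atoi cannot
report malformed input. Below is a minimal sketch of a stricter parse, assuming
the offset is a plain decimal integer. parse_offset is a hypothetical helper
written for this guide, not part of DPU-MPI.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not part of DPU-MPI): parse a decimal IP offset
 * from a command-line string. Returns 0 on success and stores the value
 * in *out; returns -1 if the string has no digits, has trailing
 * characters, or does not fit in an int. */
int parse_offset(const char *s, int *out)
{
    assert(s != NULL && out != NULL);

    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);

    if (end == s || *end != '\0')
        return -1;              /* no digits, or junk after the number */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;              /* out of range for int */

    *out = (int)v;
    return 0;
}
```

To use it, check that argc is at least 2, call parse_offset(argv[1], &offset),
and pass offset to offset_addr only when the call returns 0.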
Then, you can call the DPU_MPI_Ialltoall function almost like MPI_Ialltoall.
Memory registration and transfers occur in this function, so there is no need
to register any MRs before calling this.
int index = DPU_MPI_Ialltoall(
    ctx, sndbuf, 1, MPI_UINT32_T,
    rcvbuf, 1, MPI_UINT32_T, worldsize);
if (index < 0)
{
    fprintf(stderr, "Bad response.\n");
    return 1;
}
int cookie1 = get_cookie(ctx, index);
ctx is your DPU context created earlier, sndbuf/rcvbuf are the buffers,
MPI_UINT32_T is your datatype, and 1/1 are the send/receive counts, in that
order. worldsize should be set to the size of MPI_COMM_WORLD, as obtained from
MPI_Comm_size.
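Since DPU_MPI_Ialltoall is called almost like MPI_Ialltoall, a reasonable
assumption is that it follows the same buffer layout: each of sndbuf and rcvbuf
holds count elements per rank, so count * worldsize elements in total. The
sketch below allocates and fills such buffers for MPI_UINT32_T data. Both
helper names are hypothetical and not part of DPU-MPI.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zeroed buffer holding `count` uint32_t
 * elements for each of `worldsize` ranks. Returns NULL on bad arguments
 * or allocation failure. */
uint32_t *alloc_alltoall_buf(int count, int worldsize)
{
    if (count <= 0 || worldsize <= 0)
        return NULL;
    size_t n = (size_t)count * (size_t)worldsize;
    return calloc(n, sizeof(uint32_t));
}

/* Fill the send buffer so that the block destined for rank `dst` carries
 * the value rank * 1000 + dst, which makes received data easy to check. */
void fill_sendbuf(uint32_t *sndbuf, int count, int worldsize, int rank)
{
    assert(sndbuf != NULL);
    for (int dst = 0; dst < worldsize; dst++)
        for (int i = 0; i < count; i++)
            sndbuf[(size_t)dst * count + i] = (uint32_t)(rank * 1000 + dst);
}
```

With a count of 1, as in the call above, each buffer holds exactly worldsize
elements. Free both buffers only after the operation has completed.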
The return value is the index inside the queue that can be used to check your
job status. Then you can get the cookie by calling get_cookie(). This cookie
can be used to check the job_status just like an MPI_Request object.
In this library, there are a few ways to check the job status: DPU_MPI_Wait,
analogous to MPI_Wait; DPU_MPI_Test, analogous to MPI_Test; and the dedicated
functions DPU_MPI_Poll and DPU_MPI_Longpoll, which poll the InfiniBand
completion queue (CQ) using ibv_poll_cq and ibv_req_notify_cq, respectively.
For most use cases, simply use DPU_MPI_Wait with the cookie. If you would like
to poll from time to time, use DPU_MPI_Test. The polling routines are called
internally.
ret = DPU_MPI_Wait(ctx, cookie1);
if (ret)
{
    fprintf(stderr, "DPU Poll Failed\n");
    return 1;
}
Just like MPI_Wait, this routine blocks until the cookie is completed. That is
essentially all that is needed to convert MPI_Ialltoall to use the DPU.
Finally, make sure you exit gracefully when done, which terminates the server
job.
DPU_Exit(ctx);
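For the occasional-polling pattern mentioned earlier, the usual approach is to
alternate slices of application work with a non-blocking completion test. This
guide does not show the signature of DPU_MPI_Test, so the sketch below takes
the test as a callback instead of calling it directly. The callback types, the
loop function, and the stand-in callbacks are all hypothetical. A real test
callback would wrap DPU_MPI_Test with your context and cookie.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types, placeholders so the sketch compiles on its
 * own. A test_fn sets *done to nonzero once the operation has finished and
 * returns nonzero on error. */
typedef int (*test_fn)(void *state, int *done);
typedef void (*work_fn)(void *state);

/* Alternate between one completion test and one slice of application work
 * until the operation finishes. Returns the number of work slices run, or
 * -1 if the test reports an error. */
int overlap_until_done(test_fn test, void *test_state,
                       work_fn work, void *work_state)
{
    assert(test != NULL && work != NULL);
    int slices = 0;
    for (;;) {
        int done = 0;
        if (test(test_state, &done) != 0)
            return -1;
        if (done)
            return slices;
        work(work_state);
        slices++;
    }
}

/* Stand-in test for demonstration: reports "not done" while *remaining is
 * positive, decrementing it on each call, then reports completion. */
int countdown_test(void *state, int *done)
{
    int *remaining = state;
    if (*remaining > 0) {
        (*remaining)--;
        *done = 0;
    } else {
        *done = 1;
    }
    return 0;
}

/* Stand-in work slice: counts how many times it was invoked. */
void count_work(void *state)
{
    (*(int *)state)++;
}
```

The design choice here is to test before working, so an operation that has
already completed costs no extra work slice. If there is nothing useful to
overlap, the blocking DPU_MPI_Wait shown above is simpler.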
There are some server-side options you can adjust.
LAZY_UNPINNING is on by default and only deregisters MRs when absolutely
necessary, at the cost of increased memory usage.
MAX_QUEUE specifies the server’s maximum number of requests. This should match
or exceed the number requested when calling DPU_Init from your client.
// Lazy MR unpinning
#define LAZY_UNPINNING 1
// Maximum number of simultaneous jobs
#define MAX_QUEUE 32
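The rule that MAX_QUEUE must match or exceed the client's request count can be
checked mechanically. The sketch below assumes both values are visible in one
translation unit, for example through a shared header. This guide does not say
whether the client and server builds share one. CLIENT_MAX_PENDING and
pending_fits_queue are hypothetical names.

```c
#include <assert.h>

/* Server-side setting, as defined above. */
#define MAX_QUEUE 32

/* Hypothetical client-side constant: the value passed as the third
 * argument to DPU_Init. */
#define CLIENT_MAX_PENDING 32

/* Compile-time check (C11): the build fails if the client could request
 * more pending operations than the server queue holds. */
static_assert(CLIENT_MAX_PENDING <= MAX_QUEUE,
              "CLIENT_MAX_PENDING must not exceed the server's MAX_QUEUE");

/* Runtime variant for a request count that is only known at startup.
 * Returns 1 if the count is positive and fits in the queue, else 0. */
int pending_fits_queue(int requested)
{
    return requested > 0 && requested <= MAX_QUEUE;
}
```

If the two programs are built separately, only the runtime variant applies, and
the client would need to learn the server's limit by some other means.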