I am trying to implement an MPI_Alltoallv with an MPI derived datatype (DDT) built using MPI_Type_create_struct. I could not find any examples that solve this particular problem. Most examples perform communication (Send/Recv) using a single struct member, whereas I am targeting an array of structs. The following is a simpler test program that attempts an MPI_Sendrecv operation on an array of structs described by a DDT:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <stddef.h>
#include <string.h>

typedef struct sample{
  char str[12];
  int  count;
}my_struct;

int main(int argc, char **argv)
{
    int rank, count;
    my_struct *sbuf = (my_struct *) calloc (5, sizeof(my_struct));
    my_struct *rbuf = (my_struct *) calloc (5, sizeof(my_struct));
    int blens[2];
    MPI_Aint displs[2];
    MPI_Aint baseaddr, addr1, addr2;
    MPI_Datatype types[2];
    MPI_Datatype contigs[5];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    strcpy(sbuf[0].str,"ACTGCCAATTCG");
    sbuf[0].count = 10;
    strcpy(sbuf[1].str,"ACTGCCCATACG");
    sbuf[1].count = 5;
    strcpy(sbuf[2].str,"ACTGCCAATTTT");
    sbuf[2].count = 6;
    strcpy(sbuf[3].str,"CCTCCCAATTCG");
    sbuf[3].count = 12;
    strcpy(sbuf[4].str,"ACTATGAATTCG");
    sbuf[4].count = 8;

    blens[0] = 12; blens[1] = 1;
    types[0]  = MPI_CHAR; types[1]  = MPI_INT;
    for (int i=0; i<5; i++)
    {
       MPI_Get_address ( &sbuf[i], &baseaddr);
       MPI_Get_address ( &sbuf[i].str, &addr1);
       MPI_Get_address ( &sbuf[i].count, &addr2);
       displs[0] = addr1 - baseaddr;
       displs[1] = addr2 - baseaddr;

       MPI_Type_create_struct(2, blens, displs, types, &contigs[i]);
       MPI_Type_commit(&contigs[i]);
    }

    /* send to ourself */
    MPI_Sendrecv(sbuf, 5, contigs, 0, 0,
                 rbuf, 5, contigs, 0, 0,
                 MPI_COMM_SELF, &status);

    for (int i=0; i<5; i++)
        MPI_Type_free(&contigs[i]);

    MPI_Finalize();

    return 0;
}

I get the following warning at compile time:

    coll.c(53): warning #810: conversion from "MPI_Datatype={int} *" to "MPI_Datatype={int}" may lose significant bits
       MPI_Sendrecv(sbuf, 5, contigs, 0, 0,
                             ^

    coll.c(54): warning #810: conversion from "MPI_Datatype={int} *" to "MPI_Datatype={int}" may lose significant bits
               rbuf, 5, contigs, 0, 0,

And observe the following error across all processes:

    Rank 0 [Thu Jun 16 16:19:24 2016] [c0-0c2s9n1] Fatal error in MPI_Sendrecv: Invalid datatype, error stack:
    MPI_Sendrecv(232): MPI_Sendrecv(sbuf=0x9ac440, scount=5, INVALID DATATYPE, dest=0, stag=0, rbuf=0x9ac4a0, rcount=5, INVALID DATATYPE, src=0, rtag=0, MPI_COMM_SELF, status=0x7fffffff6780) failed

I am not sure what I am doing wrong. Do I need to additionally use MPI_Type_create_resized to set the extent? If so, an example based on the above scenario would really help.

Also, my main goal is to perform an MPI_Alltoallv on a similar array of structs (several thousand elements). Hopefully, once I get the Sendrecv to work, I can move on to MPI_Alltoallv.

Any help would be highly appreciated.

PGOnTheGo

1 Answer

The sendtype and recvtype parameters each expect a single value of type MPI_Datatype. What you're passing in is an array of these, i.e. an MPI_Datatype *.

You can only use one of these array elements at a time to pass to this function.

dbush
  • I thought so too. But if I pass only one instance of "contigs", then how will the displacements for the remaining 4 entries be received at the receiver side? Don't I need to send all instances of the "contigs" datatype? – PGOnTheGo Jun 16 '16 at 23:54
  • It looks like all 5 entries need to have a common send/receive type. If not, you'll need to send them one at a time. – dbush Jun 16 '16 at 23:57
  • All your entries are the same type: they are all of type my_struct. You only need to define a single MPI_Datatype, which will be the same for all the structs. If you specify contigs[0] for both types in your Sendrecv call, the code compiles and runs OK (although I don't know if it's correct). The crucial point is that the displacements in the structure definition are all relative, and since C guarantees that all instances of a structure have the same layout, you only need to define a single new datatype. – David Henty Jun 17 '16 at 09:03
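The point about relative displacements can be checked in plain C, without MPI. The sketch below uses hypothetical helper names (count_offset_in, layout_is_uniform) and compares the member offsets measured inside each array element against what offsetof reports for the type. It illustrates why one datatype would suffice for the whole array; it is not part of the original code.

```c
#include <stddef.h>

typedef struct sample {
    char str[12];
    int  count;
} my_struct;

/* Byte offset of `count` inside element i, measured from the start
   of that element (the same subtraction the question does per element). */
ptrdiff_t count_offset_in(const my_struct *arr, int i)
{
    return (const char *) &arr[i].count - (const char *) &arr[i];
}

/* Returns 1 if every element of arr has the member offsets that
   offsetof() reports for the type, 0 otherwise. */
int layout_is_uniform(const my_struct *arr, int n)
{
    for (int i = 0; i < n; i++) {
        ptrdiff_t str_off = (const char *) arr[i].str - (const char *) &arr[i];

        if (str_off != (ptrdiff_t) offsetof(my_struct, str))
            return 0;
        if (count_offset_in(arr, i) != (ptrdiff_t) offsetof(my_struct, count))
            return 0;
    }
    return 1;
}
```

Because the offsets are identical for every element, the loop in the question builds five copies of the same description, and consecutive elements sit exactly sizeof(my_struct) bytes apart.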