2

I want to generate a string in each process and then gather them all. The string in each process is built by appending ints and chars.

I still can't gather everything correctly. I can print all the partial strings one by one, but if I try to print rcv_string, I get only one partial string, or sometimes a segmentation fault.

I've tried putting zeros at the end of the strings with memset, and reserving memory for the strings both dynamically and statically, but I can't find a way that works.

It would be great if someone could show how to initialize the strings and do the gather properly.

int main(int argc, char *argv[]) {

    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *string;        // ????????????
    char *rcv_string;    // ????????????

    if (rank == 0)  {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 1) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 2) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 3) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 4) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }
    else if (rank == 5) {
        sprintf(string+strlen(string), "%dr%dg%db%dl\n",255,255,255,0);
    }

    MPI_Gather(string,???,MPI_CHAR,rcv_string,???,MPI_CHAR,0,MPI_COMM_WORLD);

    if (rank == 0) {
        printf("%s",rcv_string);
    }

    MPI_Finalize();
    return 0;
}
Sergio
  • In order to avoid an XY problem: it is generally much more straightforward to gather the actual data (e.g. {255, 255, 255, 0}) rather than a C string. Is there anything in your application that fundamentally requires you to communicate the C strings rather than the underlying data? – Zulan Feb 15 '17 at 11:15
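
A minimal sketch of what that comment suggests, assuming each rank contributes exactly four ints. The ranks would send the raw values with something like `MPI_Gather(vals, 4, MPI_INT, all_vals, 4, MPI_INT, 0, MPI_COMM_WORLD)`, and only the root would build the text afterwards. The helper name `format_gathered` and the buffer names are hypothetical, and the function itself needs no MPI:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format `nranks` groups of four ints into `out`,
 * one "<r>r<g>g<b>b<l>l\n" line per rank, in rank order.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if `out` is too small. */
int format_gathered(const int *values, int nranks, char *out, size_t out_size) {
    size_t used = 0;
    if (out_size == 0) return -1;
    out[0] = '\0';
    for (int r = 0; r < nranks; r++) {
        const int *v = values + 4 * r;   /* the four ints sent by rank r */
        int n = snprintf(out + used, out_size - used, "%dr%dg%db%dl\n",
                         v[0], v[1], v[2], v[3]);
        /* snprintf reports the length it wanted; reject truncation */
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Because the ints have a fixed size, every rank sends the same count and no null terminators end up in the middle of the receive buffer.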

2 Answers

1

I managed to reproduce the incorrect behavior where only one partial string is printed.

It is related to your usage of sprintf.

How does C handle char arrays?

When working with arrays in C, you must first allocate memory for them, either dynamically or statically. Suppose you allocate enough memory for 10 chars.

char my_string[10];

Without initialization, it contains indeterminate (garbage) characters.

Let's pretend my_string contains "qwertyuiop".

Suppose you want to fill my_string with the string foo. You use sprintf.

sprintf(my_string, "foo");

How does C fill 10 slots with 3 characters?

It fills the first 3 slots with the 3 characters. Then it fills the 4th slot with an "end of string" character, the null terminator, which is written '\0' in source code.

So, after your command, my_string contains "foo\0tyuiop". If you print my_string with %s, printing stops at the \0, so the garbage characters after it never appear.
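
The effect can be checked without MPI. The sketch below is only an illustration: the function name `visible_length_after_sprintf` is made up, and the "qwertyuiop" contents stand in for whatever garbage an uninitialized buffer might hold.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: put "qwertyuiop" (no terminator) into a 10-char
 * buffer, overwrite the start with "foo", and return the length that
 * string functions such as printf("%s") would see. */
size_t visible_length_after_sprintf(char buf[10]) {
    memcpy(buf, "qwertyuiop", 10);  /* stand-in for uninitialized garbage */
    sprintf(buf, "foo");            /* writes 'f', 'o', 'o', '\0' */
    return strlen(buf);             /* counts up to the first '\0' only */
}
```

After the call the buffer still holds all 10 bytes, `foo\0tyuiop`, but anything that treats it as a C string sees only the first three characters.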

How does this relate to MPI_Gather?

MPI_Gather collects arrays from different processes, and puts them all into one array on one process.

If you had "foo\0tyuiop" on process 0 and "bar\0ghjkl;" on process 1, they get combined into "foo\0tyuiopbar\0ghjkl;".

As you can see, the array from process 1 appears after the "end of string" character from process 0. Printing the combined array with %s stops at that first \0, so none of the characters from process 1 are ever printed.

A patchy solution

Rather than attempting to print all of rcv_string at once, acknowledge that there are "end of string" characters scattered throughout. Then, print out strings with different "start of string" positions, according to the process it came from.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {

  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  int part_str_len = 18;

  char *my_string;
  char *rcv_string;

  if ((my_string = malloc(part_str_len*sizeof(char))) == NULL){
    MPI_Abort(MPI_COMM_WORLD,1);
  }
  if ((rcv_string = malloc(part_str_len*size*sizeof(char))) == NULL){
    MPI_Abort(MPI_COMM_WORLD,1);
  }

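  /* snprintf never writes more than part_str_len chars, terminator included */
  snprintf(my_string, part_str_len, "%dr%dg%db%dl\n",255,255,255,0);

  /* every rank sends a full part_str_len block, so block i starts at part_str_len*i */
  MPI_Gather(my_string,part_str_len,MPI_CHAR,rcv_string,part_str_len,MPI_CHAR,0,MPI_COMM_WORLD);

  if (rank == 0) {
    /* prints only the part from rank 0: printing stops at the first '\0' */
    printf("%s",rcv_string);
  }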

  char *cat_string;
  if ((cat_string = malloc(part_str_len*size*sizeof(char))) == NULL){
    MPI_Abort(MPI_COMM_WORLD,1);
  }

  if (rank == 0){
    int i;
    sprintf(cat_string, "%s", rcv_string);
    for (i = 1; i < size; i++){
      strcat(cat_string, &rcv_string[part_str_len*i]);
    }
  }

  if (rank == 0) {
    printf("%s",cat_string);
  }

  free(my_string);
  free(rcv_string);
  free(cat_string);

  MPI_Finalize();
  return 0;
}
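
The concatenation step on the root can also be written with explicit bounds checks. This is a sketch under the same assumption as the code above, namely that the receive buffer consists of equally sized blocks, each holding a terminated string followed by padding. The helper name `join_slots` is hypothetical and the function needs no MPI:

```c
#include <string.h>

/* Hypothetical helper: `gathered` holds `nslots` blocks of `slot_len`
 * chars each, laid out back to back as MPI_Gather leaves them on the
 * root. Each block should contain a '\0'-terminated string.
 * Concatenates the strings into `out`. Returns 0 on success, or -1 if
 * `out` is too small or a block has no terminator. */
int join_slots(const char *gathered, int nslots, size_t slot_len,
               char *out, size_t out_size) {
    size_t used = 0;
    if (out_size == 0) return -1;
    out[0] = '\0';
    for (int i = 0; i < nslots; i++) {
        const char *slot = gathered + (size_t)i * slot_len;
        const char *end = memchr(slot, '\0', slot_len);
        if (end == NULL) return -1;              /* unterminated block */
        size_t len = (size_t)(end - slot);
        if (len >= out_size - used) return -1;   /* no room for text + '\0' */
        memcpy(out + used, slot, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

On the root this could replace the sprintf/strcat loop, for example `join_slots(rcv_string, size, part_str_len, cat_string, part_str_len*size)`.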
  • The explanation is good, but the suggested solution sets a bad example by using hardcoded sizes and `sprintf` / `strcat` as opposed to `snprintf` / `strncat`. This code will probably work as is, but fail very badly once minor things are changed. Also: [don't cast the result of `malloc`](http://stackoverflow.com/a/605858/620382), use `MPI_Abort` instead of `abort`. – Zulan Feb 15 '17 at 08:41
  • Your answer is great, I understood the problem, so I'm accepting it. But now I've found another problem... The ints appended to the string are not always 255 (they could be 1, 20, ...), so the length of the string can vary. This is a problem when reserving memory, because I can't do the strcat precisely without knowing the exact length. I think the solution is to use snprintf/strncat as Zulan says. I didn't realise this when I posted the code; that's my mistake. I don't have a lot of time now; when I can, I'll open another post. Thanks anyway! – Sergio Feb 15 '17 at 14:25
  • The idea is to reserve memory, for example 20 chars for each process, but I could write only 5 with the process 0, 12 with the process 1, 19 with the process 3, ... Then do a gather (maybe it is better to use gatherv) and collect and append everything in the root process. Thanks! – Sergio Feb 15 '17 at 14:30
  • Oh yeah. The solution is rather hacked together. It's a patch that will barely get things working. I'll edit the solution to address the casts and `MPI_Abort`. – Jonathan Chiang Feb 15 '17 at 16:51
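
For the variable-length case raised in the comments, one possible approach (not taken from either answer) is to gather each rank's string length first, for example with `MPI_Gather` of a single `MPI_INT`, and then use those lengths as the `recvcounts` of `MPI_Gatherv`. The displacements are a running sum of the lengths. The helper name `pack_displs` is hypothetical and needs no MPI:

```c
#include <stddef.h>

/* Hypothetical helper: lens[i] is the number of chars rank i will send
 * (its strlen, terminator not included). Fills displs[i] with the offset
 * at which rank i's chars should land so that all parts are packed back
 * to back. Returns the total number of chars. */
int pack_displs(const int *lens, int nranks, int *displs) {
    int total = 0;
    for (int i = 0; i < nranks; i++) {
        displs[i] = total;   /* part i starts where part i-1 ended */
        total += lens[i];
    }
    return total;
}
```

On the root, one would then allocate `total + 1` chars, call `MPI_Gatherv(my_string, my_len, MPI_CHAR, rcv_string, lens, displs, MPI_CHAR, 0, MPI_COMM_WORLD)`, and set `rcv_string[total] = '\0'` so that the result is a single terminated string. Here `my_len`, `lens` and `displs` are assumed names for the local length and the root's arrays.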
-1

Try the following:

#define MAX_STR_LEN 100

int main(int argc, char *argv[]) {

    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char string[MAX_STR_LEN] = "some string";

    char *rcv_string = NULL;
    if (rank == 0) {
        // Only the master needs to allocate the memory
        // for the result string which needs to be large
        // enough to contain the input strings from `size`
        // peers.
        rcv_string = malloc(MAX_STR_LEN * size);
    }

    ...same code...

    MPI_Gather(string, strlen(string), MPI_CHAR,
               rcv_string, MAX_STR_LEN, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("%s",rcv_string);
        free(rcv_string);
    }

    MPI_Finalize();
    return 0;
}

Running this code with mpirun -n 5 ./a.out produces the following:

some string255r255g255b0l
some string255r255g255b0l
some string255r255g255b0l
some string255r255g255b0l
some string255r255g255b0l

Make sure to define MAX_STR_LEN so that it is big enough for your requirements. If the value grows too big, you may want to consider heap allocation (i.e. malloc).

simpel01
  • This does not work. Using `MAX_STR_LEN` as `recvcount` for `MPI_Gather` will make all strings start `MAX_STR_LEN` apart, with uninitialized values in between. Also, `rcv_string` is not correctly null-terminated after the gather. – Zulan Feb 15 '17 at 08:33