
I have the latest MPICH2 (3.0.4) compiled with the Intel Fortran compiler on a quad-core, dual-CPU (Intel Xeon) machine.

I have run into a problem with MPI_Bcast: I am unable to broadcast the array

gpsi(1:201,1:381,1:38,1:20,1:7)

which has 407410920 elements. When I try to broadcast this array, I get the following error:

Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(1525)......: MPI_Bcast(buf=0x7f506d811010, count=407410920,
MPI_DOUBLE_PRECISION, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1369).: 
MPIR_Bcast_intra(1160): 
MPIR_SMP_Bcast(1077)..: Failure during collective
rank 1 in job 31  Grace_52261   caused collective abort of all ranks
  exit status of rank 1: killed by signal 9 
MPI launch string is: mpiexec -n 2    %B/tvdbootstrap 
Testing MPI configuration with 'mpich2version'
Exit value was 127 (expected 0), status: execute_command_t::exited
Launching MPI job with command: mpiexec -n 2    %B/tvdbootstrap 
Server args: -callback 127.0.0.1:4142 -set_pw 65f76672:41f20a5c 

So the question is: is there a limit on the size of a variable in MPI_Bcast, or is my array simply larger than it can handle?

alfC
Madhurjya
  • Yes John, cutting down the size works. So what is the appropriate way to broadcast an oversized array? What surprises me is that my colleague, who is running the same code on a different installation, does not get the error. So is it implementation dependent? If so, which implementation should allow this? – Madhurjya May 28 '14 at 07:21

2 Answers


As John said, your array is too big because it can no longer be described by an int variable. When this is the case, you have a few options.

  1. Use multiple MPI calls to send your data. For this option, you would just divide your data up into chunks smaller than 2^31 and send them individually until you've received everything.

  2. Use MPI datatypes. With this option, you need to create a datatype to describe some portion of your data, then send multiples of that datatype. For example, if you are just sending an array of 100 integers, you can create a datatype of 10 integers using MPI_TYPE_VECTOR, then send 10 of that new datatype. Datatypes can be a bit confusing when you're first taking a look at them, but they are very powerful for sending either large data or non-contiguous data.
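To make option 1 concrete, here is a minimal sketch of the bookkeeping only: it plans the (offset, count) pairs for a chunked broadcast and does not call MPI. The helper name `plan_chunks` is hypothetical, and the sketch assumes 8 bytes per MPI_DOUBLE_PRECISION element and that the limit applies to the byte size of each message, which is the conservative reading. Each pair would then be passed to one MPI_Bcast call on the matching slice of the buffer.

```python
# Sketch only: plans the chunks for a multi-call broadcast, no MPI involved.
INT_MAX = 2**31 - 1          # largest value of a 32-bit signed int
BYTES_PER_ELEMENT = 8        # assumption: 8-byte MPI_DOUBLE_PRECISION
TOTAL_ELEMENTS = 407410920   # element count from the error message


def plan_chunks(total_count, max_chunk_count):
    """Return (offset, count) pairs that cover total_count elements,
    with every count at most max_chunk_count."""
    if total_count < 0 or max_chunk_count <= 0:
        raise ValueError("total_count must be >= 0 and max_chunk_count > 0")
    chunks = []
    offset = 0
    while offset < total_count:
        count = min(max_chunk_count, total_count - offset)
        chunks.append((offset, count))
        offset += count
    return chunks


# Keep every chunk below INT_MAX bytes, not just below INT_MAX elements.
max_elements_per_chunk = INT_MAX // BYTES_PER_ELEMENT

for offset, count in plan_chunks(TOTAL_ELEMENTS, max_elements_per_chunk):
    # One broadcast call per pair would go here.
    print(offset, count)
```

Under these assumptions the array in the question needs only two broadcasts. If the limit in a given installation applies to the element count rather than the byte size, a single call would already fit, which may be why the behaviour differs between installations.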

Wesley Bland

Yes, there is a limit. It's usually 2^31, so about two billion elements. You say your array has 407 million elements, so it seems like it should work. However, if the limit is two billion bytes, then you are exceeding it by about 50% (407 million 8-byte doubles come to roughly 3.26 billion bytes). Try cutting your array size in half and see if that works.
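The arithmetic behind that comparison can be checked with a short sketch. It assumes one MPI_DOUBLE_PRECISION element occupies 8 bytes (an assumption, though the usual case) and takes the dimensions from the question.

```python
# Dimensions of gpsi(1:201,1:381,1:38,1:20,1:7) from the question.
DIMS = (201, 381, 38, 20, 7)
BYTES_PER_ELEMENT = 8   # assumption: 8-byte MPI_DOUBLE_PRECISION
INT_MAX = 2**31 - 1     # largest value of a 32-bit signed int


def element_count(dims):
    """Multiply the extents together to get the number of elements."""
    total = 1
    for extent in dims:
        total *= extent
    return total


n_elements = element_count(DIMS)
n_bytes = n_elements * BYTES_PER_ELEMENT

# The element count fits in a signed 32-bit int ...
print(n_elements, n_elements <= INT_MAX)
# ... but the size in bytes does not.
print(n_bytes, n_bytes <= INT_MAX)
```

So the count passed to MPI_Bcast is fine as an int, while the byte size is about 1.5 times 2^31. That is consistent with a limit that applies to bytes somewhere inside the implementation.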

See: Maximum amount of data that can be sent using MPI::Send

John Zwinck
  • The limit is "size of a signed int", which, yes, on most platforms is 2^31 - 1. The "Use MPI datatypes" answer is the correct approach. – Rob Latham May 28 '14 at 19:24