I have three virtual machines with the IP addresses 192.168.0.170, 192.168.0.171, and 192.168.0.172. The gateway on the host machine has the IP address 192.168.0.7. Because of a problem described here, these VMs can only reach each other through the gateway 192.168.0.7.

I created a hostfile as:

192.168.0.170 slots=2
192.168.0.171 slots=5
192.168.0.172 slots=5

When I run mpirun -np 4 --hostfile hostfile echo Hello, I get this error:

Open MPI detected an inbound MPI TCP connection request from a peer
that appears to be part of this MPI job (i.e., it identified itself as
part of this Open MPI job), but it is from an IP address that is
unexpected.  This is highly unusual.

The inbound connection has been dropped, and the peer should simply
try again with a different IP interface (i.e., the job should
hopefully be able to continue).

  Local host:          192.168.0.171
  Local PID:           3143
  Peer hostname:       192.168.0.171 ([[46370,1],0])
  Source IP of socket: 192.168.0.7
  Known IPs of peer: 

My guess is that this error happens because when 192.168.0.170 (the master node, for example) wants to communicate with 192.168.0.171, it sends its IP packets through 192.168.0.7 as a middleman, so the connection appears to come from 192.168.0.7. Since 192.168.0.7 is not listed in the hostfile, Open MPI sees an inbound connection from an address it does not expect.
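To make that guess concrete, here is a minimal Python sketch of the kind of check I imagine is failing. It is not Open MPI's actual code: the function name `is_expected_peer` is hypothetical, and it assumes the gateway rewrites the source address of forwarded packets, which the "Source IP of socket" line in the error seems to suggest.

```python
import ipaddress

def is_expected_peer(source_ip, known_peer_ips):
    """Hypothetical check: is the socket's source address one the peer is known to own?"""
    source = ipaddress.ip_address(source_ip)
    return any(source == ipaddress.ip_address(ip) for ip in known_peer_ips)

# Addresses from my setup; the gateway is deliberately not in the hostfile.
hostfile_ips = ["192.168.0.170", "192.168.0.171", "192.168.0.172"]
gateway_ip = "192.168.0.7"

# A connection arriving directly from a VM would match a known address.
direct = is_expected_peer("192.168.0.170", hostfile_ips)

# A connection that appears to come from the gateway would not match.
via_gateway = is_expected_peer(gateway_ip, hostfile_ips)

print(direct, via_gateway)
```

If this picture is roughly right, every connection between the VMs would look like it comes from 192.168.0.7 and be dropped as unexpected.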

Is there any way to avoid this problem? A crude workaround would be to add 192.168.0.7 to the hostfile, but that doesn't make sense to me. Any help is much appreciated. Thanks!