Loads of similar questions and answers are floating around out there. Many seem like they might help, but none do.
First, my situation / caveats: I am a researcher using publicly available data and I do not care one iota about security. I'm only concerned with getting my application up and running to do the math I need.
Second, what I'm doing: I will be running an asynchronous MPI algorithm on an HPC cluster. Right now I'm simply trying to automatically provision that cluster.
Third, Platform: I am using Microsoft Azure virtual machines (not Batch) running CentOS 7.6 and IntelMPI. I am using a custom image, which is unchanged from the stock Azure image except for the pre-installation of sshpass.
Fourth, the goal: In order for the VMs in the cluster to communicate, they require password-less ssh. I can set everything up manually without any trouble. But as n, the size of the cluster, grows, the number of connections that must be affirmed with a password grows as n squared, i.e. O(n^2). So a cluster of 10 requires roughly 100 password inputs, a cluster of 20 roughly 400, etc. Therefore this must be done in a script.
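To make the count concrete, here is a small sketch of the arithmetic. It assumes every VM logs in once to every other VM and skips itself, so the exact figure is n(n-1), slightly under the n^2 estimate. The function name `pair_count` is only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of first-time logins in a cluster of n VMs,
# assuming each VM connects once to every other VM (and not to itself).
pair_count() {
    local n=$1
    echo $(( n * (n - 1) ))
}

# Print the count for the two cluster sizes mentioned above.
for n in 10 20; do
    echo "n=${n}: $(pair_count "$n") first-time connections"
done
```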
Fifth, the problem: Initial setup of passwordless ssh from a script is working fine, BUT the first connection between every pair of machines still requires a password. Subsequent connections do not. It is THIS initial connection that I am trying to make from my script without a password, and failing. Some tutorials / answers online don't even recognize that an initial connection must be made using the password (e.g. https://netbeez.net/blog/connect-to-ssh-without-password/). Perhaps, because the number of connections grows linearly in their application, they simply ignore it as a minor inconvenience.
Here is a simple script that illustrates the problem.
#!/usr/bin/bash
ssh -o StrictHostKeyChecking=no "$connectHostName" # Of course this asks for a password.
# The following options don't work either:
#echo -e "${pswd}\n" | ssh -o StrictHostKeyChecking=no "$connectHostName"
#sshpass -p "${pswd}" ssh -o StrictHostKeyChecking=no "$connectHostName"
Before running this script, all key pairs were generated and copied to the other VMs by a previous script. That script works fine without requiring any password intervention. Here is an excerpt:
#..........
#Adds shared username to the group wheel
echo -e "${pswd}\n" | sudo -S usermod -aG wheel "${user}"
#Generates an ssh key pair for the machine running the script
echo | ssh-keygen -t rsa -P ''
#Looping through list of IPs
i=1
myHostName=$(uname -n)
while IFS= read -r IP; do
    thisHostName=$(sed -n "${i}p" hosts)
    i=$((i + 1))
    if [ "$myHostName" = "$thisHostName" ]; then
        continue
    fi
    sshpass -p "${pswd}" ssh-copy-id -o StrictHostKeyChecking=no "${user}@${IP}"
done < Init_IPs
i=1 #resetting in case I use it later
# Loop Ends Here
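# Start the agent in this shell; piping into eval would run it in a subshell and lose SSH_AUTH_SOCK
eval "$(ssh-agent -s)"
# The key was generated with an empty passphrase (-P ''), so no password needs to be piped in
ssh-add ~/.ssh/id_rsa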
#......
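For what it's worth, this is how I would test a given pair without being prompted, as a sketch only. `BatchMode=yes` makes ssh fail instead of asking for a password, so the exit status shows whether key authentication worked. The function name `check_passwordless`, the `SSH_CMD` override, and the 5-second timeout are my own placeholders and are not part of the scripts above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if "$1" (user@host) accepts a key-based
# login without any prompt, non-zero otherwise.
# SSH_CMD is an assumed override so the function can be exercised with a stub.
check_passwordless() {
    local target=$1
    "${SSH_CMD:-ssh}" -o BatchMode=yes \
        -o StrictHostKeyChecking=no \
        -o ConnectTimeout=5 \
        "$target" true
}

# Example use (commented out so sourcing this file makes no connections):
# if check_passwordless "${user}@${IP}"; then
#     echo "${IP}: passwordless login works"
# else
#     echo "${IP}: still needs a password"
# fi
```

Running this from every VM against every other VM would show which pairs still fall back to the password prompt.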