
I've just updated a few servers to the newer version of the GCE stack, and I've started having SSH issues that I'm not sure how to fix. I've already checked the firewall, and the SSH docs haven't been much help.

This is how I'm currently connecting:

gcutil --service_version="v1" --project="myproject" ssh  --zone="us-central1-a" "myproject-prod"

which was working until very recently. I was doing some bash hacking, adding and removing a number of apt and pip packages, so I assume it has something to do with that, but I'm really not sure. When I try to connect with the above command, I get the following error:

INFO: Running command line: ssh -o UserKnownHostsFile=/dev/null -o CheckHostIP=no -o StrictHostKeyChecking=no -i /home/user/.ssh/google_compute_engine -A -p 22 user@108.59.84.53 --
ssh: connect to host 108.59.84.53 port 22: Connection refused

My firewalls seem to be in order:

user@computer:~$ gcutil --project="myproject-backend" listfirewalls
+------------------------+---------+
| name                   | network |
+------------------------+---------+
| default-allow-internal | default |
+------------------------+---------+
| default-ssh            | default |
+------------------------+---------+
| http2                  | default |
+------------------------+---------+

Any thoughts or resources on how to resolve this issue?

Slater Victoroff

3 Answers


I suggest looking at the serial console output first and checking for obvious messages, such as the SSH service failing to start. You can also create a snapshot of your boot disk, create a new Persistent Disk from it, mount it on a temporary instance, and review the logs, startup scripts, etc.
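It may also help to confirm what the "Connection refused" in the question implies. A refusal means the host answered but nothing accepted the connection on port 22, which usually points at the SSH service not running; a firewall that silently drops packets tends to show up as a timeout instead. Below is a minimal Python sketch of such a probe. The function name `probe_port` and the five-second default timeout are my own choices for illustration, not part of any Google tool.

```python
import socket


def probe_port(host, port=22, timeout=5.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout', or 'error'."""
    try:
        # create_connection performs the TCP handshake and closes the socket on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but nothing is listening.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout: packets are probably being dropped.
        return "timeout"
    except OSError:
        # Anything else, e.g. no route to host or a DNS failure.
        return "error"
```

Calling `probe_port("108.59.84.53")` with the address from the question would tell you which of these cases you are in before you start inspecting the disk.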

PrecariousJimi
  • Already went through all of the suggested steps in that guide, including remounting the persistent disk, which resulted in connection timeouts. Looks like the issue here really is on the disk rather than with the instance, which is even more confusing. I've replaced the disk at this point though, as this was an urgent issue. – Slater Victoroff Feb 27 '14 at 23:38

Is the VM connected to the default network?

If not, is the SSH key that you are using (/home/user/.ssh/google_compute_engine) entered into the metadata section for the VM?
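One way to do that comparison is sketched below in Python. It assumes you have copied the metadata value out as text and that each entry looks like `username:ssh-rsa AAAA... comment`. That layout and the helper names `key_blob` and `key_in_metadata` are assumptions for illustration, not something confirmed in this thread.

```python
def key_blob(public_key_line):
    """Return (key type, base64 blob) from an OpenSSH public key line, ignoring the comment."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("not an OpenSSH public key line")
    return fields[0], fields[1]


def key_in_metadata(public_key_line, metadata_value):
    """Return True if the local public key appears in any 'user:key' metadata entry."""
    wanted = key_blob(public_key_line)
    for entry in metadata_value.splitlines():
        entry = entry.strip()
        if not entry:
            continue
        # Assumed entry format: "<username>:<key type> <base64> <comment>"
        _, sep, key_part = entry.partition(":")
        if not sep:
            continue
        try:
            if key_blob(key_part) == wanted:
                return True
        except ValueError:
            # Skip entries that do not contain a well-formed key.
            continue
    return False
```

You would pass in the contents of the public key file that belongs to the private key above (typically a `.pub` file next to it, assuming one exists) together with the metadata text. Only the key type and the base64 blob are compared, so a differing comment at the end of the line does not cause a false mismatch.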

Jed Daniels
  • Yup, am indeed connected to the default network. Correct me if I'm wrong, but I believe that problem would be fixed by simply remounting the persistent disk as suggested in the linked docs, but as I mentioned the problem persisted (not to make a horrible pun or anything.) – Slater Victoroff Mar 01 '14 at 09:00
  • Simply reconnecting the storage on the VM won't help if the problem is that the key doesn't exist or isn't in the correct place on the client (or if it is the wrong key). That is why I suggested checking that the one you are using actually matches the one in the metadata. – Jed Daniels Mar 02 '14 at 19:27

We have created a startup script to self-manage and troubleshoot SSH connectivity issues: https://github.com/GoogleCloudPlatform/compute-ssh-diagnostic-sh/

Feczo