I have an Ubuntu 14.04 Vagrant box. It works fine most of the time, but every couple of days I run into a problem. When Vagrant tries to SSH into the machine, it prints:

Warning: Remote connection disconnect. Retrying...

This is printed every few seconds for a while, until I get an error saying it timed out waiting for the machine to boot. I found this question, which suggested booting with the GUI enabled:

config.vm.provider :virtualbox do |vb|
  vb.gui = true
end

to see what was holding the VM up. It looks like the VM is getting stuck calling url_helper.py:

url_helper.py[WARNING]: Calling 'http://169.254.168.254/2009-04-04/meta/instance-id'
failed [101/120s]:request error [(<urllib3.connectionpool.HTTPConnectionPool object 
at 0x7ff2d6691450>, 'Connection to 169.254.168.254 timed out. 
(connect timeout=50.0)')]
url_helper.py[WARNING]: Calling 'http://169.254.168.254/2009-04-04/meta/instance-id'
failed [119/120s]:request error [(<urllib3.connectionpool.HTTPConnectionPool object 
at 0x7ff2d6682f50>, 'Connection to 169.254.168.254 timed out. 
(connect timeout=50.0)')]
DataSourceEc2.py[CRITICAL]: Giving up on md from 
['http://169.254.168.254/2009-04-04/meta/instance-id'] after 120s

This continues getting printed out until Vagrant fails:

<machine name>: Warning: Remote connection disconnect. Retrying...
<machine name>: Warning: Remote connection disconnect. Retrying...
<machine name>: Warning: Remote connection disconnect. Retrying...
Timed out while waiting for the machine to boot. This means that Vagrant was 
unable to communicate with the guest machine within the configured 
("config.vm.boot_timeout" value) time period.

If you look above, you should be able to see the error(s) that Vagrant had 
when attempting to connect to the machine. These errors are usually good hints 
as to what may be wrong.

If you're using a custom box, make sure that networking is properly working and 
you're able to connect to the machine. It is a common problem that networking 
isn't setup properly in these boxes. Verify that authentication configurations 
are also setup properly, as well.

If the box appears to be booting properly, you may want to increase the timeout 
("config.vm.boot_timeout") value.

The VM does continue on to the login screen a little while later, though. At that point I can log in or SSH in, but some things aren't right, e.g. the shared folders aren't mounted. The only solution I have found is to destroy and re-create the machine:

vagrant destroy -f && vagrant up

This can take a while, depending on how much provisioning the machine needs.
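For reference, the "config.vm.boot_timeout" setting named in that error message goes in the Vagrantfile. Below is a minimal sketch, assuming 600 seconds is long enough to cover the delay seen in the console (Vagrant's default is 300). The value is a guess for illustration, not something I have measured, and it would only lengthen the wait rather than remove whatever is holding the boot up.

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "trusty64"

  # Seconds Vagrant waits for the guest to boot and accept SSH.
  # 600 is an assumed value for illustration, not a measured one.
  config.vm.boot_timeout = 600
end
```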

Has anyone else come across this problem before? Any suggestions on how to fix this?

Here is my current Vagrantfile:

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "trusty64"
  config.vm.box_url = "/media/resources/Development/Vagrant/Boxes/trusty64.box"

  config.vm.define "name" do |name|
    name.vm.hostname = "name"
    name.vm.network :private_network, ip: "192.168.45.12"
    name.vm.network :forwarded_port, guest: 7070,  host: 7070
    name.vm.network :forwarded_port, guest: 7443,  host: 7443

    name.vm.provider :virtualbox do |vb|
      vb.name = "name"
      vb.customize ["modifyvm", :id, "--memory", "2048"]
    end
  end

  config.vm.provision "ansible" do |ansible|
      ansible.playbook = "provisioning/playbook.yml"
      ansible.sudo = true
      ansible.verbose = "v"
      ansible.extra_vars = "@provisioning/user_vars.yml"
  end
end

By the way, the other answer to the linked question above (sending the Enter key to the virtual machine through vboxmanage) didn't work, since my problem is different from that asker's.

UPDATE: Following @ElfElix's advice I had a look in /etc/network/interfaces and saw this:

#VAGRANT-BEGIN
#The contents below are automatically generated by Vagrant. Do not modify
auto eth1
iface eth1 inet static
    address 192.168.45.11
    netmask 255.255.255.0
#VAGRANT-END

However, /etc/network/interfaces.d/eth0.cfg contains:

# The primary network interface
auto eth0
iface eth0 inet dhcp

I tried removing this file to disable DHCP, but that just made it worse. Not only did vagrant up fail because it couldn't connect, but I was also unable to SSH in even after waiting a while.

Does anyone have any other ideas?

UPDATE II: It appears that the VM only starts correctly on the initial vagrant up, which creates and provisions the machine. After the machine has been halted, it no longer comes up correctly because it hits the above error.

Yep_It's_Me
  • two things: Do not add screenshots of logs, copy paste the data so that it can be useful (eg searchable). I don't like the fact that the ip is "169.254.168.254", IPs of 169.254.x.x range are used in special occasions of network problems – xlembouras Sep 26 '14 at 06:07
  • Interesting - I am on cloudinit v20 getting the same problem 6 years later trying to get Packer to build an AMI on AWS. Note that your critical error was a `DataSourceEc2.py` based error which, together with the magic IP (169.254.169.254) suggested a misconfig of cloudinit in the image. I'm no further forward though as all bug reports I see are for very early cloudinit versions. – volvox Dec 26 '20 at 20:05

1 Answer

If the machine's IP is set by DHCP, try making it static and setting it up manually. The reason this keeps happening every few days may be that the DHCP lease expires, so the DHCP server has to check the IP and assign it again.

A virtual machine, server, FTP server, etc. should always have a static IP if you connect to it remotely. As xlembouras said, "IPs of 169.254.x.x range are used in special occasions of network problems," so my guess is that the DHCP server has assigned it an invalid IP (or an old one). You'll have to make it static and set the IP, subnet mask, and gateway on the machine manually. On an Ubuntu server (or any *buntu machine) this can be done from a terminal as follows. (NOTE: you can prefix the commands with sudo instead of switching to root with su, and you can use Vim or another editor if you prefer.)

sudo su
cd /
cd etc/network
vi interfaces

DHCP should be set up by default (even if the file doesn't say so explicitly). If you see a line that says

"iface eth0 inet dhcp"

replace "dhcp" with "static" and add the IP address and the other settings below it.

Example (do not use these values verbatim; they are only an illustration):

iface eth0 inet static
address 192.168.1.2
netmask 255.255.255.0
gateway 192.168.1.254

Now restart networking with:

/etc/init.d/networking restart

If you'd like a detailed guide, I'll save you the trouble of googling it: Change DHCP IP to Static IP on Linux
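To check the 169.254.x.x point rather than guess, here is a small Ruby sketch using the standard ipaddr library. It tests whether an address falls in 169.254.0.0/16, the IPv4 link-local block. The helper name link_local? is made up for illustration. Note that the first address tested is the one the log shows being called, which is not necessarily the address assigned to the VM.

```ruby
require "ipaddr"

# 169.254.0.0/16 is the IPv4 link-local block (RFC 3927).
LINK_LOCAL = IPAddr.new("169.254.0.0/16")

# Hypothetical helper: true when the dotted-quad address is inside the link-local block.
def link_local?(address)
  LINK_LOCAL.include?(IPAddr.new(address))
end

puts link_local?("169.254.168.254")  # address from the url_helper.py warnings
puts link_local?("192.168.45.12")    # private_network address from the Vagrantfile
```

The first line prints true and the second prints false, so the address in the log is in the link-local range while the private_network address is not.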

Vincent
  • To clarify the DHCP thing. The DHCP server assigns your machine an IP, and this IP is valid for X amount of time. After X amount of time has passed then the computer has to ask the DHCP server to check its ip address and validate it, or give it a new one. You must set it up as static so that it remains the same. This way the machine's IP doesn't change weekly (or whatever rate your DHCP server assigns IPs). Dynamic IP addresses (such as those assigned by a DHCP server) will change, and static will remain the same. – Vincent Sep 28 '14 at 10:03
  • Is there anyway to do this within the Vagrantfile? Ideally I would like all the required setup to be in there. – Yep_It's_Me Sep 28 '14 at 23:06
  • I added my Vagrantfile to the question. I don't think it's being set with DHCP since I'm manually setting the IP address. – Yep_It's_Me Sep 29 '14 at 01:43
  • If setting it by the vagrant file hasn't worked then it might be time to change the config within the buntu machine, because buntu and VMWare may be trying to do two different things at once and causing an error, or maybe the buntu machine is using its own instead of the VM's/Vagrant and thus contradicting. I've never done it from the vagrant file so I am not explicitly knowledgeable in that area. What I am guessing though, is that, buntu in its interfaces file it is telling the pc to acquire an IP address from a/the DHCP server while your VMBox is being told to get the static ip – Vincent Sep 30 '14 at 00:42
  • I didn't say to change anything in the eth0 file. You should back it up and boot it like before. I think this is an issue with your internet rather than DHCP/Static since it is set in the interfaces file. Beyond me at this point. – Vincent Oct 04 '14 at 14:33