34

After restarting Windows, I cannot connect to the Docker machine running in Oracle VirtualBox. When I start the Docker Quickstart Terminal, everything looks fine: it comes up OK and gives me this message:

docker is configured to use the default machine with IP 192.168.99.100
For help getting started, check out the docs at https://docs.docker.com

But when I run:

$ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL   SWARM   DOCKER   ERRORS
default   -        virtualbox   Timeout

and:

λ docker images
An error occurred trying to connect: Get http://localhost:2375/v1.21/images/json: dial tcp 127.0.0.1:2375: ConnectEx tcp: No connection could be made because the target machine actively refused it.

Also, when I try to reinitialize my environment, I get:

λ docker-machine env default
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": dial tcp 192.168.99.100:2376: i/o timeout
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.

BTW, regenerating the certs does not help either. Any ideas?

Thanks.

Hazhir
  • I get this problem every few days. I have to delete the default machine and then recreate all my images from scratch. It's a royal pain, hence the bounty. – rmcsharry Mar 21 '16 at 11:04
  • Debug output from my machine, which might help: https://github.com/rmcsharry/debug-docker-machine/blob/master/debug%20output.txt – rmcsharry Mar 21 '16 at 11:18
  • I think the i/o timeout is critical. I suspect this problem is caused by networking trouble. Try `netstat -rn` and look for routes to network 192.168.99, for example: `192.168.99 link#18 UC vboxnet !` The machine needs a route to the vbox0 host-only network. – chrisinmtown Apr 18 '22 at 15:03

16 Answers

17

Try regenerating the certificates manually:

docker-machine --debug regenerate-certs -f default

and check for any errors to fix, then try again:

docker-machine --debug env default

If it fails on ssh, copy the ssh command from the debug output into a terminal and add an extra -vv to see what the problem is.

If you get:

debug1: connect to address 127.0.0.1 port 64368: Connection refused

then your machine isn't running (check with docker-machine ls), so try:

docker-machine start

Then try to ssh to it via:

docker-machine -D ssh default
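
As a rough illustration of the "check with docker-machine ls" step, the STATE column can be read with a small shell helper. This is only a sketch and not part of the original answer: the function name machine_state is made up, and it assumes the default column order shown in the question (NAME, ACTIVE, DRIVER, STATE, ...).

```shell
# Hypothetical helper: print the STATE column for one machine.
# Reads `docker-machine ls` output on stdin and assumes the default
# column order NAME ACTIVE DRIVER STATE ...
machine_state() {
  # Skip the header row, then match on the NAME column.
  awk -v name="$1" 'NR > 1 && $1 == name { print $4 }'
}

# Usage (requires docker-machine on PATH):
#   docker-machine ls | machine_state default
```

Anything other than Running (for example Stopped or Timeout) suggests the machine is not reachable yet, so the later ssh and env steps are likely to fail as well.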
kenorb
  • Thanks, but that does not solve it for me. My machine is started, regenerating certs does not help either. – rmcsharry Mar 21 '16 at 11:18
  • @rmcsharry If the machine is running (`docker-machine ls`), can you ssh to it via `docker-machine -D ssh default`? If you can't, what error do you get? – kenorb Mar 21 '16 at 11:41
  • I could not spend any more time, so I nuked the default container and recreated it. Next time it happens (probably in about 3-4 days) I will try this and post here. – rmcsharry Mar 21 '16 at 13:26
  • docker-machine --debug regenerate-certs -f name_of_your_vm – aurelius Mar 07 '19 at 15:25
10

After some research, I found that the following workaround may solve the issue for now:

  1. Open Network and Sharing Center.

  2. Click Change adapter settings.

  3. See if you have any other enabled adapters, such as VPN or VMware network adapters.

  4. Disable them and try to connect to your container one more time.

  5. If it still doesn't work with the other adapters disabled, restart your PC. That is what worked in my case.

Mogsdad
Hazhir
  • The 2nd time I tried that, it did work. Next time it happens I will try again to see if it is repeatable. – rmcsharry Mar 30 '16 at 10:05
  • This also worked for me after restarting the PC. When I disabled the VirtualBox adapters, new ones were created after the restart. – oskansavli Sep 24 '19 at 09:48
7

What worked for me is this answer from the docker-machine repo:

docker-machine regenerate-certs --client-certs [name]

Basically, what expired was the client certificates. The error message I got from docker-machine was similar to yours (i.e., no indication that it is the client certs that need to be regenerated).
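
Since the error message does not say which certificate is at fault, one way to confirm an expired client certificate before regenerating it is to ask openssl. The following is a minimal sketch, assuming openssl is installed and that the client certificate is at ~/.docker/machine/certs/cert.pem (adjust the path if yours differs). The function name is invented for illustration.

```shell
# Hypothetical helper: succeed only if the certificate at $1 exists,
# can be parsed, and has not expired yet. Assumes openssl is on PATH.
client_cert_still_valid() {
  openssl x509 -checkend 0 -noout -in "$1" >/dev/null 2>&1
}

cert="$HOME/.docker/machine/certs/cert.pem"   # assumed location
if client_cert_still_valid "$cert"; then
  echo "client cert looks valid"
else
  echo "client cert missing or expired; consider regenerate-certs --client-certs"
fi
```

To see the actual expiry date, openssl x509 -enddate -noout -in on the same file prints it.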

munsu
6

I fixed it by doing this:

  • Removed all host-only interfaces from VirtualBox (VirtualBox → Preferences → Network → Host-only networks).
  • rmdir.exe --ignore-fail-on-non-empty ~/.docker/
  • docker-machine start
  • docker-machine env
  • eval $("C:\Program Files\Docker Toolbox\docker-machine.exe" env default) (I also added this line to the end of my .bash_profile).
  • docker run hello-world ← now working

Inspired by this post.

Pablo Bianchi
1

Here is what worked for me. The first steps are similar to what Hazhir proposed, followed by regenerating the certificates.

  1. Open Network and Sharing Center.
  2. Click Change adapter settings.
  3. Disable all active virtual machine network adapters. They usually have the description "VirtualBox Host-Only Ethernet Adapter".
  4. Start your machine by running docker-machine start.
  5. Run docker-machine env. If you're like me, you'll get the following error:

Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": x509: certificate is valid for 192.168.99.101, not 192.168.99.100

That is good: the daemon is reachable again, and only the certificate is wrong. Now all we need to do is run:

docker-machine regenerate-certs -f default

Then test it again with docker-machine env. If you get:

SET DOCKER_TLS_VERIFY=1
SET DOCKER_HOST=tcp://192.168.99.100:2376
SET DOCKER_CERT_PATH=C:\Users\Jay\.docker\machine\machines\default
SET DOCKER_MACHINE_NAME=default
REM Run this command to configure your shell:
REM     FOR /f "tokens=*" %i IN ('docker-machine env') DO %i

Then you're all set. In my case I needed to start my virtual machine by running Docker Quickstart Terminal.
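
The x509 error quoted in this answer already names both addresses: the one the certificate was issued for and the one the machine has now. As a small sketch (the function name is made up, and the pattern assumes the exact error wording quoted above), they can be pulled out of the message like this:

```shell
# Hypothetical helper: read a docker-machine error message on stdin and
# print "<ip-in-certificate> <current-ip>" if it is the x509 IP mismatch.
# Prints nothing for any other message.
cert_ip_mismatch() {
  sed -n 's/.*certificate is valid for \([0-9.]*\), not \([0-9.]*\).*/\1 \2/p'
}

# Usage (requires docker-machine on PATH):
#   docker-machine env default 2>&1 | cert_ip_mismatch
```

If this prints two different addresses, the machine came up on a new IP, which is the case where docker-machine regenerate-certs -f default helped here.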

jaycode
1

I had this problem too. Running docker-machine regenerate-certs <vm-name> did not solve it. I searched for the error message and found the solution below.

  • Run sudo ifconfig vboxnet0 up in a terminal.
  • Check the machine state with docker-machine ls.
  • STATE and URL should now be OK.

However, the problem comes back after restarting the system.

The GitHub issue I found is here.

It seems there is a bug in VirtualBox 5.1.24.
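
Because the problem returns after every restart, it can help to check whether vboxnet0 is actually down before bringing it up again. The helper below is a sketch under assumptions: the function name is invented, and it expects the ifconfig output style that prints interface flags as flags=...<UP,...>, as macOS and recent Linux net-tools do.

```shell
# Hypothetical helper: read `ifconfig <iface>` output on stdin and succeed
# if the UP flag is present in the flags=...<...> list.
iface_is_up() {
  grep -Eq '[<,]UP[,>]'
}

# Usage (requires ifconfig and the vboxnet0 interface):
#   if ! ifconfig vboxnet0 | iface_is_up; then
#     sudo ifconfig vboxnet0 up
#   fi
```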

invzhi
1

Just start the Docker machine and then regenerate the certificates:

docker-machine start <machine-name>

docker-machine regenerate-certs <machine-name>

It works like a charm for me.

Asad Shakeel
1

None of the answers here helped me. My problem occurred when I tried to set up my shell for the virtual machine with eval $(docker-machine env default).

It was trying to reach port 2376, which was closed, so I had to ssh into the VM and add the following UFW rule:

sudo ufw allow 2376
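
To tell from the client side whether port 2376 is closed, a quick TCP probe is enough. This is a sketch with assumptions: the function name is invented, and it relies on a bash build with /dev/tcp support plus the coreutils timeout command, which may not be present on every system.

```shell
# Hypothetical helper: succeed if a TCP connection to host $1, port $2
# can be opened within 3 seconds. Relies on bash's /dev/tcp redirection
# and on `timeout`; both are assumptions about the environment.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage, with the address and port from the question:
#   port_open 192.168.99.100 2376 && echo "daemon port reachable"
```

If the probe fails while docker-machine ssh still works, a firewall rule inside the VM or on the host is a plausible suspect.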
0

I make sure I can always connect to my Docker machines by assigning them a fixed IP and regenerating the certs only once (no reboot needed).

After that, docker-machine ls always works.

My current script:
(replace %PRGS%\dm\latest with the path where docker-machine.exe is on your machine)
(make sure PATH includes the latest /path/to/git/usr/bin, so that commands like ssh are available)

> more dmvbf.bat
@echo off
setlocal enabledelayedexpansion
rem Usage: dmvbf machine x y  (assigns 192.168.x.y to the machine)
set machine=%1
if "%machine%" == "" (
        echo dmvbf expects a machine name
        exit /b 1
)
set ipx=%2
if "%ipx%" == "" (
        echo dmvbf x missing ^(for 192.168.x.y^)
        exit /b 2
)
set ipy=%3
if "%ipy%" == "" (
        echo dmvbf y missing ^(for 192.168.x.y^)
        exit /b 3
)

rem Write bootsync.sh, which boot2docker runs at boot:
rem first stop the DHCP client on eth1, then set the static address.
%PRGS%\dm\latest\docker-machine.exe ssh %machine% "sudo sh -c 'echo \"kill \$(more /var/run/udhcpc.eth1.pid)\" | sudo tee /var/lib/boot2docker/bootsync.sh >/dev/null'"
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo sh -c 'echo \"ifconfig eth1 192.168.%ipx%.%ipy% netmask 255.255.255.0 broadcast 192.168.%ipx%.255 up\" | sudo tee -a /var/lib/boot2docker/bootsync.sh >/dev/null'"

rem Make bootsync.sh executable.
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo chmod 755 /var/lib/boot2docker/bootsync.sh"

rem Apply the same change immediately, without rebooting the VM.
%PRGS%\dm\latest\docker-machine ssh %machine% "sudo cat /var/run/udhcpc.eth1.pid | xargs sudo kill"

%PRGS%\dm\latest\docker-machine ssh %machine% "sudo ifconfig eth1 192.168.%ipx%.%ipy% netmask 255.255.255.0 broadcast 192.168.%ipx%.255 up"

For instance:

dmvbf default 99 100
docker-machine regenerate-certs -f default

That will assign 192.168.99.100 to the docker machine 'default', and regenerate the certs once.
Then each time docker-machine ls is called, it will display the same IP for 'default'.
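
For readers who do not use batch files, the net effect of the script is a two-line /var/lib/boot2docker/bootsync.sh inside the VM. The sketch below only prints what that file should contain for a given x and y, as I read the echo commands in the script. The function name is invented, and nothing is written to the VM.

```shell
# Hypothetical helper: print the bootsync.sh content that the batch
# script is meant to write for the address 192.168.$1.$2.
render_bootsync() {
  # Line 1: stop the DHCP client on eth1 so it cannot reassign the IP.
  printf '%s\n' 'kill $(more /var/run/udhcpc.eth1.pid)'
  # Line 2: set the fixed address on eth1.
  printf 'ifconfig eth1 192.168.%s.%s netmask 255.255.255.0 broadcast 192.168.%s.255 up\n' "$1" "$2" "$1"
}

render_bootsync 99 100
```

Comparing this with the output of docker-machine ssh default "cat /var/lib/boot2docker/bootsync.sh" is a quick way to check that the script ran as intended.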

Community
VonC
  • Hi @VonC, I already use your script - I posted on the thread you originally added it to. But for some reason my default docker machine still goes to a 'timeout' every few days. I think it's related to the host going to sleep or being turned off, but am not sure. I would just love for some way to fix the timeout when it occurs that doesn't involve rebuilding my docker containers from scratch every time. – rmcsharry Mar 21 '16 at 13:23
  • @rmcsharry in that case, open VirtualBox to check the status of the VM. – VonC Mar 21 '16 at 13:25
  • @rmcsharry in other words, is your VM in an unusual state, like "`aborted`", or "`guru meditation`"? – VonC Mar 21 '16 at 13:45
  • No, the VM is fine and I can start/stop it from VirtualBox no problem – rmcsharry Mar 21 '16 at 14:06
  • @rmcsharry can you try again the script I mention in the answer: it is different from the one posted on the original thread. – VonC Mar 22 '16 at 07:01
  • It's happened again. I did nothing except shut down the VM and shut down the PC. I turned it back on and it shows the Timeout. I tried your script (had to change the last 5 lines to point to "C:\Program Files\Docker Toolbox\docker-machine.exe" instead of %PRGS%\dm\latest\docker-machine). Sadly the problem still persists. – rmcsharry Mar 23 '16 at 09:17
  • @rmcsharry I generally see a timeout right after the script is executed. Try a docker-machine ssh yourMachine then check in another windows if docker-machine ls still display the timeout. – VonC Mar 23 '16 at 09:19
  • After I ran your script it didn't work. Tried shutting down the VM and restarting the DQS terminal, still had the Timeout. So I manually deleted the host-adapter, ran the QuickStart to let it re-create it, as I figured it had something to do with the host-adapter not being correct (or the default machine's network settings). Anyway that did not help. – rmcsharry Mar 23 '16 at 09:45
  • So now I just shut down everything to try from a clean boot of the host PC. Now when I started the Docker QuickStart Terminal it had to ask permission to create the host network adapter and the DHCP server again. I guess I must have corrupted the existing one somehow (when trying to fix things). Now the default VM is running with no timeout, no errors, on the correct IP. – rmcsharry Mar 23 '16 at 09:46
  • @rmcsharry forget quick-start, you don't need it. A simple docker-machine create, followed by the script, followed by docker-machine ssh is enough. – VonC Mar 23 '16 at 09:46
  • but I don't want to do docker-machine create - that recreates everything from scratch. I have to recreate postgres, mysql, my storage container, redis - all of which it re-downloads every time into the default machine – rmcsharry Mar 23 '16 at 09:48
  • @rmcsharry I understand. I am used to recreating from scratch and letting the Dockerfile pull everything I need ;) – VonC Mar 23 '16 at 09:52
  • I have 4 images in the default machine. Every time this timeout happens I have to delete the default machine and recreate it, which means I have to recreate those 4 images (and have to reinstall my databases from backups). – rmcsharry Mar 23 '16 at 09:52
  • @rmcsharry but after restarting from a clean boot, and recreating the host network adapter and the DHCP server, do you still have the issue? – VonC Mar 23 '16 at 09:54
  • No. But other times from a clean boot I see the timeout. So now I am wondering if the solution to this problem is to just delete the host network adapter in Virtual Box, then reboot and rerun the quickstart. Next time it happens I will try that. – rmcsharry Mar 23 '16 at 09:55
  • @rmcsharry OK. Are you running the latest Oracle Virtualbox? – VonC Mar 23 '16 at 09:56
  • Version 5.0.14 r105127 – rmcsharry Mar 23 '16 at 10:33
0

Try this workaround:

  • First, make sure ca.pem, cert.pem, key.pem, and ca-key.pem exist under the $yourhome/.docker/machine/certs/ folder. If these four *.pem files are missing, you can copy them from elsewhere or create them yourself (they will certainly not be correct at this point).
  • Make sure the environment is set correctly in bash_profile, for example:

export DOCKER_HOST=tcp://192.168.99.100:2376
export DOCKER_MACHINE_NAME=default
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/Users/johnwang/.docker/machine/machines/default

  • Rerun docker-machine regenerate-certs default (you may need to reopen the Docker terminal first). I tried this with Docker Toolbox on Mac, and it works.
  • Finally, some logs of the result:

Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": x509: certificate signed by unknown authority
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.
... ...
johns-MacBook-Pro:certs johnwang$ docker-machine regenerate-certs default
Regenerate TLS machine certs? Warning: this is irreversible. (y/n): y
Regenerating TLS certificates
Waiting for SSH to be available...
Detecting the provisioner...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
johns-MacBook-Pro:certs johnwang$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
default - virtualbox Running tcp://192.168.99.100:2376 v17.03.1-ce

Hope it helps. See also my response here: https://github.com/docker/machine/issues/2808

john
0

In my case, FortiClient caused the issue. After I disabled it, docker-machine env default worked fine again. I suggest checking whether any antivirus program is running on your system.

Guster
0

For me, running

docker-machine --debug regenerate-certs -f name_of_your_vm

worked just fine.

docker-machine version 0.16.1
VirtualBox 6.0

Also, Docker was configured to use the default machine with IP 192.168.99.100.

aurelius
0

I had the same error. I fixed it by opening TCP port 2376 in the network firewall.

  • Do you mean one of those ports (https://twitter.com/bad_packets/status/1199088838636793857), because... https://twitter.com/sudo_bmitch/status/1199112656810061824 – VonC Nov 26 '19 at 05:34
0

The solution for my problem is taken from here: https://github.com/docker/machine/issues/3845#issuecomment-271935924

Quote:

If you install docker-machine first time then you do not have in that host a self-signed CA that will be used to generate your client certificate and as many server certificates as machines you generate later on. That CA is generated when you try to create a machine if that CA is not yet created. So if you try to generate several servers in parallel (by means of an script), then you’ll generate as many self-signed (root) CA as docker create commands, all of them being written in the same location that seems to be messing up the environment e.g. spreading out different ca.pem to the remote machines that do match the final version, causing the cert.pem (host identity) to be signed by a former ca.pem which no longer exist… or whatever other abnormal situation.

To fix it, first of all you'll need to delete your existing self-signed CA. This can be done by removing the folder ~/.docker/machine/certs (NOTE: Note this will force the creation of a new self-signed CA for docker-machine to use and will yield your existing machines to fail connecting to the daemon). This will make your docker-machine to generate valid certificates again. Then, for my use case I am creating the first machine in foreground and all the rest of them are done in parallel. That will cause the creation of one root self-signed CA in isolation and then will be used for further docker-machine create commands. It worked like a charm!

The reason why I was able to ssh to the host is because there are a different pair of keys for sshing generate per host that was not bitten by this.

To sum up, this is what I ended up doing:

  1. Find out which command docker-machine is running. I was using it with gitlab-runner, so I had to run gitlab-runner in debug mode to see what command it was running on docker-machine.

  2. Stop gitlab-runner: gitlab-runner stop

  3. Delete the certificates: rm -rf ~/.docker/machine/certs

  4. Run a single command (from step 1) to re-create the certs. Remember, the reason this didn't work before is that it was trying to create them multiple times at once.

  5. Start gitlab-runner again: gitlab-runner start

Worked for me!
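
The "first machine in the foreground, the rest in parallel" idea from the quote can be sketched as a small shell function. Everything here is illustrative: the function name, the DM and DRIVER variables, and the virtualbox default are assumptions, not something gitlab-runner or docker-machine provides.

```shell
# Sketch: create the first machine alone, then the others concurrently.
# DM and DRIVER are illustrative variables; set DM=echo for a dry run
# that only prints the commands instead of running docker-machine.
create_machines() {
  dm="${DM:-docker-machine}"
  driver="${DRIVER:-virtualbox}"
  first="$1"
  shift
  # The first create runs on its own, so only one self-signed CA is written.
  "$dm" create --driver "$driver" "$first" || return 1
  # Per the quoted explanation, later creates reuse the existing CA,
  # so they are started in the background and awaited together.
  for name in "$@"; do
    "$dm" create --driver "$driver" "$name" &
  done
  wait
}

# Dry run:
#   DM=echo create_machines runner-1 runner-2 runner-3
```

Running it once with DM=echo shows the ordering before any real machines are created.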

Alon Gouldman
0

For readers using brew in 2021, after upgrading the virtualbox cask:

  1. System Preferences... > Security & Privacy > (unlock with fingerprint) Allow.
    (Your computer should restart.)
  2. docker-machine restart default. Done.
NeoZoom.lua
0

I solved this issue on macOS by installing Docker Desktop:

  1. brew uninstall docker
  2. brew uninstall docker-machine
  3. Then download Docker Desktop for Mac: https://docs.docker.com/desktop/mac/install/
Oways