
I am trying to understand the relationship between:

  • eth0 on the host machine; and
  • docker0 bridge; and
  • eth0 interface on each container

It is my understanding that Docker:

  1. Creates a docker0 bridge and then assigns it an available subnet that is not in conflict with anything running on the host; then
  2. Docker binds docker0 to eth0 running on the host; then
  3. Docker binds each new container it spins up to docker0, such that the container's eth0 interface connects to docker0 on the host, which in turn is connected to eth0 on the host

This way, when something external to the host tries to communicate with a container, it must send the message to a port on the host's IP, which then gets forwarded to the docker0 bridge, which then gets broadcasted to all the containers running on the host, yes?

Also, this way, when a container needs to communicate with something outside the host, it has its own IP (leased from the docker0 subnet), and so the remote caller will see the message as having come from the container's IP.

So if anything I have stated above is incorrect, please begin by clarifying for me!
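
For reference, here is roughly how I have been inspecting this (standard docker and iproute2 commands; the container name web1 is just a placeholder from my setup):

    # The docker0 bridge and the subnet Docker assigned to it
    ip addr show docker0

    # The same information from Docker's point of view
    docker network inspect bridge

    # The eth0 interface inside a running container (assuming the image ships iproute2)
    docker exec web1 ip addr show eth0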

Assuming I'm more or less correct, my main concerns are:

  • When remote services "call in" to a container, all containers are broadcast the same message, which creates a lot of traffic/noise, but could also be a security risk (only container 1 should be the recipient of some message, but all the other containers on the same host get it as well); and
  • What happens when Docker chooses identical subnets on different hosts? In this case, container 1 living on host 1 might have the same IP address as container 2 living on host 2. If container 1 needs to "call out" to some external/remote system (not living on the host), then how does that remote system differentiate between container 1 vs container 2 (both will show the same egress IP)?
smeeb
  • Look up veth (your concept of "bind" isn't quite right) and NAT with iptables (masquerade). – Adrian Mouat Sep 30 '15 at 11:54
  • You can set a specific port for your container for connections from outside the Docker host machine. You can find a good explanation here: http://stackoverflow.com/questions/26539727/giving-a-docker-container-a-routable-ip-address – Victor Sep 30 '15 at 13:21
  • It doesn't create a lot of traffic/noise; there is a NAT that knows exactly where each packet goes. It is a security risk on the default bridge network, which is why Docker created the network concept. It lets you segregate traffic to containers that should be communicating: https://docs.docker.com/engine/userguide/networking/dockernetworks/ Nothing bad happens if they choose identical subnets; in fact they do by default (172.17.0.0/16). Two hosts have two separate NATs, so each is responsible for correct routing. Externally they use the host interfaces (eth0), which are unique. – Justin Dennahower Apr 08 '16 at 17:21

1 Answer


I wouldn't say your picture of networking in Docker is quite right, so let me clarify that part first.

Here's how it goes:

  1. Docker uses a Linux kernel feature called namespaces to partition resources.
  2. When a container starts, Docker creates a set of namespaces for that container.
  3. These namespaces provide a layer of isolation.
  4. One of them is the net namespace, which manages the container's network interfaces. (A quick way to inspect it is sketched below.)
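
You can peek at a container's namespaces from the host. A minimal sketch, assuming a running container (the name web1 is made up for illustration):

    # Find the PID of the container's main process ("web1" is a hypothetical name)
    PID=$(docker inspect --format '{{.State.Pid}}' web1)

    # List the namespaces that process belongs to; note the "net" entry
    sudo ls -l /proc/$PID/ns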

Now, a bit more about network namespaces. A net namespace:

  • Lets each container have its own network resources and its own network stack:
    • Its own network interfaces.
    • Its own routing tables.
    • Its own iptables rules.
    • Its own sockets (visible with ss or netstat).
  • A network interface can be moved from one net namespace to another, so an interface created on the host can be handed to a container's namespace.
  • Typically, a pair of virtual interfaces is used; they act like a cross-over cable.
  • The container's eth0 lives in the container's net namespace and is paired with a virtual interface vethXXX in the host's net namespace. All of those vethXXX interfaces are bridged together on the host using the docker0 bridge. (A hand-rolled version of this wiring is sketched below.)
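
You can reproduce that wiring by hand with iproute2. A minimal sketch, run as root; the names demo-ns, veth-host, and veth-cont are made up, and it assumes the default docker0 bridge exists:

    # Create a namespace and a veth pair -- the virtual "cross-over cable"
    ip netns add demo-ns
    ip link add veth-host type veth peer name veth-cont

    # Move one end into the namespace; it plays the role of the container's eth0
    ip link set veth-cont netns demo-ns

    # Attach the host end to the docker0 bridge and bring both ends up
    ip link set veth-host master docker0
    ip link set veth-host up
    ip netns exec demo-ns ip link set veth-cont up

    # List every veth currently attached to docker0
    ip link show master docker0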

Now, apart from namespaces, there is a second Linux kernel feature that makes the creation of containers possible: cgroups (control groups).

  • Control groups let us meter and limit a container's use of (see the sketch after this list):
    • Memory
    • CPU
    • Block I/O
    • Network
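
Docker exposes these limits as flags on docker run. A minimal sketch (the image and the numbers are arbitrary; the cgroup file path differs between cgroup v1 and v2, hence the fallback):

    # Cap a container at 256 MB of memory and half a CPU core,
    # then read the memory limit back from inside the container
    docker run --rm --memory=256m --cpus=0.5 alpine sh -c \
        'cat /sys/fs/cgroup/memory.max 2>/dev/null || cat /sys/fs/cgroup/memory/memory.limit_in_bytes'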

TL;DR

In short: containers are made possible by two main kernel features: namespaces and cgroups.

Cgroups ---> limit how much you can use.

Namespaces ---> limit what you can see.

And you can't affect what you can't see.
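
A quick demonstration of that point (assuming the alpine image):

    # The container sees only its own loopback and eth0, nothing from the host
    docker run --rm alpine ip addr

    # And only its own processes, thanks to the PID namespace
    docker run --rm alpine ps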


Coming back to your question: when the host receives a packet intended for a container (say, on a published port), it is not handed to every container. An iptables DNAT rule rewrites the destination to that one container's IP and port, and the packet is routed over docker0 straight to it. In the other direction, outgoing packets from a container are masqueraded (SNAT), so they leave the host carrying the host's IP rather than the container's private one.
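
You can see those rules on the host. A sketch; the published port 8080 and the nginx image are arbitrary examples:

    # Publish container port 80 on host port 8080
    docker run -d -p 8080:80 nginx

    # The DNAT rule Docker created: host:8080 -> <container IP>:80
    sudo iptables -t nat -L DOCKER -n

    # The MASQUERADE rule that rewrites outbound container traffic to the host's IP
    sudo iptables -t nat -L POSTROUTING -n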

So, I think this answers both of your questions as well.

  1. It's not a broadcast. Other containers can't see a packet that isn't addressed to them (namespaces), and the NAT rules deliver a published port's traffic to exactly one container.
  2. A remote system never sees a container's private 172.17.x.x address at all. Outbound traffic is NATed to the host's IP, so container 1 on host 1 and container 2 on host 2 show up as two different host IPs, and there is no ambiguity. (See the check below.)
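
A quick way to check the second point yourself (hypothetical setup: run this on each host; curlimages/curl and ifconfig.me are just convenient examples):

    # The external service reports the *host's* public IP,
    # not the container's private 172.17.x.x address
    docker run --rm curlimages/curl -s https://ifconfig.me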

P.S.:

If someone finds any of this information erroneous, please let me know in the comments. I wrote this in a hurry and will update it with better-reviewed text soon.

Thank you.
