Can a process running inside a Cloud Foundry app container be enabled to create a TCP connection to a port opened by a process running on the Diego Cell hosting the container? If so, are there differences between buildpack-based and Docker-image-based app containers?

Our use case is passing traces to an agent deployed on the Diego Cell.

I tried using the CF_INSTANCE_IP address and various alternatives, combined with suitable security groups, but can't get this to work. Frankly, I am not even sure how best to address the Diego Cell host. Inside the containers, I can see the metrics-scraping requests that the agent runs against the container; they are reported as coming from IP 169.254.0.1 (which, to my understanding, is the address of the virtual router that Cloud Foundry puts into each container). The agent's logs for the same requests report that the agent contacts the container using CF_INSTANCE_INTERNAL_IP.

What really surprises me is that the apps can open TCP connections to the port in question on all other Diego Cells in the cluster when using the IP address reported by CF_INSTANCE_IP in containers running on these other cells. The one connection that does NOT work is the one to the agent port on its own Diego Cell.
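For anyone wanting to reproduce the test from inside a container, here is a minimal sketch of the kind of probe described above. It tries a TCP connect against each candidate address for the cell host. The helper names `can_connect` and `probe_targets` are made up for illustration, and the port you pass in is whatever your agent listens on; only the two `CF_INSTANCE_*` variables and the 169.254.0.1 address come from the question itself.

```python
import os
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_targets(port: int, env=os.environ, timeout: float = 3.0) -> dict:
    """Probe each candidate address for the cell host (hypothetical helper).

    Candidates are the addresses mentioned in the question; entries whose
    environment variable is not set are skipped.
    """
    candidates = {
        "CF_INSTANCE_IP": env.get("CF_INSTANCE_IP"),
        "CF_INSTANCE_INTERNAL_IP": env.get("CF_INSTANCE_INTERNAL_IP"),
        "169.254.0.1": "169.254.0.1",
    }
    return {
        name: can_connect(addr, port, timeout)
        for name, addr in candidates.items()
        if addr
    }
```

Calling `probe_targets(<agentPort>)` from an app instance (for example via `cf ssh`) returns a dict of address label to reachability, which makes it easy to compare the local cell against remote cells.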

Any pointers/help appreciated. We are running our own CF installation based on the Open Source version.

Jan
  • I'm not going to say it's impossible, because where there is a will there is a way, but it's definitely not the intent. The intent of the application containers on CF is to isolate them from the host as much as possible for security reasons. You don't want apps to be able to impact the host or other app containers on the host. For traces and instrumenting, you can do that with application agents or sidecars. You might even be able to instrument from the host down into containers, since the host has visibility into containers, but I don't know what kind of integration with CF metadata that would have. – Daniel Mikusa Jun 27 '23 at 03:23
  • @DanielMikusa: What you propose is what we do for metrics -- the agent on the Diego cell scrapes the metric endpoints of containers. For traces there is no way to go like this, however. We want to avoid running dedicated agents in each of the containers as sidecars because we really have a lot of those and because of the additional effort involved. – Jan Jun 27 '23 at 06:37
  • I ran some more tests and made a (for me) surprising discovery: Apps can talk to the port in question on all other Diego cells in the cluster using the respective host address as stored in CF_INSTANCE_IP of the other Diego cells. It is just their own cell that they cannot talk to. Will update the question to reflect this. – Jan Jun 27 '23 at 06:40
  • Not sure how your tracing app is structured/architected, but you might be able to run it as a stand-alone app on CF that is on the internal network. You could then have other apps send traces to it over the internal network. I've seen something similar work with Datadog & tracing. – Daniel Mikusa Jun 27 '23 at 20:50
  • Yes, that is an option that works. But it means more moving parts, dependencies etc. And yes, we are talking about Datadog here. Would still like to understand how to reach the local Diego cell host given the fact that I can reach remote Diego cell hosts. – Jan Jun 28 '23 at 12:23
  • I believe it's part of the iptables rules that get generated for each container. It has been a long time since I've dug around there though. Maybe try dumping the firewall rules on one of your Diego cells and checking if that's still the case. – Daniel Mikusa Jun 29 '23 at 13:43
  • Thanks, @DanielMikusa for all your input. Will try to get our CF admins to do so. Can you point me roughly to where in the Cloud Foundry sources I can find the code that sets up these rules? – Jan Jun 29 '23 at 20:54
  • Not sure, sorry. – Daniel Mikusa Jun 30 '23 at 03:35

1 Answer

It turned out that this is possible using the following steps:

  • Configure the Garden jobs on the Diego cells with allow_host_access: true
  • Configure silk-daemon jobs on the Diego cells with host_tcp_access: [10.0.0.0/16:<targetPort>]
  • Set up a suitable application security group to allow access from containers to the target port.
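
For the security group step, here is a rough sketch of how the rules file could be generated. It assumes the usual ASG rule shape with `protocol`, `destination`, and `ports` keys; the function name `build_asg_rules`, the example port 8126, and the description text are placeholders I picked for illustration, so substitute your own cell network and target port.

```python
import ipaddress
import json


def build_asg_rules(destination: str, port: int, description: str = "") -> list:
    """Build a one-rule ASG definition allowing TCP to destination:port.

    Hypothetical helper: accepts a single IP or a CIDR block as destination
    (address ranges written as "a-b" are not handled here).
    """
    # Raises ValueError if destination is not a valid IP or CIDR.
    ipaddress.ip_network(destination, strict=False)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    rule = {"protocol": "tcp", "destination": destination, "ports": str(port)}
    if description:
        rule["description"] = description
    return [rule]


if __name__ == "__main__":
    # 10.0.0.0/16 mirrors the CIDR in the steps above; 8126 is only an example.
    rules = build_asg_rules("10.0.0.0/16", 8126, "traces to agent on Diego cell")
    print(json.dumps(rules, indent=2))
```

The printed JSON can be saved to a file and passed to `cf create-security-group`, then bound with `cf bind-security-group`. Depending on your CF version, the apps may need a restart before the new rules take effect.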
Jan