I am receiving a java.net.UnknownHostException: postgres-service on a machine where I can ping postgres-service from the command line. This is in the context of Kubernetes (more specifically GKE) services and Docker images. Could it be that Java requires additional packages (compared to ping) to be installed before it can resolve symbolic host names such as postgres-service? I now suspect the answer is no, and that the problem lies with resolving postgres-service via kube-dns in this particular situation (see UPDATE).

UPDATE The evidence (including the stack trace below) suggests that the exception is triggered when Tomcat 9 tries to set up a JDBC realm with connectionURL="jdbc:postgresql://postgres-service/mydb". The URL is configured in the context descriptor of a web app, which runs inside a Docker image derived from tomcat:9. The context descriptor is generated by a script configured as the image's ENTRYPOINT, which also starts Tomcat (just like the original tomcat:9 does), i.e. the last few lines of the Dockerfile look as follows:

COPY tomcat-entrypoint.sh /
ENTRYPOINT [ "/tomcat-entrypoint.sh" ]
CMD ["catalina.sh", "run"]

I can ping postgres-service after entering a shell with kubectl exec -it <image> bash. Could it be that Tomcat (when run as the image's "single process" with pid 1 by way of the Dockerfile's CMD) sees a different DNS configuration than a bash process that runs as its sibling? The actual DNS configuration employs kube-dns, as is apparent from /etc/resolv.conf.

org.postgresql.util.PSQLException: The connection attempt failed.
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:280)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:211)
    at org.postgresql.Driver.makeConnection(Driver.java:407)
    at org.postgresql.Driver.connect(Driver.java:275)
    at org.apache.catalina.realm.JDBCRealm.open(JDBCRealm.java:661)
    at org.apache.catalina.realm.JDBCRealm.startInternal(JDBCRealm.java:724)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:152)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5054)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:152)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:724)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:700)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:734)
    at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:596)
    at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1805)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: postgres-service
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.postgresql.core.PGStream.<init>(PGStream.java:64)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:150)
    ... 19 more
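
One way to tell "Java resolves names differently from ping" apart from "the name is not resolvable in this pod at all" is to ask the JVM's own resolver directly, outside Tomcat. The following is only a minimal sketch: the class name ResolveCheck and its resolve helper are made up for illustration, and it assumes the class can be compiled and run inside the same container that runs Tomcat.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic class, not part of Tomcat or the PostgreSQL driver.
public class ResolveCheck {

    // Returns every address the JVM resolver finds for the host,
    // or an empty list if the lookup fails with UnknownHostException.
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Same exception type as in the stack trace above; report as "no addresses".
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Defaults to the service name from the question.
        String host = args.length > 0 ? args[0] : "postgres-service";
        List<String> addresses = resolve(host);
        if (addresses.isEmpty()) {
            System.out.println(host + " could not be resolved by the JVM");
        } else {
            System.out.println(host + " -> " + addresses);
        }
    }
}
```

If this prints an address while Tomcat still throws UnknownHostException, the difference is more likely in how or when Tomcat is started than in Java's resolver as such. If it fails as well, the problem is probably in the pod's DNS setup or in the service definition.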
Drux
  • It depends on your name resolution configuration. Also, what code triggers this exception? – fge Aug 26 '16 at 21:58
  • @fge I've added more information re the code that triggers the exception. The name resolution configuration is provided by Kubernetes 1.3, if I am not mistaken. – Drux Aug 27 '16 at 04:40
  • You say that in /etc/resolv.conf you have an entry for kube-dns; can you resolve the address of this server from your VM/container/whatever? – fge Aug 27 '16 at 14:59
  • @fge Yep, e.g. by executing `ping postgres-service` from inside the other pod. – Drux Aug 27 '16 at 15:03
  • I was not talking about this, sorry for the misinterpretation; I was wondering whether you could resolve the address of `kube-dns` itself. It may be that the `ping` program and Java use a different name configuration. – fge Aug 27 '16 at 16:17
  • @fge The pod's `/etc/resolv.conf` refers to a nameserver by a numeric IP address. `nslookup` indicates this IP address is equivalent to `kube-dns.kube-system.svc.cluster.localkube-dns.kube-system.svc.cluster.local.` Thx for your support. – Drux Aug 27 '16 at 19:22
  • Glad to see you solved your problem... The FQDN of that machine is very strange though (looks like `kube-system.svc.cluster` is "duplicated"). – fge Aug 29 '16 at 06:58
  • @fge Sorry, I mistyped. It is actually `kube-dns.kube-system.svc.cluster.local.` now. – Drux Aug 29 '16 at 07:45

1 Answer

Until now I had been using a VM instance without the compute-rw scope for development (see here). I have now recreated it with that scope and rebuilt all relevant Docker images there. This appears to have resolved the issue.

UPDATE There was also a second issue in that I had clusterIP: None as part of the service specification of postgres-service (now gone). It beats me why I was still able to ping postgres-service from another pod in the same cluster.

Drux