It seems that when the NebulaGraph cluster is deployed in containers (Docker Compose or K8s), I cannot get the Spark Connector to read the graph data properly, no matter what I try.
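For context, here is roughly the kind of read I am attempting, following the Spark Connector's documented Scala API. The meta address, space, tag, and column names below are placeholders for my setup:

```scala
import com.vesoft.nebula.connector.connector.NebulaDataFrameReader
import com.vesoft.nebula.connector.{NebulaConnectionConfig, ReadNebulaConfig}
import org.apache.spark.sql.SparkSession

object NebulaReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("nebula-read-sketch").getOrCreate()

    // Point the connector at metad; "192.168.1.10:9559" is a placeholder for
    // the host-mapped meta port of the containerized cluster.
    val connConfig = NebulaConnectionConfig
      .builder()
      .withMetaAddress("192.168.1.10:9559")
      .withConenctionRetry(2) // method name as spelled in the connector's builder API
      .build()

    // Placeholder space/tag/columns.
    val readConfig = ReadNebulaConfig
      .builder()
      .withSpace("basketballplayer")
      .withLabel("player")
      .withNoColumn(false)
      .withReturnCols(List("name", "age"))
      .withPartitionNum(10)
      .build()

    // This is the read that only works for me when Spark itself runs inside
    // the container network.
    val df = spark.read.nebula(connConfig, readConfig).loadVerticesToDF()
    df.show()
  }
}
```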
The workaround I used was to run Spark inside the container network:
- for the Docker Compose case, I put the Spark env in the same Docker network (see the compose sketch after this list)
- for K8s, I created a Spark env inside the K8s cluster
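For the Docker Compose case, the attachment looked roughly like this. This is only a sketch: the external network name and the Spark image are assumptions based on a default nebula-docker-compose deployment, so adjust them to whatever `docker network ls` actually shows:

```yaml
# Sketch: attach a Spark container to the same network as the NebulaGraph
# services, so the connector can reach metad/storaged by container address.
version: '3.5'
services:
  spark:
    image: bitnami/spark:2.4.5   # assumed Spark image/version
    networks:
      - nebula-net
networks:
  nebula-net:
    external: true
    name: nebula-docker-compose_nebula-net   # assumed default compose network name
```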
However, this is not always feasible, especially since in production we will typically have existing Spark infrastructure outside the cluster.
Could anyone explain what exactly is needed to make Spark, running outside the container network, work with NebulaGraph running in containers?