When one Java application instance talks to multiple Gremlin servers (say, multiple graph instances in CosmosDB), what is the recommendation for creating and caching Gremlin clients? On my Mac, the maximum number of Cluster instances I can create appears to be fewer than 300. Beyond that I get a "Too many open files" exception.

caused by: java.io.IOException: Too many open files

java.lang.IllegalStateException: failed to create a child event loop

at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:88)
at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:86)
at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:81)
at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:68)
at org.apache.tinkerpop.gremlin.driver.Cluster$Factory.<init>(Cluster.java:1065)

Is there a way around it? This is what is being done for many different values of graphUserName:

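import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;

// One Cluster (and one Client) is built per graphUserName.
final Cluster cluster = Cluster.build()
        .addContactPoint("host")              // placeholder host name
        .port(443)
        .credentials(graphUserName, "graph-password")
        .serializer("serializer")             // placeholder serializer
        .enableSsl(true)
        .create();
final Client client = cluster.connect();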
ab m

1 Answer

You should be creating one Cluster object for the life of your application. Assuming you are using sessionless requests, you would then also only create one Client instance from that Cluster and re-use it. There may of course be situations where a single Cluster object may not be sufficient. There are driver configuration options that can only apply to a Cluster object and not the Client instances it spawns. For example, if you have multiple authentication methods or are connecting to different servers, those settings are bound to the Cluster which would then require you to have several of those objects.

While you should always manage your driver settings with care, opening a large number of Cluster objects with default settings (which is where most people start) will likely trigger the `java.io.IOException: Too many open files` error that you are seeing. Each Cluster object opens many network resources, and unless you have changed the default settings in your OS you will likely exceed the open file limit. This error is an operating-system-level issue, and there are many resources on the internet for solving it.
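
Since the limit is enforced by the operating system, it can help to watch how close the JVM is to it while experimenting with driver settings. The following is a minimal sketch, not part of TinkerPop: `FdUsage` is a hypothetical helper name, and it assumes a HotSpot-style JVM on a Unix-like OS, where the platform MXBean exposes file descriptor counts.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not a TinkerPop class): reports how many file
// descriptors this JVM currently has open and what the process limit is.
class FdUsage {

    // Returns {open, max}, or null when the JVM does not expose the counts
    // (for example on Windows).
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] fds = snapshot();
        if (fds == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open file descriptors: " + fds[0] + " of " + fds[1]);
        }
    }
}
```

Calling `FdUsage.snapshot()` before and after creating a Cluster should give a rough idea of how many descriptors each instance costs with the current settings.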

stephen mallette
  • Thanks for your response, Stephen. I understand the best practice. However, let's say a multi-tenant application talks to multiple databases: we do need one Gremlin client per database, don't we? Is there a way around it? So if my app can talk to 100 databases, I'd need 100 Gremlin clients. – ab m Apr 23 '21 at 11:25
  • theoretically, i'd have to agree with that. if there are 100 different databases, you would likely need one `Client` instance for each depending on what you're doing. maybe you could get away with one `Client` for your whole application if you didn't need to close that `Client` per tenant - not sure what your requirements are really. Anyway, if you must do this approach then the error message is something you should google for. It's more of a network issue than a TinkerPop issue: https://stackoverflow.com/a/20408260/1831717 – stephen mallette Apr 23 '21 at 16:21
  • thank you again - this is helpful. my requirement is to support one application talking to multiple graph databases in the most efficient way possible. I do plan to keep a cache of clients created for each tenant and evict and close them on inactivity. I know I'll start hitting a wall at 200-300 such clients but wanted to check if there is any other efficient way I can try. – ab m Apr 23 '21 at 16:26
  • i'd say the `Cluster` is the most expensive piece to tear up/down so keeping one of those is best. `Client` is cheaper, but you would want to find the right connection pool size for each one of those. i believe it is the connection pool size that will be the thing that triggers your ulimit errors as each connection will be an open web socket connection – stephen mallette Apr 23 '21 at 16:40
  • Makes sense. I've been testing out different combinations to nioPoolSize on the Cluster. Will check the right connection pool size as well. Thanks for the pointers. – ab m Apr 23 '21 at 18:06
  • Looking at how a `Client` is created from a `Cluster`, it looks like for each Graph, we need to have a Cluster instance. Each Cluster has to have its own auth properties, so if we need to connect to different Graphs each with their own credentials, we need to have as many Cluster instances. Seems right? – ab m Apr 26 '21 at 16:21
  • yes - if you have different auth then you will need one `Cluster` per auth. – stephen mallette Apr 26 '21 at 17:06
  • Thank you. Can you please post your 2nd comment as an answer so I can approve it? The one where you say that theoretically I need to create multiple clusters and clients. – ab m Apr 27 '21 at 01:19
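
The per-tenant cache with eviction on inactivity that is discussed in the comments could be sketched roughly as follows. This is an illustration under assumptions, not a TinkerPop API: `IdleEvictingCache` is a hypothetical class, `V` stands in for whatever object holds a tenant's `Cluster` and `Client`, and the clock value is passed in by the caller to keep the example simple.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical per-tenant cache (not a TinkerPop class). It caps the number
// of live entries and closes entries that have been idle for too long.
class IdleEvictingCache<K, V> {

    private static final class Entry<V> {
        final V value;
        long lastUsedMillis;
        Entry(V value) { this.value = value; }
    }

    // accessOrder = true keeps the least recently used entry first.
    private final Map<K, Entry<V>> entries = new LinkedHashMap<>(16, 0.75f, true);
    private final int maxEntries;
    private final long maxIdleMillis;
    private final Function<K, V> factory;   // creates the value on a cache miss
    private final Consumer<V> closer;       // releases the value on eviction

    IdleEvictingCache(int maxEntries, long maxIdleMillis,
                      Function<K, V> factory, Consumer<V> closer) {
        this.maxEntries = maxEntries;
        this.maxIdleMillis = maxIdleMillis;
        this.factory = factory;
        this.closer = closer;
    }

    // Returns the cached value for the key, creating it on a miss.
    synchronized V get(K key, long nowMillis) {
        evictIdle(nowMillis);
        Entry<V> entry = entries.get(key);  // also marks the key as most recently used
        if (entry == null) {
            if (entries.size() >= maxEntries) {
                evictLeastRecentlyUsed();
            }
            entry = new Entry<>(factory.apply(key));
            entries.put(key, entry);
        }
        entry.lastUsedMillis = nowMillis;
        return entry.value;
    }

    synchronized int size() {
        return entries.size();
    }

    // Closes and removes every entry that has been idle longer than maxIdleMillis.
    private void evictIdle(long nowMillis) {
        Iterator<Entry<V>> it = entries.values().iterator();
        while (it.hasNext()) {
            Entry<V> entry = it.next();
            if (nowMillis - entry.lastUsedMillis > maxIdleMillis) {
                it.remove();
                closer.accept(entry.value);
            }
        }
    }

    // Closes and removes the least recently used entry, if there is one.
    private void evictLeastRecentlyUsed() {
        Iterator<Entry<V>> it = entries.values().iterator();
        if (it.hasNext()) {
            Entry<V> eldest = it.next();
            it.remove();
            closer.accept(eldest.value);
        }
    }
}
```

In an application like the one described here, the factory would presumably build a `Cluster` with the tenant's credentials and the closer would close it, with `maxEntries` chosen to stay safely below the open file limit. A call would then look like `cache.get(graphUserName, System.currentTimeMillis())`.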