If I am not wrong, one can connect to a Cassandra cluster knowing at least one of the nodes that is in the cluster, and then the others can be discovered.

Let's say I have three nodes (1, 2 and 3) and I connect to them like this:

Cluster.builder().addContactPoints("1,2,3".split(",")).build();

Then, if node 3 for example goes down, and the IP cannot be resolved, this line of code will throw an IllegalArgumentException as stated in the docs:

@throws IllegalArgumentException if no IP address for at least one of {@code addresses} could be found

Why would anyone want this behavior? If one of the nodes is down, I want the app to be able to run, since Cassandra itself is still working fine.

I have checked "Cassandra Java driver: how many contact points is reasonable?", but that does not answer my question, as it doesn't say anything about hosts that can't be reached.

How should I handle this? Has this changed in another version of the Java driver? I am currently using cassandra-driver-core-3.0.3.

Pablo Matias Gomez

3 Answers

This validation only makes sure that all the provided hosts can be resolved; it doesn't even check whether a Cassandra server is running on each host. It is basically there to ensure that you did not make any typos when providing the hosts, because the driver does not treat a provided host that cannot be resolved as a normal use case.

As a workaround in your case (a host having been removed from the DNS entries), you could call the method addContactPoint(String address) explicitly instead of using addContactPoints(String... addresses) (which behind the scenes simply calls addContactPoint(String address) for each provided address) and handle the exception yourself.

The code could be something like this:

Cluster.Builder builder = Cluster.builder();
// Boolean used to check if at least one host could be resolved
boolean found = false;
for (String address : "1,2,3".split(",")) {
    try {
        builder.addContactPoint(address);
        // One host could be resolved
        found = true;
    } catch (IllegalArgumentException e) {
        // This host could not be resolved so we log a message and keep going
        Log.log(
            Level.WARNING, 
            String.format("The host '%s' is unknown so it will be ignored", address)
        );
    }
}
if (!found) {
    // No host could be resolved so we throw an exception
    throw new IllegalStateException("All provided hosts are unknown");
}
Cluster cluster = builder.build();

FYI: I've just created a ticket proposing an improvement to the Java driver: https://datastax-oss.atlassian.net/browse/JAVA-1334

Nicolas Filotto

As Nick mentioned, it's based on DNS resolution, not Cassandra server health.

If you remove hosts from your environment more often than you recompile your application, then you should consider not baking your contact points into the code and instead feeding them in through some other means (an environment variable, a REST service, a single DNS name that always resolves to one live seed, etc.).
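As a minimal sketch of the environment-variable option, the snippet below reads a comma-separated host list at startup. The variable name CASSANDRA_CONTACT_POINTS and the class name ContactPointConfig are made up for this illustration; they are not defined by the driver.

```java
import java.util.ArrayList;
import java.util.List;

public class ContactPointConfig {
    // Hypothetical variable name chosen for this example; the driver does not read it.
    static final String ENV_VAR = "CASSANDRA_CONTACT_POINTS";

    /** Splits a comma-separated host list, trimming whitespace and dropping empty entries. */
    static List<String> parse(String raw) {
        List<String> hosts = new ArrayList<>();
        if (raw == null) {
            return hosts;
        }
        for (String part : raw.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        List<String> hosts = parse(System.getenv(ENV_VAR));
        if (hosts.isEmpty()) {
            System.out.println(ENV_VAR + " is not set or contains no hosts");
            return;
        }
        // With the driver on the classpath, the list could then be handed over, e.g.:
        // Cluster.builder().addContactPoints(hosts.toArray(new String[0])).build();
        System.out.println("Contact points: " + hosts);
    }
}
```

This only moves the host list out of the compiled code. A name that no longer resolves would still have to be removed from the variable, or skipped as in the loop from the answer above.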

Jeff Jirsa
  • Our environment really does change more often than our code: we have a private cloud and our VMs can change dynamically. You have a point that we could try to implement a workaround on our end, but I also think this validation could be fixed, as the cluster would still work in the state Pablo describes, but the app won't start up. – juan Nov 15 '16 at 21:01
  • If you don't want to implement a workaround, I would recommend filing an issue with the Java driver: https://datastax-oss.atlassian.net/projects/JAVA/summary. I think the behavior of only requiring one hostname to be resolvable, and discarding any that aren't with a warning, is probably reasonable. In any case I think you have the answer to your question. Stack Overflow bounties aren't really meant for getting code changes, but I'd definitely recommend submitting a patch along with a ticket to get a fix released sooner. – nickmbailey Nov 15 '16 at 21:15

The documentation there only refers to "resolving" the contact points that are passed in, that is, converting hostnames to IP addresses. If you specify IP addresses to begin with, they will not be resolved, simply checked for validity. If you use hostnames, then each contact point needs to be resolvable. This doesn't mean that the Cassandra machine needs to be running, just that a DNS lookup on the hostname returns an IP address. So the case where things would break is if you removed a DNS entry for one of your contact points and restarted your application.
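The difference between an IP literal and a hostname can be seen with the JDK's own InetAddress class, independent of the driver. This is only a sketch: it assumes the driver relies on the JDK's standard name resolution, and the class and method names (ResolutionCheck, isResolvable) are invented for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolutionCheck {
    /** Returns true if the JDK can turn the given string into an address. */
    static boolean isResolvable(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // An IP literal is only checked for a valid format; no DNS lookup is performed.
        System.out.println("1.1.1.1 -> " + isResolvable("1.1.1.1"));
        // A name requires a lookup; the result depends on the local DNS configuration.
        System.out.println("asdasd -> " + isResolvable("asdasd"));
    }
}
```

For a literal such as "1.1.1.1", getByName succeeds whether or not anything is listening at that address, which matches the point that resolution says nothing about Cassandra being up.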

nickmbailey
  • Yes I know, I quoted that. I am looking for a solution in which if a node is not reachable, the code works. – Pablo Matias Gomez Sep 28 '16 at 21:39
  • I'm saying that it already accomplishes that. An exception is only thrown if *all* of the contact points specified are not reachable. If a subset of contact points are not reachable things will still work. – nickmbailey Sep 28 '16 at 21:46
  • No, I think you are wrong. The text clearly says "at least one of", which means that if no IP address is found for ONE of them, it throws an exception. I checked the code and it's like this. I also tested adding "asdasd" as one more IP and it crashes. – Pablo Matias Gomez Sep 29 '16 at 00:03
  • You should check that the phrase says "if **no** IP address ... **could be found**." – Pablo Matias Gomez Sep 29 '16 at 00:04
  • Yeah, that doc message is just confusing; I'll see if it can be updated. In any case, it only refers to whether or not the hostnames passed in are resolvable, not reachable. So passing in valid hostnames or IPs will still work fine, even if Cassandra isn't running. – nickmbailey Sep 29 '16 at 15:00
  • Yes, I understand that; that's why I am asking about hosts that cannot be resolved (because the IPs do not exist anymore). If I have a configuration of 3 hosts (1, 2, 3) and number 3 is down (its IP no longer exists), then I won't be able to start my app, and I don't want that. Am I clear now? Maybe I didn't explain this very well in my question. – Pablo Matias Gomez Sep 29 '16 at 15:10
  • Yes, I understand now. If you use actual IP addresses, things will work. An IP address string isn't 'resolved'; it is just checked for validity. You can test it by putting in a fake IP like "1.1.1.1". The reason your example of "asdasd" didn't work is that it isn't an IP address but a hostname, and it isn't resolvable. So if you are using hostnames for your Cassandra nodes and remove the DNS entries for some of those hostnames, then yes, the driver will fail to start up. I don't think that is a common case, but you could definitely file a ticket against the Java driver. I edited my answer accordingly. – nickmbailey Sep 29 '16 at 19:34
  • Yes, that is what I was trying to say. I was previously using Hector as a connector to Cassandra and this wasn't a problem. In our current infrastructure, it is common to have a machine removed, and thus its DNS entry removed. – Pablo Matias Gomez Sep 29 '16 at 20:40