
I am working on a project in which I need to use the Cassandra database. I have a sample program that populates data into Cassandra, and I am using the Pelops client for that.

I am now thinking of writing a Singleton class that makes the connection to Cassandra, and then using that instance from my CassandraDAO to insert data into Cassandra and to retrieve it as well.

Below is the Singleton class I have built so far to make the connection to Cassandra:

public class CassandraConnection {

    private static CassandraConnection _instance;
    private String keyspace;
    private String[] seeds;
    private int port;
    private String poolName;

    public static synchronized CassandraConnection getInstance() {
        if (_instance == null) {
            _instance = new CassandraConnection();
        }
        return _instance;
    }

    private CassandraConnection() {
        setKeyspace(ICassandraDo.KEYSPACE_NAME);
        setSeeds(ICassandraDo.NODES).split(",");
        setPort(ICassandraDo.CASSANDRA_PORT);
        setPoolName(ICassandraDo.THRIFT_CONNECTION_POOL);

        createPool();
    }

    // Is this the right way to call `addPool` in Pelops?
    private void createPool() {
        Pelops.addPool(getPoolName(), getSeeds(), getPort(),
                false, getKeyspace(), new Policy());

    }

    private String setSeeds(String nodes) {

    // I am not sure what I am supposed to do here? 
    // Any guidance will be of great help

    }

    private void setPoolName(String thriftConnectionPool) {
        this.poolName = thriftConnectionPool;
    }

    private void setPort(int cassandraPort) {
        this.port = cassandraPort;
    }

    private void setKeyspace(String keyspaceName) {
        this.keyspace = keyspaceName;

    }

    public void setSeeds(String[] seeds) {
        this.seeds = seeds;
    }

    public String[] getSeeds() {
        return seeds;
    }

    public int getPort() {
        return port;
    }

    public String getKeyspace() {
        return keyspace;
    }

    public String getPoolName() {
        return poolName;
    }
}

Problem statement:

I have a few doubts about the code above.

  1. Firstly, what am I supposed to do in the setSeeds method of the class above? Any pointers or an example would be a great help.
  2. Secondly, I am not sure whether a Singleton class is the right approach here. What is the best way to manage a cluster connection with the Pelops client?
  3. Finally, what is the best way to use the addPool method in the code above? I suspect I got something wrong there, since I keep seeing different addPool overloads in the Pelops class. Which one should I use, keeping in mind that this will run in a production environment?

Once the Singleton class is ready, I plan to use it in my DAO code, something like this:

Mutator mutator = Pelops.createMutator(CassandraConnection.getInstance().getPoolName());
mutator.writeColumns(other data inside);

I would then use a selector in the same way to retrieve the data.

Just FYI, I am working with Cassandra 1.2.3 and the Scale7 Pelops client.

Any help will be appreciated. Thanks in advance.

Updated code:

Below is my updated code.

public class CassandraConnection {

    private static CassandraConnection _instance;
    private String keyspace;
    private String[] nodes;
    private int port;
    private String poolName;


    public static synchronized CassandraConnection getInstance() {
        if (_instance == null) {
            _instance = new CassandraConnection();
        }
        return _instance;
    }

    private CassandraConnection() {
        setKeyspace(ICassandraDo.KEYSPACE_NAME);
        setNodes(ICassandraDo.NODES);
        setPort(ICassandraDo.CASSANDRA_PORT);
        setPoolName(ICassandraDo.THRIFT_CONNECTION_POOL);

        createPool();
    }


    private void createPool() {
        Pelops.addPool(getPoolName(), getCluster(), getKeyspace());

    }

    private Cluster getCluster() {

        Config casconf = new Config(ICassandraDo.CASSANDRA_PORT, true, 0); 

        Cluster cluster= new Cluster(nodes, casconf, ICassandraDo.NODE_DISCOVERY);

        return cluster; 
    }


    private void setPoolName(String thriftConnectionPool) {
        this.poolName = thriftConnectionPool;
    }

    private void setPort(int cassandraPort) {
        this.port = cassandraPort;
    }

    private void setKeyspace(String keyspaceName) {
        this.keyspace = keyspaceName;

    }

    private void setNodes(String nodes) {
        this.nodes = nodes.split(",");
    }

    public int getPort() {
        return port;
    }

    public String getKeyspace() {
        return keyspace;
    }

    public String getPoolName() {
        return poolName;
    }
}

Just FYI, in my case I am going to have two clusters, each with 12 nodes.

Can anyone take a look and let me know whether I got everything right? Thanks for the help.

arsenal

1 Answer

1) Seed nodes are two (or more, but two is the number suggested by the Cassandra documentation) nodes of your cluster. Each node's configuration file (cassandra.yaml) lists the addresses of the cluster's seed nodes. Imagine you have a cluster of 5 nodes:

192.168.1.100 192.168.1.101 192.168.1.102 192.168.1.103 192.168.1.104

Each config file will then contain, for instance:

Seeds 192.168.1.101 192.168.1.103

For this cluster these 2 addresses are the seed nodes. At startup, each node of the cluster contacts these 2 nodes to get the information it needs. In your example you can pass the addresses found in the configuration, or just a couple of node addresses from the cluster:

String[] nodes = new String[2];
// Java arrays are zero-indexed, so a two-element array uses indices 0 and 1
nodes[0] = "192.168.1.101";
nodes[1] = "192.168.1.103";
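For the `setSeeds(String)` method in the question, here is a minimal sketch of one way to turn a comma-separated host list into the `String[]` of contact nodes. The class name `SeedParser` and the method `parseNodes` are hypothetical names chosen for illustration, and I am assuming the constant holds the addresses as one comma-separated string, as the `split(",")` call in the question suggests:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pelops.
public class SeedParser {

    // Splits a comma-separated host list into an array, trimming
    // whitespace around each entry and skipping empty entries.
    public static String[] parseNodes(String commaSeparated) {
        List<String> nodes = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                nodes.add(host);
            }
        }
        return nodes.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Assumed input format: addresses separated by commas
        String[] nodes = parseNodes("192.168.1.101, 192.168.1.103");
        for (String node : nodes) {
            System.out.println(node);
        }
    }
}
```

Trimming matters because a plain `split(",")` keeps the space after each comma, and an address with a leading space is unlikely to resolve.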

2) The Singleton is unnecessary, since the Pelops class consists only of static members. If your application has an init/startup step, just set up the connection to Cassandra there and it will be available throughout your code.

3) There is no single correct answer; the right way to connect to a cluster depends on the cluster. You may need to set custom parameters or keep the Pelops defaults. In my production environment (5 nodes, RF=3) I'm using the default parameters without problems.

Ciao

Carlo Bertuccini
  • Thanks Carlo for the suggestion. Some of it makes sense to me. Firstly, in my current example, what lines am I supposed to add in my `setSeeds` method? I believe we need to add a Cluster line there, right? Can you provide an example of what I should add, just to get a better idea? Secondly, take a look at my `createPool` method; I believe I have messed something up there. I cannot find any addPool method with my signature. Can you provide an example for that as well, based on whatever you are using in your production environment, so I can understand it better? – arsenal Apr 09 '13 at 21:56
  • I updated my question with the latest code. Can you take a look and let me know if everything looks good? – arsenal Apr 10 '13 at 00:05
  • You changed your `setSeeds` to `setNodes`: correct. In my prod env I have one cluster, but nothing should change from the dev point of view. Here is the code I use to connect: `String[] nodes = cfg.getStringArray("cassandra.servers"); int port = cfg.getInt("cassandra.port"); boolean dynamicND = true; // dynamic node discovery Config casconf = new Config(port, true, 0); Cluster cluster = new Cluster(nodes, casconf, dynamicND); Pelops.addPool(Const.CASSANDRA_POOL, cluster, Const.CASSANDRA_KS);` If you have two clusters, I think the best way is to give Pelops 4 nodes, 2 from each cluster. HTH, Carlo – Carlo Bertuccini Apr 10 '13 at 06:36
  • Thanks Carlo. I will have two clusters, each with 12 nodes. You said I should give Pelops 4 nodes, two from each cluster, right? On what basis did you decide on 4 nodes? I was under the impression that if I have 24 nodes then I should add all 24 nodes to Pelops. Is that not true? Correct me if my understanding is wrong. – arsenal Apr 11 '13 at 05:41
  • Any thoughts? I am in the learning process so any guidance will be of great help. – arsenal Apr 11 '13 at 17:47
  • From the Pelops docs: _To create a pool, you need to specify a name, a list of known contact nodes (**the library can automatically detect further nodes in the cluster, but see notes at the end**)_ -- so you don't need to provide the complete list. From the Cassandra docs: _Cassandra nodes exchange information about one another using a mechanism called Gossip, but to get the ball rolling a newly started node needs to know of at least one other, this is called a Seed. It's customary to pick a **small number of relatively stable nodes to serve as your seeds**!_ That's where my "tip" comes from. – Carlo Bertuccini Apr 12 '13 at 06:45