To understand the ring without vnodes, I set initial_token to 25 on Node 1 and 50 on Node 2, as shown below:

Address       Rack        Status State   Load            Owns                Token                                       
                                                                             50                                          
172.30.56.60  rack1       Up     Normal  82.08 KiB       100.00%             25                                          
172.30.56.61  rack1       Up     Normal  82.09 KiB       100.00%             50  

I expected that only partition keys in the range 0 to 50 could be added to the database, but Cassandra accepts any primary key / partition key value I provide, as follows (user_id is the primary / partition key):

 user_id    | user_name | user_phone
------------+-----------+------------
  999933333 |       ram | 9003934069
        111 |       ram | 9003934069
          1 |       ram | 9003934069
  111333333 |       ram | 9003934069
 1113333333 |       ram | 9003934069

Here, user_id is the primary / partition key.

Does this mean that the value provided in initial_token is the total number of tokens and not the partition range? If so, how is the partition range calculated?

Thanks, Harry

Harry
  • Cassandra hashes the partition key (Murmur3 by default) and selects the node based on the hash value. Note that it is the hash of the partition key, not the actual value, that is compared against the tokens. – undefined_variable Dec 04 '17 at 09:04
  • You should try to do a 'nodetool ring' for better understanding. – Simon Fontana Oscarsson Dec 04 '17 at 09:32
  • FYI, the above output is from nodetool ring. I also want to know: the hash values range from -2^63 to +2^63 - 1, so does that mean the values inserted above lie within the partition range of 0 to 50? – Harry Dec 04 '17 at 09:33
  • @SimonFontanaOscarsson Also could you please respond to this https://stackoverflow.com/questions/47628499/data-re-partioning-in-cassandra – Harry Dec 04 '17 at 09:36

1 Answer

The token number is a hash of the partition key. This decides where the data should be stored.

(ref: https://www.datastax.com/dev/blog/repair-in-cassandra) [image: token ring diagram]

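In this picture N0 is assigned token 0, N1 token 10, and so on. This makes N1 responsible for the token range 1-10. With RF 3, N1 also holds replicas of the two preceding ranges, so it is responsible for 81-10 instead. In your example you have told Cassandra that node 60 owns 51-25: the range wraps around the ring, covering every token above 50 up to the maximum, then from the minimum token up to 25. Node 61 owns only 26-50. The total number of tokens is fixed by the partitioner (2^64 for the default Murmur3Partitioner, 2^127 for RandomPartitioner), so node 60 now owns a huge amount of data compared to node 61.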
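To put a number on how lopsided this is, here is a minimal Python sketch that computes the size of each node's primary range. It assumes the default Murmur3Partitioner (tokens from -2^63 to 2^63 - 1) and that a node with token T owns the range (previous token, T]. The function name `primary_range_sizes` is made up for illustration and is not part of Cassandra or any driver.

```python
# Illustrative sketch only; not Cassandra code.
# Assumption: Murmur3Partitioner, so the ring holds 2**64 tokens in total.
RING_SIZE = 2**64


def primary_range_sizes(tokens):
    """Map each node to the number of tokens in its primary range.

    tokens: dict of node address -> initial_token (one token per node).
    A node with token T owns (previous_token, T], wrapping around the ring.
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    sizes = {}
    for i, (node, token) in enumerate(ordered):
        # For i == 0 the index -1 picks the last node, which is the wrap-around.
        prev_token = ordered[i - 1][1]
        # A single node owns the whole ring, hence the "or RING_SIZE".
        sizes[node] = (token - prev_token) % RING_SIZE or RING_SIZE
    return sizes


sizes = primary_range_sizes({"172.30.56.60": 25, "172.30.56.61": 50})
for node, size in sizes.items():
    print(f"{node}: {size} tokens ({size / RING_SIZE:.10%} of the ring)")
```

Under these assumptions node 61 ends up with 25 tokens and node 60 with all the rest, which is why evenly spaced initial tokens are normally recommended.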

Simon Fontana Oscarsson
  • I understand this point: if there are 3 nodes and RF = 3, all data will be on all nodes. My question is about the case above, where the token range is 0 to 50. Do you really think the values '999933333', '1', and '111333333' all lie within this token range? I have configured the token range as 0 to 50 (which I assume is the hash value range), so how is it possible to store all these values, which I believe are outside the token range? – Harry Dec 05 '17 at 06:18
  • I mean that the Murmur hash values of '999933333', '1', and '111333333' will not fall within the token range 0 to 50, so how is it possible for Cassandra to store these entries? Is there an online tool to calculate the hash value? @SimonFontanaOscarsson – Harry Dec 05 '17 at 07:59
  • also could you answer this question : https://stackoverflow.com/questions/47649172/data-structure-in-cassandra – Harry Dec 05 '17 at 08:43
  • @Harry I am sure. I just ran a stress test to prove it. You can do it yourself with and without your configuration to see it yourself. https://pastebin.com/utDar3LP – Simon Fontana Oscarsson Dec 05 '17 at 09:31
  • @Harry I don't understand your questions. A node always owns all tokens after the closest previous token, up to and including its own. In your case, node 60's token range starts at 51, since the closest previous token is 50, set on node 61. Node 60's token range therefore goes from 51 to 25. Think of it as a circle. – Simon Fontana Oscarsson Dec 05 '17 at 09:37
  • To make the question simple: http://murmurhash.shorelabs.com/ is an online Murmur hash converter. I used it to hash the value '999933333' and got '571110711'. So my point is: how can Cassandra store this data (999933333) on nodes 60 and 61, which should accept only hash values between 0 and 50? – Harry Dec 05 '17 at 09:55
  • @Harry You're still stuck on the idea that your cluster can only store data with tokens 0-50. Please read all of my answers again. – Simon Fontana Oscarsson Dec 05 '17 at 11:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/160509/discussion-between-harry-and-simon-fontana-oscarsson). – Harry Dec 05 '17 at 12:03
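
The "think of it as a circle" rule from the comments above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `owner` function is hypothetical, the ring holds one token per node (no vnodes), and the input is an already computed token. It does not reproduce Cassandra's Murmur3 hashing, so 571110711 below is simply the value quoted in the comments, used as an example token.

```python
from bisect import bisect_left

# Illustrative sketch only; not Cassandra code.
# The ring from the question, sorted by token: (initial_token, node address).
RING = [(25, "172.30.56.60"), (50, "172.30.56.61")]


def owner(token, ring=RING):
    """Return the node whose primary range contains the given token.

    The owner is the first node whose token is >= the given token. If no
    such node exists, the lookup wraps around to the node with the lowest
    token, which is what makes the token space a ring.
    """
    tokens = [t for t, _ in ring]
    index = bisect_left(tokens, token)
    return ring[index % len(ring)][1]


for example in (30, 50, 51, -5, 571110711):
    print(example, "->", owner(example))
```

Every token gets an owner: tokens 26 through 50 land on node 61, and everything else, including large positive and negative values, lands on node 60. No token is ever rejected for being "out of range".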