0

Is there a way to calculate and generate a unique identifier for a given range of Int32 numbers? e.g. see the example please:

var range1 = [a, b, c, d];
var unique1 = GenerateUnique(range1);

var range2 = [b, a, d, c];
var unique2 = GenerateUnique(range2);

// unique1 and unique2 should be equal. I mean:
var areEqual = unique1 == unique2; // should be true


var range3 = [a, b, c, d];
var unique3 = GenerateUnique(range3);

// unique1 and unique3 should be equal. I mean:
areEqual = unique1 == unique3; // should be true


var range4 = [a, b, c, e]; // it has not d, but has e. different array.
var unique4 = GenerateUnique(range4);

// unique1 and unique4 should NOT be equal. I mean:
areEqual = unique1 == unique4; // should NOT be true / should be false


private int /* or even string */ GenerateUnique(int[] range) {
    // what would be the implementation of this method?
} 

UPDATE:

1- The ranges doesn't have any duplicate members. So there wouldn't be a [1, 2, 3, 4, 4] set. So, there is always [1, 2, 3, 4]

2- Array can be sorted. No problem.

3- The purpose: I have a huge number of items in DB. Sometimes, I need to generate a Word document of them. And, I don't want to regenerate a previously generated document. That's it. Ranges are modifiable, so I want to regenerate a document for a certain range, if and only if an item got added/removed to/from range. NOTE: Changing database and saving a range's last-change is not an option.

Community
  • 1
  • 1
amiry jd
  • 27,021
  • 30
  • 116
  • 215
  • You tickled my interest, what is the problem you are trying to solve when you achieve this unique identifier? What is the purpose? Also, should [1,2,3,4,4] result in the same hash as [1,2,3,4] ? – Michiel Cornille Apr 27 '15 at 15:19
  • Do you need to check for both: equality and non-equality? If you need to check only for non-equality, you can go with Janothon's approach, otherwise there might be some more logic needed. We've had some similar problem some time ago. I will check if I can grab the sources... – Stephen Reindl Apr 27 '15 at 15:23
  • Why not just sort and then concatenate? `[1,2,3,4] -> 1234` `[3,2,4,1] -> 1234`. Or am I missing something? I guess it depends on how large your ranges (and the numbers within them) get. – germi Apr 27 '15 at 15:25
  • @germi, because page 12 could be misinterpreted as page "1,2" but just ordering them without removing the comma should work .. – Michiel Cornille Apr 27 '15 at 15:31
  • @Mvision D'oh. I knew I was overlooking something. Thanks ;) – germi Apr 27 '15 at 15:37

1 Answers1

4

I think sorting the list of integers, and then using any standard hash function (MD5, SHA1, even CRC32 if you want) would work perfectly.

Jonathon Reinhart
  • 132,704
  • 33
  • 254
  • 328
  • hmm. There's a long discussion on http://stackoverflow.com/questions/4014090/is-it-safe-to-ignore-the-possibility-of-sha-collisions-in-practice about probabilities on hash functions. I assume that a check on the hash **and the contents** afterwards might be a bulletproof approach. – Stephen Reindl Apr 27 '15 at 15:20
  • Thanks to idea. Actually I have millions of ranges with a lot of items in them, so I can not create a unit test which help me to test all available possibilities. So, I cannot use `I think` approaches. I'm looking for a 100% working suggestion. Have you any idea please? – amiry jd Apr 27 '15 at 15:21
  • @Javad_Amiry *I think* you should code it up and try it. Obviously you can't unit test for every possible combination, that's absurd. Just test for the general case, and all corner cases you can come up with. Nothing you find on StackOverflow should *ever* be considered a *"100% working suggestion"*. – Jonathon Reinhart Apr 27 '15 at 15:26
  • @JonathonReinhart you are absolutly right. I'm not a good English talker, So... By `100%` working solution, I mean a solution which somebody else has used it before. I'm not sure how to tell :) – amiry jd Apr 27 '15 at 15:35
  • @TheodorosChatzigiannakis see my previous comment please. There is a misunderstanding :) – amiry jd Apr 27 '15 at 15:38