I have a text file with P random entries in binary (or hex) to process. From those P entries, I have to pick N that are as different from each other as possible, so that I get a good representative sample of the possible population.
So far, I have thought of comparing each candidate entry with an average of the array that holds the elements selected so far, using a modified version of the algorithm in: How do I calculate similarity of two integers?
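To make the "average" idea concrete, here is a minimal sketch of one way it could work. It assumes 8-bit entries and takes the "average" to be the fraction of selected entries that have a 1 in each bit position. The names `bit_average` and `distance_to_average` are made up for illustration and are not from the linked question.

```python
def bit_average(values: list[int], width: int = 8) -> list[float]:
    """Fraction of values that have a 1 in each bit position (MSB first)."""
    return [
        sum((v >> bit) & 1 for v in values) / len(values)
        for bit in range(width - 1, -1, -1)
    ]


def distance_to_average(candidate: int, average: list[float]) -> float:
    """Sum over bit positions of |candidate bit - average bit|."""
    width = len(average)
    return sum(
        abs(((candidate >> (width - 1 - i)) & 1) - avg)
        for i, avg in enumerate(average)
    )


# Assumption: these two entries have already been selected.
chosen = [0b00011111, 0b10010000]
avg = bit_average(chosen)
print(distance_to_average(0b00101110, avg))  # 4.5
print(distance_to_average(0b00011000, avg))  # 2.5
```

A higher value means the candidate is further from what has already been picked. With this definition the score equals the summed Hamming distance to every selected entry divided by the number of selected entries, so it ranks candidates the same way a cumulative Hamming score would.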
Alternatively, I could keep a cumulative similarity score (the higher, the more different) between each candidate and all the elements already in the array, choose the candidate with the highest score as the next element, and repeat until the required N have been selected.
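The cumulative-score idea could be sketched as a greedy loop like the one below. This is only an illustration under some assumptions: the score is the Hamming distance (number of differing bits), the first entry is used as the seed, and the helper names `hamming` and `pick_diverse` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def pick_diverse(entries: list[int], n: int) -> list[int]:
    """Greedily pick n entries, each one maximising the summed
    Hamming distance to the entries already chosen."""
    if n <= 0 or not entries:
        return []
    remaining = list(entries)
    # Assumption: seed with the first entry; another seed may give another result.
    chosen = [remaining.pop(0)]
    while remaining and len(chosen) < n:
        # max() keeps the first candidate on a tie, so input order matters.
        best = max(remaining, key=lambda e: sum(hamming(e, c) for c in chosen))
        remaining.remove(best)
        chosen.append(best)
    return chosen


entries = [0b00011111, 0b00101110, 0b11111111, 0b01001010,
           0b00011000, 0b10010000, 0b01110101]
result = pick_diverse(entries, 3)
print([format(x, "08b") for x in result])
# ['00011111', '10010000', '00101110']
```

On this sample data the last pick is a four-way tie that max() breaks by input order. A greedy loop like this is not guaranteed to find the best possible subset, it only makes the locally best choice at each step.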
I do not know whether there is a better solution to this.
Example:
[00011111, 00101110, 11111111, 01001010, 00011000, 10010000, 01110101]
P = 7, N = 3
Result: [00011111, 10010000, 00101110]
Thanks in advance