3

I have an application where i generate unique combinations of numbers like where each combination is unique.I want to store the combinations in a way that i should be able to retrieve a few random combinations efficently later.

Here is an example of one combination

35 36 37 38 39 40 50

There are ~4 millions of combinations in all.

How should I store the data so that i can retrieve combinations later?

user2650277
  • 6,289
  • 17
  • 63
  • 132
  • 1
    On what basis you retrieve? – Azodious Dec 14 '16 at 10:46
  • @Azodious i just retrieve randomly but i don't want to retrieve a combination twice per execution – user2650277 Dec 14 '16 at 10:47
  • So you have around 4mln combination in database and want to retrieve a relatively small number of distinct combinations for processing? – lexicore Dec 14 '16 at 10:50
  • Why don't you make it a list of lists of integers? Then to get a random combination you generate a random integer between 0 and `list_length - 1`: This is your index. – haffla Dec 14 '16 at 10:52
  • @lexicore I think a self contained db like sqlite would do but i am not sure what datatype to use or what the db strcuture should be – user2650277 Dec 14 '16 at 10:55
  • You can make 40 chunks of 100000 combinations each. And pick a different combination each time from 1st chunk for 10000 times. thereafter replace chunk and pick 10000 times. and so on.... – Azodious Dec 14 '16 at 10:57
  • does the order matter , like 35 36 37 38 39 40 50 is not equal to 35 36 37 38 39 50 40 ?? – Juliyanage Silva Dec 14 '16 at 11:03
  • @Shamitha Silva order does not matter..i don't want duplicates here – user2650277 Dec 14 '16 at 11:10
  • @user2650277 Do you select numbers from a certain set? Do all combinations have the same length? – lexicore Dec 14 '16 at 11:12
  • @lexicore yes to both – user2650277 Dec 14 '16 at 11:19
  • @user2650277 Then I'd generate the sequence from some `long index` instead of storing. There are most probably algorithms for this which would allow you to reliably generate unique sequence for some index. You will not need to store sequence numbers at all then. – lexicore Dec 14 '16 at 11:23

3 Answers3

3

Since your combinations are unique and you actually don't have a query criteria on your numbers, it does not matter how you store them in the database. Just insert them in some table. To retrieve X random combinations simply do:

SELECT * FROM table ORDER BY RANDOM() LIMIT X

See:

Select random row(s) in SQLite

On storing array of integers in SQLite:

Insert a table of integers - int[] - into SQLite database,

Community
  • 1
  • 1
lexicore
  • 42,748
  • 17
  • 132
  • 221
  • Should i store each combination as a json with a VARCHAR as datatype – user2650277 Dec 14 '16 at 11:01
  • @user2650277 This depends on what you actually want to do with your combinations in queries. If you don't do anything then I'd take the simplest possible approach and just store them as a comma-separated string. – lexicore Dec 14 '16 at 11:06
2

I think there might be a different solution; in the sense of: do you really have to store all those combinations?!

Assuming that those combinations are just "random" - you could be using some (smart) maths, to some function getCombinationFor(), like

public List<Integer> getCombinationFor(long whatever)

that uses a fixed algorithm to create a unique result for each incoming input.

Like:

getCombinationFor(0): gives 0 1 2 3 10 20 30

getCombinationFor(1): gives 1 2 3 4 10 20 30 40

The above is of course pretty simple; and depending on your requirements towards those sequences you might require something much complicated. But: for sure, you can define such a function to return a permutation of a fixed set of numbers within a certain range!

The important thing is: this function returns a unique List for each and any input; and also important: given a certain sequence, you can immediately determine the number that was used to create that sequence.

So instead of generating a huge set of data containing unique sequences, you simply define an algorithm that knows how to create unique sequences in a deterministic way. If that would work for you, it completely frees you from storing your sequences at all!

Edit: just remembered that I was looking into something kinda "close" to this question/answer ... see here!

Community
  • 1
  • 1
GhostCat
  • 137,827
  • 25
  • 176
  • 248
  • In my use case i need to generate a combination of numbers within a range... think lottery numbers.So if the user wants to generate 100s of combination , there might be duplicate – user2650277 Dec 14 '16 at 11:06
  • Well, it all depends on the amount of "math" you intend to spend. I am sure that one can define some f(n) that creates a random list of values based on n. In the end you have a **finite** set of permutations. You can **construct** all those permutations! – GhostCat Dec 14 '16 at 11:19
  • I also updated my answer with a link to an older question of mine ... exactly around that topic. There was also a guy who thought he would have to save GBs of data ... until told that he can use some smart math and avoid all of that. – GhostCat Dec 14 '16 at 11:21
  • @user2650277 And, if i may ask: if you find my input slightly helpful, please consider upvoting my linked question about that topic. It is such a pain to see that -1 on that ;-) – GhostCat Dec 14 '16 at 11:26
0

Sending data:

Use UNIQUE index. On Unique Violation Exception just rerandomize and send next record.

Retrive:

The simplest possible to implement is: Use HashSet, feed it whith random numbers (max is number of combinations in your database -1) while hashSet.size() is less then desired number of records to retrive. Use numbers from hashSet as ID's or rownums of records (combinations) to select the data: WHERE ID in (ids here).