-1

I am building a data analysis tool and I need to store race odds in database. One of such odds type is called "trifecta" or "triplet" where you have to pick winning three participants in right order. As there can be up to 16 racers (but usually it is around 10) per race, this means that I have 16x15x14 different possible permutations. To store these, I made table with columns "race_id", "pos1", "pos2", "pos3" and "value" where I store only existing odds. Pos1 is the winner, pos2 is second place, pos3 is third place and value is the odds value that this particular permutation has.
This all works fairly well and I usually have about 500 permutations per race stored in db.

You still alive? Now comes the part that is causing me troubles. Occasionally I need to pull some data from db and it HAS TO BE in certain form (CSV). This is not up to me.
New format is "raceid", "perm_1_2_3", "perm_1_2_4", ... , "perm_1_2_16", "perm_1_3_2", "perm_1_3_4", ...
All columns have to exist in this new form even when I don't have them stored in db.

I am having troubles figuring out HOW should I do it. Any hints and tips are appreciated.

Here is a small sample of data in db:

raceid    pos1    pos2    pos3    value
201411    1       2       6       4643.6
201411    1       2       7       2321.8
201411    1       2       9       2321.8
201411    1       3       2       9287.2
201411    1       3       4       4643.6

Edit:
I need this as:

raceid    perm_1_2_3    perm_1_2_4    perm_1_2_5    perm_1_2_6    perm_1_2_7    perm_1_2_8    perm_1_2_9    perm_1_2_10    ...
201411    None          None          None          4643.6        2321.8        None          2321.8        None

This is a real problem and I don't expect anyone to do it for me. I am just hoping to get some hints or tips HOW should I solve this problem.

Zhack
  • 96
  • 7
  • If you have to pull some data from a CSV file, use the `csv` module. You'll need to pass the non-default `delimiter='\t'`, and you'll need to convert the values from strings to int/float/whatever explicitly, but other than that, it's all trivial. – abarnert Dec 02 '14 at 21:50
  • I have to join up data that is in database and output it as CSV. Writing CSV is not the complicated part here. I am having trouble formatting this data to "correct" form. – Zhack Dec 02 '14 at 21:53

2 Answers2

1

OK, so you have the input data in some form you haven't explained, and you want to generate columns perm_1_2_3 through perm_16_15_14.

One really easy way to do that is to use the permutations function to generate all the permutations; then it's just a matter of string formatting:

headers = ['raceid']
values = [raceid]
for perm in itertools.permutations(list(range(1, horses+1)), 3):
    headers.append('perm_{}_{}_{}'.format(*perm))
    values.append(odds.get(perm, None))

I'm assuming odds is a dictionary keyed on the tuple of horse numbers, maybe stored in a larger dict keyed by raceid, which you got something like:

for raceid, pos1, pos2, pos3, value in csv.reader(f):
    oddses[raceid][int(pos1), int(pos2), int(pos3)] = value
abarnert
  • 354,177
  • 51
  • 601
  • 671
  • Thanks you for this. Solved it in a matter of minutes and now it seems like a dumb question to ask. – Zhack Dec 02 '14 at 22:32
1

You can use something like these to generate those column names:

import itertools
for x in itertools.permutations([1,2,3]):
    print x

This work in python 2.6 and greater. For older versions you can look at this post: How to generate all permutations of a list in Python

Community
  • 1
  • 1
Turbo
  • 2,179
  • 18
  • 38