I am trying to figure out the best way to create tuples of the form (x:y) from two columns in a dataframe, and then use the key column of the dataframe as the dictionary key for those tuples.

   key    data_1  data_2
0  14303  24.75   25.03
1  12009  25.00   25.07
2  14303  24.99   25.15
3  12009  24.62   24.77

The resulting dictionary: {14303 24.38:24.61 24.99:25.15 12009 24.62:24.77 25.00:25.07 }

I have tried iterrows and enumerate, but I was wondering whether there is a more efficient way to achieve this.

E B

1 Answer

I think you want to append each (data_1, data_2) tuple to a list of values for the given key. This solution uses iterrows(), which I acknowledge you said you already use. If this is not what you are looking for, please post your code and the exact output you want. I don't know whether pandas has a native method for this.

# df is the dataframe
from collections import defaultdict

sample_dict = defaultdict(list)
for _, row in df.iterrows():  # iterrows() yields (index, row Series) pairs
    k = row["key"]  # key
    d_tuple = (row["data_1"], row["data_2"])  # (data_1, data_2)
    sample_dict[k].append(d_tuple)

sample_dict is therefore:

defaultdict(list,
        {12009.0: [(25.0, 25.07), (24.620000000000001, 24.77)],
         14303.0: [(24.75, 25.030000000000001),
          (24.989999999999998, 25.149999999999999)]})

sample_dict[12009] is therefore:

[(25.0, 25.07), (24.620000000000001, 24.77)]

Update: You might take a look at this thread too: https://stackoverflow.com/a/24368660/4938264
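
If the per-row overhead of iterrows() is the concern, one possible alternative is to zip the three columns together instead of building a Series for every row. The sketch below is only an illustration, not a benchmarked recommendation: pairs_by_key is a hypothetical helper name, and plain lists stand in for df["key"], df["data_1"] and df["data_2"] so that the snippet runs without pandas.

```python
from collections import defaultdict


def pairs_by_key(keys, xs, ys):
    """Group (x, y) pairs under their key, preserving row order.

    Hypothetical helper: accepts any three equally long iterables,
    for example three DataFrame columns.
    """
    grouped = defaultdict(list)
    for k, x, y in zip(keys, xs, ys):
        grouped[k].append((x, y))
    return dict(grouped)


# Plain lists standing in for df["key"], df["data_1"], df["data_2"]
keys = [14303, 12009, 14303, 12009]
data_1 = [24.75, 25.00, 24.99, 24.62]
data_2 = [25.03, 25.07, 25.15, 24.77]

result = pairs_by_key(keys, data_1, data_2)
```

With a real dataframe the call would presumably be pairs_by_key(df["key"], df["data_1"], df["data_2"]). Because each column is iterated separately, the keys are not upcast to floats the way they are when iterrows() packs a mixed int/float row into a single Series.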

Mark