I'm looking for a way to reshape a DataFrame as shown below. Every unique id_ has the same number of rows, and the set of timestamp values is the same for each id_. Here is a portion of the original df:

    timestamp id_  value
0  2020-05-01   a      0
1  2020-05-02   a      1
2  2020-05-03   a      2
3  2020-05-04   a      3
4  2020-05-05   a      4
5  2020-05-06   a      5
6  2020-05-07   a      6
7  2020-05-08   a      7
8  2020-05-09   a      8
9  2020-05-10   a      9
10 2020-05-01   b     10
11 2020-05-02   b     11
12 2020-05-03   b     12
13 2020-05-04   b     13
14 2020-05-05   b     14
15 2020-05-06   b     15
16 2020-05-07   b     16
17 2020-05-08   b     17
18 2020-05-09   b     18
19 2020-05-10   b     19

to

            a   b
2020-05-01  0  10
2020-05-02  1  11
2020-05-03  2  12
2020-05-04  3  13
2020-05-05  4  14
2020-05-06  5  15
2020-05-07  6  16
2020-05-08  7  17
2020-05-09  8  18
2020-05-10  9  19

My current approach is:

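import pandas as pd

# Group rows by id_ and collect each group's values into a dict of columns.
df_gb = df.groupby('id_')
ids = df_gb.size().index.tolist()
tem_dict = dict()
# The timestamps are shared by every id_, so the unique dates serve as the index.
index_candidate = df.timestamp.dt.date.unique().tolist()
for ts_id in ids:
    tem_dict[ts_id] = df_gb.get_group(ts_id).value.tolist()

# Pass the dates via index=; set_index() would treat a plain list as column labels.
res = pd.DataFrame(tem_dict, index=index_candidate)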

I'd like to use built-in pandas functions instead, for performance reasons. Thanks for your help.
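For reference, the kind of built-in reshape I have in mind might look like the sketch below. It assumes each (timestamp, id_) pair appears exactly once, which holds for the sample above, and it rebuilds the sample frame itself so the snippet runs on its own. DataFrame.pivot is one candidate; I have not benchmarked it against the loop.

```python
import pandas as pd

# Rebuild the sample frame from the question: 10 days for each of two ids.
dates = pd.date_range("2020-05-01", periods=10, freq="D")
df = pd.DataFrame({
    "timestamp": dates.tolist() * 2,
    "id_": ["a"] * 10 + ["b"] * 10,
    "value": range(20),
})

# Long-to-wide reshape: one row per timestamp, one column per id_.
res = df.pivot(index="timestamp", columns="id_", values="value")

# Drop the axis names so the result matches the desired layout.
res.index.name = None
res.columns.name = None

print(res)
```

Note that pivot raises a ValueError if any (timestamp, id_) pair is duplicated; in that case pivot_table with an aggregation function would be needed. The index here is a DatetimeIndex rather than plain date objects.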

ComplicatedPhenomenon
