
How do I convert a pandas DataFrame from a 'frequency table' format to a flat format (one row per observation) and back again using idiomatic Python?

From:

        H     E     K
0       B     B    12
1       B     G     3
2       G     B    17
3       G     G    68

to:

        H     E
0       B     B
1       B     B
2       B     B
3       B     B
4       B     B
5       B     B
6       B     B
7       B     B
8       B     B
9       B     B
10      B     B
11      B     B
12      B     G
13      B     G
14      B     G
...

and back again!

        H     E     K
0       B     B    12
1       B     G     3
2       G     B    17
3       G     G    68

Please advise?

matekus
  • Scale up `new_df = df.loc[df.index.repeat(df['K'])].reset_index(drop=True)` like [this answer](https://stackoverflow.com/a/26778637/15497888) – Henry Ecker Feb 25 '22 at 20:39
  • Scale back down `df = new_df.groupby(['H', 'E']).size().reset_index(name='K')` like [this answer](https://stackoverflow.com/a/32801170/15497888). – Henry Ecker Feb 25 '22 at 20:39
  • @henry-ecker, Thanks for the benefit of your expertise. Can I drop the 'K' column as part of the conversion? – matekus Feb 25 '22 at 20:46
  • Yeah. Just `drop` the column `new_df = df.loc[df.index.repeat(df['K'])].drop(columns='K').reset_index(drop=True)` – Henry Ecker Feb 25 '22 at 20:49
  • @henry-ecker, When I dump 'new_df' to a csv file, there are 11 'B-B' rows, 2 'B-G' rows and so on instead of 12, 3, 17, and 68 respectively? – matekus Feb 25 '22 at 20:59
  • @henry-ecker. I can confirm that the conversion to 'new_df' loses one row for each of the four combinations! In other words [12,3,17,68] becomes [11,2,16,67] on return to the original format? – matekus Feb 25 '22 at 21:15
  • I can't reproduce that behaviour. `repeat` repeats each row exactly the number of times specified in column K, no more, no less. I get B B at indexes 0-11 (which is 12 rows, including index 0). On return to the original form I get `[12, 3, 17, 68]`. – Henry Ecker Feb 25 '22 at 21:39
  • @henry-ecker. Sincere apologies - transcription error on my part. How do I give you credit for an excellent answer? – matekus Feb 26 '22 at 08:05
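
The round trip suggested in the comments above can be sketched end to end as follows. This is a minimal sketch, assuming the frequency table from the question with its counts in column `K`; the variable names `freq`, `flat` and `back` are illustrative and not from the original post.

```python
import pandas as pd

# Frequency table from the question: one row per (H, E) pair, count in K.
freq = pd.DataFrame({
    "H": ["B", "B", "G", "G"],
    "E": ["B", "G", "B", "G"],
    "K": [12, 3, 17, 68],
})

# Scale up: repeat each index label K times, select those rows,
# drop the now-redundant count column and renumber the rows.
flat = (
    freq.loc[freq.index.repeat(freq["K"])]
    .drop(columns="K")
    .reset_index(drop=True)
)

# Scale back down: count the rows in each (H, E) group
# and name the resulting count column K again.
back = flat.groupby(["H", "E"]).size().reset_index(name="K")

# Sanity check: the flat frame has one row per counted observation.
print(len(flat), freq["K"].sum())
print(back)
```

Two caveats for this approach: `groupby` sorts the group keys by default, so the rows come back in sorted order rather than necessarily the original order, and any combination with a count of 0 disappears in the flat format and cannot be recovered on the way back.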

0 Answers