SETUP
I have a dataframe:
import pandas as pd
df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3'],
'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']},
index=[0, 1, 2, 3])
i.e.:
+----+-----+-----+-----+-----+
| | A | B | C | D |
|----+-----+-----+-----+-----|
| 0 | A0 | B0 | C0 | D0 |
| 1 | A1 | B1 | C1 | D1 |
| 2 | A2 | B2 | C2 | D2 |
| 3 | A3 | B3 | C3 | D3 |
+----+-----+-----+-----+-----+
(printed using print(tabulate(df, headers='keys', tablefmt='psql')), see related Q)
PROBLEM
I would like to convert the above dataframe into this dict:
{'A0': ['A0', 'B0', 'C0', 'D0'],
'A1': ['A1', 'B1', 'C1', 'D1'],
'A2': ['A2', 'B2', 'C2', 'D2'],
'A3': ['A3', 'B3', 'C3', 'D3']}
The first element of each row is the key, and the full row of the dataframe, as a list, is the corresponding value in the dict.
SOLUTIONS
A
Using .iterrows(), which seems to be bad practice:
`{row.iloc[0]: list(row) for _, row in df.iterrows()}`
B
Using .groupby() (and this):
gbdict = df.groupby('A').apply(lambda group: group.to_dict(orient='records')).to_dict()
{key: list(gbdict[key][0].values()) for key in gbdict}
They both produce the required output.
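A quick way to confirm that a candidate produces the required output is to compare it with the literal dict from the PROBLEM section. The sketch below does that for a variant of solution A that uses positional `.iloc` access; the names `expected`, `matches_target` and `candidate_a` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                   'B': ['B0', 'B1', 'B2', 'B3'],
                   'C': ['C0', 'C1', 'C2', 'C3'],
                   'D': ['D0', 'D1', 'D2', 'D3']},
                  index=[0, 1, 2, 3])

# Target dict, copied from the PROBLEM section.
expected = {'A0': ['A0', 'B0', 'C0', 'D0'],
            'A1': ['A1', 'B1', 'C1', 'D1'],
            'A2': ['A2', 'B2', 'C2', 'D2'],
            'A3': ['A3', 'B3', 'C3', 'D3']}


def matches_target(candidate):
    """Hypothetical helper: True if the candidate dict equals the target."""
    return candidate == expected


# Variant of solution A: key is the first element of the row,
# value is the whole row as a list.
candidate_a = {row.iloc[0]: list(row) for _, row in df.iterrows()}

print(matches_target(candidate_a))
```

The same helper can be applied to any other candidate, which makes it easier to compare alternatives on equal terms.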
QUESTION
Is there a more efficient way to achieve the above goal?
If there is a way to do it without the for loop, i.e. without the dict comprehension, that would be great.
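To make the goal concrete, here are two possible directions that avoid an explicit comprehension. This is only a sketch: it assumes the values in column A are unique, it has not been benchmarked, and the names `by_zip` and `by_transpose` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                   'B': ['B0', 'B1', 'B2', 'B3'],
                   'C': ['C0', 'C1', 'C2', 'C3'],
                   'D': ['D0', 'D1', 'D2', 'D3']},
                  index=[0, 1, 2, 3])

# Candidate 1: pair each value of column A with its row converted to a list.
by_zip = dict(zip(df['A'], df.to_numpy().tolist()))

# Candidate 2: index by column A (keeping the column itself), transpose so
# that every original row becomes a column, then export each column as a list.
by_transpose = df.set_index('A', drop=False).T.to_dict(orient='list')

print(by_zip)
print(by_transpose)
```

Neither line contains a Python-level loop in the source, although `zip` and `to_dict` still iterate internally, so whether either one is actually faster would need to be measured on realistic data.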