I originally asked a question in Data Science community:
I have a table formatted like the following table:
Feature amount ID Feat1 2 1 Feat2 0 1 Feat3 0 1 Feat4 1 1 Feat2 2 2 Feat4 0 2 Feat3 0 2 Feat6 1 2
Let's say I have 200 different IDs. I want to convert all different features into variables and amount into observations, so I combine rows with same ID into one row. For example,
Feat1 Feat2 Feat3 Feat4 Feat5 Feat6 ID 2 0 0 1 NA NA 1 NA 2 0 0 NA 1 2
Is there a good way to do it either in Python (pandas) or R?
And this is the answer I got:
newdata = pd.DataFrame(columns=['ID', 'Location', 'Feat1', 'Feat2', 'Feat3', 'Feat4', 'Feat5', 'Feat6']) grouped = data.groupby(['ID', 'Location']) for index, (group_name, d) in enumerate(grouped): newdata.loc[index, 'ID'] = group_name[0] newdata.loc[index, 'Location'] = group_name[1] for feature, amount in zip(d['Feature'], d['amount']): newdata.loc[index, feature] = amount
After more Google search, I found the answer of this question says:
So try to avoid the Python
loop for i, row in enumerate(...)
entirely
I am wondering, regarding to my original question, is there a more efficient way?