0

I have a Pandas dataframe that looks like the following:

     fdc_id                                        ingredients  nutrient_id  amount
0    356425  ICE CREAM INGREDIENTS: MILK, CREAM, SUGAR, STR...         1079    0.00
5    356425  ICE CREAM INGREDIENTS: MILK, CREAM, SUGAR, STR...         1003    2.50
7    356425  ICE CREAM INGREDIENTS: MILK, CREAM, SUGAR, STR...         1005   35.00
9    356425  ICE CREAM INGREDIENTS: MILK, CREAM, SUGAR, STR...         2000   30.00
11   356425  ICE CREAM INGREDIENTS: MILK, CREAM, SUGAR, STR...         1253   25.00
15   356426  WATER, SUGAR, TOMATO PASTE, MOLASSES, DISTILLE...         1079    0.00
20   356426  WATER, SUGAR, TOMATO PASTE, MOLASSES, DISTILLE...         1253    0.00
22   356426  WATER, SUGAR, TOMATO PASTE, MOLASSES, DISTILLE...         1003    0.00
23   356426  WATER, SUGAR, TOMATO PASTE, MOLASSES, DISTILLE...         1005   43.24
25   356426  WATER, SUGAR, TOMATO PASTE, MOLASSES, DISTILLE...         2000   37.84
...

Using the above, I want to create a dataframe that looks like the following:

    fdc_id                                        ingredients nutrient_1079 nutrient_1003 nutrient_1005 ... ...
0   356425  ICE CREAM INGREDIENTS: MILK, CREAM, SUGAR, STR...          0.00          2.50         35.00 ... ...
1   356426  WATER, SUGAR, TOMATO PASTE, MOLASSES, DISTILLE...          0.00          0.00         43.24 ... ...

Basically, each nutrient_id should have its own column and the column values are equal to the amount for that particular fdc_id item. Also, some food items (fdc_id) don't have values (amount) for particular nutrients (nutrient_id) and the values should be 0.00 in such cases. How do I go about this?

Nikhil Hegde
  • 341
  • 1
  • 3
  • 15
  • 1
    `df.pivot_table(index = ['fcd_id', 'ingredients'], columns = 'nutrient_id', values = 'amount', fill_value = 0).add_prefix('nutrient').reset_index().rename_axis(columns=None)` – ansev Mar 31 '20 at 20:23
  • Thanks @ansev, that solved it. – Nikhil Hegde Mar 31 '20 at 20:42

0 Answers0