I have a pandas dataframe, which looks like this:
   Country  City       POI        Type
0  NL       Amsterdam  KFC        restaurant
1  NL       Amsterdam  KFC        cafe
2  NL       Arnhem     McDonalds  fast food
3  NL       Arnhem     McDonalds  ice cream
I need to group the rows so that Country, City and POI contain no duplicates and the Type values for each group are combined into one string. In other words, I need output like this:
   Country  City       POI        Type
0  NL       Amsterdam  KFC        restaurant, cafe
1  NL       Arnhem     McDonalds  fast food, ice cream
I tried using groupby, but all the column names disappear and shape reports 0 columns. Is there a better way to group these values?
Here is some sample code:
import pandas as pd
import numpy as np

# Sample data: the first row holds the column names, the first column holds the index
data = np.array([['', 'Country', 'City', 'POI', 'Type'],
                 [0, 'NL', 'Amsterdam', 'KFC', 'cafe'],
                 [1, 'NL', 'Amsterdam', 'KFC', 'restaurant'],
                 [2, 'NL', 'Arnhem', 'McDonalds', 'fast-food'],
                 [3, 'NL', 'Arnhem', 'McDonalds', 'ice cream']])

initial_df = pd.DataFrame(data=data[1:, 1:],
                          index=data[1:, 0],
                          columns=data[0, 1:])

# Attempted grouping: the result has no columns left
final_df = initial_df.groupby(["Country", "City", "POI", "Type"]).count()
print(list(final_df.columns.values))
print(final_df.shape)
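
For reference, here is a minimal sketch of the result I am after. It assumes the likely cause of the empty result is that grouping by all four columns moves every one of them into the index, so count() has nothing left to count. The sketch groups by the three key columns only and joins the Type strings per group. The names df and grouped are just illustrative, and the data is retyped from the table at the top rather than taken from the NumPy array above.

```python
import pandas as pd

# Same rows as the example table at the top of the question
df = pd.DataFrame({
    "Country": ["NL", "NL", "NL", "NL"],
    "City": ["Amsterdam", "Amsterdam", "Arnhem", "Arnhem"],
    "POI": ["KFC", "KFC", "McDonalds", "McDonalds"],
    "Type": ["restaurant", "cafe", "fast food", "ice cream"],
})

# Group by the key columns only, so "Type" is left over to aggregate.
# sort=False keeps the groups in order of first appearance.
grouped = (
    df.groupby(["Country", "City", "POI"], sort=False)["Type"]
      .agg(", ".join)   # concatenate the Type strings within each group
      .reset_index()    # turn the group keys back into ordinary columns
)

print(grouped)
print(grouped.shape)
```

With this approach the key columns come back as regular columns after reset_index(), so the column names are preserved.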