1

I have a dataframe having data of cities having different product types, such as :

city product_type
A B
A B
A D
A E
X B
X C
X C
X C

I want to know what the most common product type is, for each city. For the above df, it would be product B for city A and product C for city X.

I am trying to solve this by first grouping then iterating over the groups and trying to find the product type with max occurrence but it doesn't seem to work:

d = df.groupby('city')['product_type']
prods=[]

for name,group in d:
    
    l = [group]
    
    prod = max(l, key=l.count)
    
    prods.append(prod)

print(prods)# this is list of products with highest occurrence in each city

This piece of code seems to give me ALL the product types, not just the most frequent ones.

Equinox
  • 6,483
  • 3
  • 23
  • 32

1 Answers1

1

You can try something like this:

data = pd.DataFrame({
    'city': ['A', 'A', 'A', 'A', 'X', 'X', 'X', 'X'],
    'product_type': ['B', 'B', 'D', 'E', 'B', 'C', 'C', 'C']
})

result_dict = {city: city_data.product_type.value_counts().index[0] 
    for city, city_data in data.groupby('city')}
print(result_dict)

This will result in dictionary: {'A': 'B', 'X': 'C'}. Note that if more than one product has the same number of occurrences this code will only return one of them.

Kira
  • 26
  • 1