I am returning the ratings of features for a business by demographic and ratings conditions.
How would I sum the int64 values of the "count" column for all variables while including the variable names in the output e.g. Design 8
,Food 1
?
This question mentions coverting to index then selecting by index.
This question looks like a similar problem with SQL.
Currently, I can query by assigning the melt code-block to a variable prod
then write something like this.
prods[prods.rating == 2]
.
Example Data:
Customer Type Age Satisfaction Design Food Wi-Fi Service Distance
Disloyal 28 Not Satisfied 0 1 2 2 13.5
Loyal 30 Satisfied 5 3 5 4 34.2
Disloyal 36 Not Satisfied 2 0 2 4 55.8
# Cols I want to see the ratings for
ranked_cols = [
"Design",
"Food",
"Wi-Fi",
"Service",
]
# Select the relevant customers
sub = df[
(df["Customer Type"] == "Disloyal")
& (df["Satisfaction"] == "Not Satisfied")
& df["Age"].between(30, 40)
]
(
sub.melt(value_vars=ranked_cols)
.groupby("variable")
.value_counts()
.to_frame()
.reset_index()
.rename(columns={"value": "rating", 0: "count"})
)
[Out]
variable rating count
0 Design 2 5
1 Food 0 1
2 Service 4 1
3 Wi-Fi 2 3
4 Design 1 3