From my analysis I have discovered that Disloyal customers aged 30-40 are Not Satisfied with Company X. "Not Satisfied" means they have rated services and products 0-2 out of a possible 5. I want to know which inputs were ranked <= 2.
I stored the rating columns in a list so that I could loop over them and index each column's values, which are rankings from 0 to 5.
What is the syntax for using the column variable in the boolean expression?
Example Data:
Customer Type  Age  Satisfaction   Design  Food  Wi-Fi  Service  Distance
Disloyal       28   Not Satisfied  0       1     2      2        13.5
Loyal          30   Satisfied      5       3     5      4        34.2
Disloyal       36   Not Satisfied  2       0     2      4        55.8
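In case it helps to reproduce this, the three sample rows can be built as a DataFrame roughly like this. It is only a sketch: the values are copied from the table above, and the name `df` is the one used in my code.

```python
import pandas as pd

# The three sample rows from the table, one list per column.
df = pd.DataFrame({
    "Customer Type": ["Disloyal", "Loyal", "Disloyal"],
    "Age": [28, 30, 36],
    "Satisfaction": ["Not Satisfied", "Satisfied", "Not Satisfied"],
    "Design": [0, 5, 2],
    "Food": [1, 3, 0],
    "Wi-Fi": [2, 5, 2],
    "Service": [2, 4, 4],
    "Distance": [13.5, 34.2, 55.8],
})
```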
Code
ranked_cols = ['Design', 'Food', 'Wi-Fi', 'Service', 'Distance']

# Keep only disloyal, not-satisfied customers aged 30 to 40 (inclusive).
sub = df[
    (df["Customer Type"] == "Disloyal")
    & (df["Satisfaction"] == "Not Satisfied")
    & df["Age"].between(30, 40)
]

# The loop variable is used directly as the column key in the boolean expression.
for column in ranked_cols:
    low = sub[sub[column] <= 2]
    print(column, low.shape[0])

# Long-format count of each value per ranked column.
counts = (
    sub.melt(value_vars=ranked_cols)
    .groupby("variable")["value"]
    .value_counts()
    .rename("count")
    .reset_index()
)
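What I am after is roughly the following, sketched on the three sample rows. This is a minimal example under assumptions, not a settled solution: `rating_cols` and `low_counts` are names I made up for illustration, and I have assumed that Distance is a measurement rather than a 0-5 rating, so it is left out of the comparison.

```python
import pandas as pd

# Sample rows from the example data.
df = pd.DataFrame({
    "Customer Type": ["Disloyal", "Loyal", "Disloyal"],
    "Age": [28, 30, 36],
    "Satisfaction": ["Not Satisfied", "Satisfied", "Not Satisfied"],
    "Design": [0, 5, 2],
    "Food": [1, 3, 0],
    "Wi-Fi": [2, 5, 2],
    "Service": [2, 4, 4],
    "Distance": [13.5, 34.2, 55.8],
})

# Assumption: only these columns hold 0-5 ratings.
rating_cols = ["Design", "Food", "Wi-Fi", "Service"]

# Disloyal, not-satisfied customers aged 30 to 40 (inclusive).
sub = df[
    (df["Customer Type"] == "Disloyal")
    & (df["Satisfaction"] == "Not Satisfied")
    & df["Age"].between(30, 40)
]

# Comparing all rating columns at once gives a boolean frame;
# summing it counts the True cells in each column.
low_counts = (sub[rating_cols] <= 2).sum()

# The same counts one column at a time, with the loop variable as the key.
for column in rating_cols:
    n_low = (sub[column] <= 2).sum()
    print(f"{column}: {n_low}")
```

Only the 36-year-old row passes the filter here, because `between(30, 40)` excludes the 28-year-old.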