0

I have a dataset with multiple columns but there is one column named 'City' and inside 'City' we have multiple (city names) and another column named as 'Complaint type' and having multiple types of complaints inside this, and i have to convert the all unique cities into columns and all unique complaint types as rows.

and there is a 'Unique Key' variable which is assigned to each complaint.

Please help me with the proper codes.

df_new=df.pivot_table(index='Complaint Type',columns='City',values='Unique Key')
df_new

I did this and worked but is there any other way to do it as it is not clear to me

toyota Supra
  • 3,181
  • 4
  • 15
  • 19
MEGHA
  • 11
  • 1
  • Please post sample data so that we can provide possible alternatives. If pivot works, then I wouldn't worry about using another method unless it's purely academic. – Stu Sztukowski Apr 12 '23 at 14:09
  • so this should be ok to go right and yes it is giving me the result so ok i got it – MEGHA Apr 12 '23 at 17:44
  • If it's performant and gives you the right results? Yep, you solved the problem and you don't need to do anything further with it. Time to move on to the next piece of code. But if you don't understand why it worked, I encourage you to study it and figure out what it's doing. – Stu Sztukowski Apr 12 '23 at 17:47

1 Answers1

0

I think Unique Key column is not relevant here. Use pd.crosstab:

df_new = pd.crosstab(df['Complaint Type'], df['City'])
print(df_new)

# Output
City            V  W  X  Y  Z
Complaint Type               
A               0  1  0  0  4
B               5  2  2  3  1
C               0  3  1  0  1
D               2  2  1  2  3
E               0  1  1  1  4
F               2  1  1  0  2
G               0  1  0  0  3

If you prefer to use pivot_table, you have to use an aggregation function since you have no numeric column:

df_new = df.pivot_table(index='Complaint Type', columns='City', aggfunc='size')
print(df_new)

# Output
City            V  W  X  Y  Z
Complaint Type               
A               0  1  0  0  4
B               5  2  2  3  1
C               0  3  1  0  1
D               2  2  1  2  3
E               0  1  1  1  4
F               2  1  1  0  2
G               0  1  0  0  3

You should also read How can I pivot a dataframe?

Minimal Reproducible Example:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2023)

df = pd.DataFrame({'Complaint Type': rng.choice(list('ABCDEFG'), 50),
                   'City': rng.choice(list('VWXYZ'), 50)})
Corralien
  • 109,409
  • 8
  • 28
  • 52