I have a DataFrame like the one below (the real data may contain more rows, for other candidates):

Candidate Name    Subject    Keyword  Count
   User1         optional    devops      1
   User1         mandatory      aws      0
   User1         mandatory      ec2      1
   User1         optional    python      1
   User1         optional      java      1
   User1         mandatory   dotnet      0
   User2         optional    devops      1
   User2         mandatory      aws      1
   User2         mandatory      ec2      0
   User3         optional    devops      1
   User3         mandatory      ec2      1
   User3         mandatory      aws      0
   User3         optional      java      1

How can I convert the above data to the following layout?

                      mandatory            optional             
                    aws dotnet ec2     devops java python              
Candidate Name                                                        
User 1               1    1     1        1     1      1           
User 2               0    1     0        1     1      1           
...
User N               1    1     1        0     0      0    

I tried the following, but it's not working:

read_csv = pd.read_csv('Sample.csv')
df = pd.DataFrame(read_csv)
df = df.pivot(index="Candidate Name", columns=["Keyword","Subject"], values="Count")
df = pd.MultiIndex.from_arrays([df['Subject'],df['Keyword']])
df = df['Candidate Name'],df['Count']
data = []
df = pd.DataFrame(columns=df,
                  data=[applicant[1:] for applicant in df],
                  index=pd.Index([applicant[0] for applicant in df], name='Candidate Name'))
df.sort_index(axis='columns', inplace=True)
print(df)
Akash
    Does this answer your question? [How can I pivot a dataframe?](https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe) – Naveed Oct 18 '22 at 23:19

1 Answer

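# The column is named 'Candidate_Name' here; use 'Candidate Name' if your header contains a space.
# aggfunc='sum' (a string) behaves like the builtin sum and avoids a FutureWarning in recent pandas.
df2 = (df.pivot_table(index='Candidate_Name',
                      columns=['Subject', 'Keyword'],
                      values='Count', aggfunc='sum')
       .reset_index()
       .rename_axis(columns=[None, None])
      )
df2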

                mandatory               optional
                aws    dotnet   ec2     devops   java   python
Candidate_Name
User1           0.0    0.0      1.0     1.0      1.0    1.0
User2           1.0    NaN      0.0     1.0      NaN    NaN
User3           0.0    NaN      1.0     1.0      1.0    NaN
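
The NaN cells above are candidate/keyword pairs that never occur in the input. If those should count as 0 (an assumption; the question does not say), a minimal self-contained sketch could look like the following. The inline sample data, the `Candidate_Name` spelling, and the variable names `raw` and `wide` are illustrative choices, not part of the original post.

```python
import io

import pandas as pd

# Sample rows copied from the question, whitespace-separated for brevity.
# 'Candidate_Name' is written with an underscore so it parses as one column.
raw = """Candidate_Name Subject Keyword Count
User1 optional devops 1
User1 mandatory aws 0
User1 mandatory ec2 1
User1 optional python 1
User1 optional java 1
User1 mandatory dotnet 0
User2 optional devops 1
User2 mandatory aws 1
User2 mandatory ec2 0
User3 optional devops 1
User3 mandatory ec2 1
User3 mandatory aws 0
User3 optional java 1
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

wide = (
    df.pivot_table(
        index="Candidate_Name",          # one row per candidate
        columns=["Subject", "Keyword"],  # two-level column header
        values="Count",
        aggfunc="sum",
        fill_value=0,                    # assumption: missing pairs count as 0
    )
    .astype(int)                         # safe here because no NaN remains
    .sort_index(axis="columns")
    .rename_axis(columns=[None, None])   # drop the 'Subject'/'Keyword' level names
)
print(wide)
```

This keeps `Candidate_Name` as the row index, as in the layout the question asks for; append `.reset_index()` if a regular column is preferred.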
Naveed