Dataframe:

MovieID  movieCater
1        Action, Comedy, Adventure
2        Action, Crime
3        Crime

What I want:

MovieID  movieCater                 Action  Comedy  Adventure  Crime
1        Action, Comedy, Adventure  1       1       1          0
2        Action, Crime              1       0       0          1
3        Crime                      0       0       0          1

My DataFrame does not yet have the Action, Comedy, and other genre columns. Is there a method that creates them? For example, the first row of movieCater contains Action, Comedy, and Adventure, so each of those columns should be set to 1 for that row and every other genre column to 0.

Leo

1 Answer

Try this:

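# Keep an untouched copy so the original genre strings survive in the result
df_original = df.copy()
# Split each genre string into a list, then give every genre its own row
df['movieCater'] = df['movieCater'].str.split(', ')
df = df.explode('movieCater')
# Marker column that becomes the 1s after pivoting
df['value'] = 1
# Pivot genres into columns, fill the gaps with 0, and join back on the index
df_original.join(df.pivot(columns=['movieCater'], values=['value']).fillna(0).droplevel(0,axis=1))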

#   MovieID                 movieCater  Action  Adventure  Comedy  Crime
# 0        1  Action, Comedy, Adventure     1.0        1.0     1.0    0.0
# 1        2              Action, Crime     1.0        0.0     0.0    1.0
# 2        3                      Crime     0.0        0.0     0.0    1.0
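
As a possible shorter alternative (not part of the original answer), pandas also provides `Series.str.get_dummies`, which splits on a separator and builds the indicator columns in one step. The sketch below rebuilds the sample frame from the question itself and assumes every genre is separated by exactly a comma followed by one space; the variable names `df` and `result` are only illustrative.

```python
import pandas as pd

# Sample frame from the question (rebuilt here so the snippet runs on its own)
df = pd.DataFrame({
    "MovieID": [1, 2, 3],
    "movieCater": ["Action, Comedy, Adventure", "Action, Crime", "Crime"],
})

# One indicator column per distinct genre; assumes ", " is the only separator
dummies = df["movieCater"].str.get_dummies(sep=", ")

# Attach the indicator columns to the original frame via the shared index
result = df.join(dummies)
print(result)
```

The indicator columns should come out as integers rather than floats, and in alphabetical order (Action, Adventure, Comedy, Crime) rather than in order of first appearance, so reorder them afterwards if the exact layout from the question matters.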
Andreas
  • Hi, I updated the question. Would you mind helping me work it out again? Sorry, I am not very good at Python. – Leo May 21 '21 at 18:10
  • Hi @Leo, sure thing, but please keep the question as it was: the answer addresses the original question, and future readers with the same "original" problem may find it helpful. Please undo the edit, create a new question, and post the link here, and I will have a look. – Andreas May 21 '21 at 19:09
  • https://stackoverflow.com/questions/67645905/how-i-match-the-value-and-assign-to-them-a-new-column-based-on-other-column-stri – Leo May 22 '21 at 03:42