1

I'm new to Python and would appreciate if someone could please help on this:

I have a dataset that looks like this:

Celebrity1 Celebrity2 Celebrity3 Votes
Brad Pitt Angelina Jolie - 49
Will Smith Chris Rock Justin Bieber 55

I want to "flatten" this database so it looks like this:

Celebrity Votes
Brad Pitt 49
Angelina Jolie 49
Will Smith 55
Chris Rock 55
Justin Bieber 55

How can I do this in Python?

#here is the code for the initial dataframe

data = {'Celebrity1': ['Brad Pitt','Will Smith'], 'Celebrity2': ['Angelina Jolie', 'Chris Rock'],'Celebrity3': [None,'Justin Bieber'],'Votes': [49,55]}  
df = pd.DataFrame(data)
MB1001
  • 61
  • 5

1 Answers1

1

You could use wide_to_long:

out = (pd.wide_to_long(df.reset_index(), stubnames='Celebrity', i='index', j='j')
       .dropna().sort_values(by='Votes').reset_index(drop=True))

Another option is to use set_index + stack:

df = df.set_index('Votes').stack().droplevel(-1).reset_index(name='Celebrity')[['Celebrity','Votes']]

Output:

        Celebrity  Votes
0       Brad Pitt     49
1  Angelina Jolie     49
2      Will Smith     55
3      Chris Rock     55
4   Justin Bieber     55