In this dataframe, some values in the 'artists' column contain two artists, with a "," separating their names. I am trying to remove these rows and replace each of them with one new row per artist named in the comma-separated value.
Basically, I first find the rows that meet this criterion:
featured_artists_index = raw_data_df['artists'].str.contains(',').tolist()
then make a new row for each individual artist in the comma-separated value:
new_rows = []
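for idx, row in raw_data_df.loc[featured_artists_index].iterrows():
    for artist in row['artists'].split(','):
        # copy once per artist; reusing one copy would append the same object repeatedly
        new_row = row.copy()
        new_row['artists'] = artist.strip()  # drop any whitespace around the name
        new_rows.append(new_row)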
then remove the original rows and append the modified rows:
raw_data_df.drop(raw_data_df.index[featured_artists_index], inplace=True)
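# DataFrame.append returned a new frame (and was removed in pandas 2.0), so concatenate and reassign
raw_data_df = pd.concat([raw_data_df, pd.DataFrame(new_rows)])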
But this solution is pretty slow, and I am wondering whether there are pandas functions that would make it more efficient and fit this task better.
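For example, something along these lines, using `Series.str.split` and `DataFrame.explode` (available since pandas 0.25), looks like it might do the same job without the Python-level loop. This is only a minimal sketch: the sample frame and its `title` column are made up for illustration, and only the `artists` column comes from my real data.

```python
import pandas as pd

# Hypothetical sample data; only the 'artists' column name matches the real dataframe.
raw_data_df = pd.DataFrame({
    "title": ["Song A", "Song B", "Song C"],
    "artists": ["Alice", "Bob,Carol", "Dave, Erin"],
})

exploded_df = (
    raw_data_df
    # turn each 'artists' string into a list of names
    .assign(artists=raw_data_df["artists"].str.split(","))
    # emit one row per list element, repeating the other columns
    .explode("artists")
    .reset_index(drop=True)
)
# remove whitespace left over around names such as " Erin"
exploded_df["artists"] = exploded_df["artists"].str.strip()
```

Unlike the loop above, this keeps the split rows in their original position instead of moving them to the end of the dataframe.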
Thanks!