0

I have a dataframe in a similar format to the below;

Sites            Points        Value
ABC,DEF          2             10
GHI, JKL,MNO     4             15

I want to create new rows for each of the points, which I have been able to do with the following code (disclaimer: I found this code but can't find to reference);

def duplicate_rows(df, countcol):
for _, row in df.iterrows():
    for i in range(int(row[countcol])-1):
        df = df.append(row)

df.sort_index(inplace=True)
return df


df_dup = duplicate_rows(base_df,'Points')

This gives an output of

Sites          Points     Values
ABC, DEF       2          10
ABC, DEF       2          10
GHI, JKL, MNO  4          15
GHI, JKL, MNO  4          15
GHI, JKL, MNO  4          15
GHI, JKL, MNO  4          15

What I am trying to achieve is;

Sites          Points     Values
ABC            2          10
DEF            2          10
GHI            4          15
JKL            4          15
MNO            4          15
MNO            4          15

Any ideas on how this could be achieved?

Sloth87
  • 435
  • 5
  • 8
  • Why do you have double `MNO`? – Quang Hoang Feb 24 '20 at 21:24
  • Does this answer your question? [Split (explode) pandas dataframe string entry to separate rows](https://stackoverflow.com/questions/12680754/split-explode-pandas-dataframe-string-entry-to-separate-rows) – Gary Feb 24 '20 at 21:37

2 Answers2

2

IIUC, and if you works on pandas 0.25+:

(df.assign(Sites=df.Sites.str.split(',\s*'))
   .explode('Sites')
)

Output:

  Sites  Points  Value
0   ABC       2     10
0   DEF       2     10
1   GHI       4     15
1   JKL       4     15
1   MNO       4     15
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
1

You can use DataFrame.explode,

df['Sites'] = df['Sites'].str.split(',')
df.explode('Sites').reset_index(drop = True)


    Sites   Points  Value
0   ABC     2       10
1   DEF     2       10
2   GHI     4       15
3   JKL     4       15
4   MNO     4       15
Vaishali
  • 37,545
  • 5
  • 58
  • 86