Duplicate rows in pandas on condition

Question

I have this pandas dataframe:

    Column1   Column2    Column3
1     A                     C
2     A         D   
3     B

If there is a "D" in Column2, I want to duplicate that row with its values and reset the index, like this:

    Column1  Column2    Column3
1     A                    C
2     A         D   
3     A         D   
4     B

How do I do this in pandas?

jezrael · Accepted Answer · 2023-04-05T10:40:32.783

First check for duplicated column names and deduplicate them if necessary:

print (df.columns[df.columns.duplicated(keep=False)])
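A minimal sketch of that check, assuming a hypothetical frame in which the label Column2 appears twice (the asker's real columns are not shown). When a label is repeated, `df['Column2']` returns a DataFrame rather than a Series, so the row filter in the next step no longer behaves as a simple boolean mask. Keeping the first occurrence of each label is only one possible way to deduplicate:

```python
import pandas as pd

# Hypothetical data: "Column2" is deliberately repeated to mimic the problem.
df = pd.DataFrame(
    [["A", None, "C", "x"], ["A", "D", None, "y"]],
    columns=["Column1", "Column2", "Column3", "Column2"],
)

# List every label that occurs more than once.
dupes = df.columns[df.columns.duplicated(keep=False)]
print(dupes.tolist())  # ['Column2', 'Column2']

# With a repeated label, selecting it gives a DataFrame, not a Series.
print(type(df["Column2"]).__name__)  # DataFrame

# One option (an assumption, not the only fix): keep the first occurrence.
deduped = df.loc[:, ~df.columns.duplicated()]
print(deduped.columns.tolist())  # ['Column1', 'Column2', 'Column3']
```

Renaming the repeated columns instead of dropping them is the safer choice when both hold data that is needed.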

Then reset to a default index, append the filtered rows with concat, and sort by index so each copy lands directly after its original:

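# Start from a default 0..n-1 index so every row has a unique label
df = df.reset_index(drop=True)

# Append the matching rows, stable-sort by index, then renumber from 1
df = (pd.concat([df, df[df['Column2'].eq('D')]])
        .sort_index(kind='stable', ignore_index=True)
        .rename(lambda x: x+1))
print (df)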
  Column1 Column2 Column3
1       A     NaN       C
2       A       D     NaN
3       A       D     NaN
4       B     NaN     NaN
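For reference, here is a self-contained sketch that rebuilds the frame from the question and applies the same steps. The helper name `duplicate_rows` and its parameters are made up for illustration, and the empty cells in the question are assumed to be NaN:

```python
import numpy as np
import pandas as pd

# Frame from the question; blank cells are assumed to be missing values.
df = pd.DataFrame(
    {
        "Column1": ["A", "A", "B"],
        "Column2": [np.nan, "D", np.nan],
        "Column3": ["C", np.nan, np.nan],
    },
    index=[1, 2, 3],
)


def duplicate_rows(frame, column, value):
    """Return a copy of frame with rows where frame[column] == value repeated once."""
    # A default index guarantees unique labels before concatenating.
    frame = frame.reset_index(drop=True)
    # The copies keep their original labels, so a stable sort places each
    # copy right after its source row.
    out = pd.concat([frame, frame[frame[column].eq(value)]])
    out = out.sort_index(kind="stable", ignore_index=True)
    # Shift the fresh 0-based index so it starts at 1, as in the question.
    return out.rename(lambda x: x + 1)


result = duplicate_rows(df, "Column2", "D")
print(result)
```

The stable sort matters here: it keeps the original row ahead of its copy when both share the same index label.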

edited Apr 05 '23 at 10:40

answered Apr 05 '23 at 10:19


Thanks for your answer. I get "ValueError: cannot reindex on an axis with duplicate labels". How can I avoid it? – Simon GIS Apr 05 '23 at 10:22
@SimonGIS - Is it possible to create a default index first with `df = df.reset_index(drop=True)`? – jezrael Apr 05 '23 at 10:26
I still get "ValueError: cannot reindex on an axis with duplicate labels." I have been stuck on this for hours. Is there an alternative solution? My indexes are not duplicated, so I don't quite get it. – Simon GIS Apr 05 '23 at 10:29
@SimonGIS - Could there be duplicated column names? Try `print (df.columns[df.columns.duplicated(keep=False)])`. – jezrael Apr 05 '23 at 10:31
