
I am trying to remove duplicate values in a pandas column and replace them with an empty value.

Originally, Header A (column A) contains the value A in every row. I would like to keep the first A and replace each repeated A with an empty string "".

Header A  Header B
A         B
A         C
A         D
A         E
A         F

To this:

Header A  Header B
A         B
          C
          D
          E
          F

How do I do this in pandas? The values come from a CSV file.

2 Answers


Use:

df.loc[df['Header A'].duplicated(), 'Header A'] = ''
print(df)
  Header A Header B
0        A        B
1                 C
2                 D
3                 E
4                 F
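
Since the question mentions that the values come from a CSV file, here is a minimal end-to-end sketch of the same approach. The CSV text and the use of io.StringIO are assumptions standing in for the real file, which is not shown in the question; with an actual file you would pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file from the question.
csv_text = "Header A,Header B\nA,B\nA,C\nA,D\nA,E\nA,F\n"
df = pd.read_csv(io.StringIO(csv_text))

# duplicated() is False for the first occurrence of each value
# and True for every later repeat (keep='first' is the default).
mask = df["Header A"].duplicated()

# Blank out only the repeated values; the first "A" stays.
df.loc[mask, "Header A"] = ""

result = df["Header A"].tolist()
print(df)
```

Note that duplicated() flags repeats anywhere in the column, not only consecutive ones, so a value that reappears further down is blanked as well.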
jezrael

Replace with NaN:

import numpy as np

df.loc[df['Header A'].duplicated(), 'Header A'] = np.nan

Replace with empty string:

df.loc[df['Header A'].duplicated(), 'Header A'] = "" 
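
The two variants behave differently in memory (NaN counts as missing for isna(), an empty string does not), but as far as I can tell they should look the same once written back to CSV with the default to_csv settings. A small sketch to check this, using a made-up three-row frame rather than the asker's data:

```python
import numpy as np
import pandas as pd

# Made-up sample frame; not the asker's actual data.
df = pd.DataFrame({"Header A": ["A", "A", "A"], "Header B": ["B", "C", "D"]})

with_nan = df.copy()
with_nan.loc[with_nan["Header A"].duplicated(), "Header A"] = np.nan

with_empty = df.copy()
with_empty.loc[with_empty["Header A"].duplicated(), "Header A"] = ""

# In memory: only the NaN variant reports missing values.
nan_missing = int(with_nan["Header A"].isna().sum())
empty_missing = int(with_empty["Header A"].isna().sum())

# On disk: to_csv writes NaN as an empty field by default (na_rep="").
csv_nan = with_nan.to_csv(index=False)
csv_empty = with_empty.to_csv(index=False)
print(csv_nan)
```

If the frame is only going to be written out again, either variant should do; if you keep working with it in pandas, NaN lets you use isna(), fillna() and similar methods on the blanked cells.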

If you want to blank both columns in rows where the value is a duplicate in Header A and also in Header B:

df.loc[(df['Header A'].duplicated() & df['Header B'].duplicated()), ['Header A','Header B']] = ''
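
Keep in mind that the combined mask above only touches rows where both columns hold a repeated value. If the goal is instead to blank duplicates in each column independently, a loop over the columns is one option. The sketch below contrasts the two on a made-up frame (the sample values are assumptions chosen to make the difference visible):

```python
import pandas as pd

# Made-up sample data chosen so the two approaches give different results.
df = pd.DataFrame({
    "Header A": ["A", "A", "B", "A"],
    "Header B": ["X", "Y", "X", "Y"],
})
cols = ["Header A", "Header B"]

# Approach 1: blank both columns only where BOTH values are repeats.
combined = df["Header A"].duplicated() & df["Header B"].duplicated()
both = df.copy()
both.loc[combined, cols] = ""

# Approach 2: blank repeats in each column on its own.
per_column = df.copy()
for col in cols:
    per_column.loc[df[col].duplicated(), col] = ""

print(both)
print(per_column)
```

Here only the last row is blanked by the combined mask, while the per-column loop blanks every repeat within its own column.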