
I have a pandas DataFrame:

import pandas as pd

df = pd.DataFrame([
    {'a': 'Male', 'c1': 3, 'c2': 10},
    {'a': 'Male', 'c1': 3, 'c2': 30},
    {'a': 'Male', 'c1': 1, 'c2': 20},
    {'a': 'Female', 'c1': 2, 'c2': 15},
    {'a': 'Female', 'c1': 2, 'c2': 100},
])

I want to print the following:

   a      c1   c2
0  Male   3    10
1              30
2  Male   1    20
3  Female 2    15
4             100

Would you please help me?

Sayse
Ranajit
  • 2
    Please add a description of your desired result; I'm guessing you want to null the duplicated values – EdChum Sep 29 '15 at 09:21
  • 1
    Please provide code snippet of your current progress as a place to start. – konqi Sep 29 '15 at 09:22
  • 1
    Please don't update your question with changing requirements. You should state upfront either your real data or representative data relevant to your question; doing otherwise is very poor form on SO – EdChum Sep 29 '15 at 09:44

1 Answer


I don't know whether you literally want a blank string or NaN, so I'm using NaN here. You can test whether a column has duplicated values using duplicated and set those entries to your desired result. By the way, you need to add an explanation of what your desired result means rather than have us guess:

In [128]:
import numpy as np
df.loc[df['c1'].duplicated(), 'c1'] = np.nan
df

Out[128]:
   c1   c2
0   3   10
1 NaN   30
2   1   20
3   2   15
4 NaN  100
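
As a hedged aside, the same NaN result can be sketched without modifying df in place by using Series.mask, which replaces values wherever a condition is True. The name c1_masked is introduced here purely for illustration, and the data is the sample frame from the question.

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 'Male', 'c1': 3, 'c2': 10},
    {'a': 'Male', 'c1': 3, 'c2': 30},
    {'a': 'Male', 'c1': 1, 'c2': 20},
    {'a': 'Female', 'c1': 2, 'c2': 15},
    {'a': 'Female', 'c1': 2, 'c2': 100},
])

# Replace repeated c1 values with NaN; the original df is left untouched.
# c1_masked is an illustrative name, not part of the original answer.
c1_masked = df['c1'].mask(df['c1'].duplicated())
print(c1_masked)
```

Because NaN is a float, the masked column is upcast from integer to float.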

The blank string version:

In [131]:
df.loc[df['c1'].duplicated(), 'c1'] = ''
df

Out[131]:
  c1   c2
0  3   10
1      30
2  1   20
3  2   15
4     100

EDIT

You updated your question so I've updated my answer:

In [143]:
df.loc[(df['a'].duplicated() & df['c1'].duplicated()), ['a','c1']] = ''
df

Out[143]:
        a c1   c2
0    Male  3   10
1              30
2    Male  1   20
3  Female  2   15
4             100
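
One caveat, offered as a sketch rather than a correction of the answer: combining the two per-column checks with & flags a row whenever each value has appeared somewhere earlier, even if the two values never appeared together in the same row. If the intent is to blank only rows whose (a, c1) pair repeats, DataFrame.duplicated with the subset argument may be closer to that. The name out is hypothetical, and the cast of c1 to object is an assumption made so that the column can hold blank strings next to integers on recent pandas versions.

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 'Male', 'c1': 3, 'c2': 10},
    {'a': 'Male', 'c1': 3, 'c2': 30},
    {'a': 'Male', 'c1': 1, 'c2': 20},
    {'a': 'Female', 'c1': 2, 'c2': 15},
    {'a': 'Female', 'c1': 2, 'c2': 100},
])

# Assumption: cast c1 to object so it can store both ints and '' values.
# astype returns a new frame, so df itself is not modified.
out = df.astype({'c1': object})

# True for rows whose (a, c1) pair already appeared in an earlier row.
mask = out.duplicated(subset=['a', 'c1'])

# Blank out the repeated pair, keeping c2 as it is.
out.loc[mask, ['a', 'c1']] = ''
print(out)
```

On the sample data this selects the same rows as the expression above; the two approaches only diverge when a value of a and a value of c1 have each been seen before but never together.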
EdChum