
My dataset is in this form:

import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Type': ['A', 'B', 'B', 'B'],
                   'Value': [100, 200, 201, 120]})

I want to update the dataframe in the following way:

df = pd.DataFrame({'ID': [1,2,3,4],
                   'Type': ['A', 'B1', 'B2', 'B3'],
                   'Value': [100, 200, 201, 120]}) 

The code I was trying was:

df[df['Type'] == 'B', df['Value'] == 200] = 'B1'

But I'm getting this error:

ValueError: Cannot reindex from a duplicate axis

Can someone please help me solve the problem?

Thanks!

The Singularity
Beta

3 Answers


Try this instead:

df.loc[df['Type'].eq('B') & df['Value'].eq(200), 'Type'] = 'B1'
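
For reference, here is a minimal runnable sketch of that line against the sample data from the question. The variable name `mask` is only for illustration. The two conditions are combined into one boolean Series with `&`, and `.loc` takes that Series as the row selector and `'Type'` as the column selector. Note that this changes only the row where `Value` is 200, so the other B rows would each need their own assignment.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Type': ['A', 'B', 'B', 'B'],
                   'Value': [100, 200, 201, 120]})

# Combine both conditions into a single boolean mask with &.
mask = df['Type'].eq('B') & df['Value'].eq(200)

# Rows are selected by the mask, the column by its label.
df.loc[mask, 'Type'] = 'B1'

print(df['Type'].tolist())  # ['A', 'B1', 'B', 'B']
```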
U13-Forward

You can use:

df['Type'] = df['Type'].mask(df['Type'].eq('B'),
                             df['Type'] + df.groupby('Type').cumcount().add(1).astype(str)
                            )
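
To make it easier to see what each piece does, here is the same idea split into steps on the sample data from the question. The intermediate names `counter` and `numbered` are only illustrative and are not part of the one-liner above.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Type': ['A', 'B', 'B', 'B'],
                   'Value': [100, 200, 201, 120]})

# Position of each row within its 'Type' group, shifted to start at 1.
counter = df.groupby('Type').cumcount().add(1).astype(str)

# Candidate labels for every row: 'A1', 'B1', 'B2', 'B3'.
numbered = df['Type'] + counter

# mask() swaps in the numbered label only where the condition is True,
# so the 'A' row keeps its original value.
df['Type'] = df['Type'].mask(df['Type'].eq('B'), numbered)

print(df['Type'].tolist())  # ['A', 'B1', 'B2', 'B3']
```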
mozway
  • @mozway: Thanks a lot for your answer. It worked perfectly. But U12-Forward's answer is also correct and was shared first, so I will accept that answer. – Beta Sep 07 '21 at 11:10
  • @Beta sure, the most important thing is that you got your answer ;) – mozway Sep 07 '21 at 11:12

If you need to append a counter starting at 1 to every B value, build the counter with np.arange, using the number of True values in the mask (obtained with sum), and concatenate it with +:

import numpy as np

m = df['Type'] == 'B'  # boolean mask of the rows to rename
df.loc[m, 'Type'] += np.arange(1, m.sum() + 1).astype(str)
print(df)

   ID Type  Value
0   1    A    100
1   2   B1    200
2   3   B2    201
3   4   B3    120
jezrael