I have this data frame:

import pandas as pd

columns = ['ID','Data']
data = [['26A20',123],
        ['12A20',123],
        ['23A20',123]]
df = pd.DataFrame.from_records(data=data, columns=columns)

>>> df
      ID  Data
0  26A20   123
1  12A20   123
2  23A20   123

The task is simple: remove the A from ID wherever ID starts with 26 or 23:

df.loc[df['ID'].str.startswith(('23','26'))]['ID'] = df['ID'].str.replace('A','')

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

And nothing changes:

>>> df
      ID  Data
0  26A20   123
1  12A20   123
2  23A20   123

I'm using loc, so what am I doing wrong?

BERA

2 Answers

Remove the double ][ to avoid chained assignment, and pass the row mask and the column name to a single loc call:

df.loc[df['ID'].str.startswith(('23','26')), 'ID'] = df['ID'].str.replace('A','')
print (df)
      ID  Data
0   2620   123
1  12A20   123
2   2320   123
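
As a rough illustration of why the chained form has no effect: df.loc[mask] produces a new object, so the following ['ID'] = ... writes to that temporary rather than to df. The sketch below assumes the sample frame from the question; the names subset, unchanged and changed are made up for the example, and the explicit .copy() only stands in for the temporary that chained indexing creates.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['26A20', '12A20', '23A20'], 'Data': [123, 123, 123]})
mask = df['ID'].str.startswith(('23', '26'))

# Stand-in for the temporary object that df.loc[mask] creates in the chained form.
subset = df.loc[mask].copy()
subset['ID'] = subset['ID'].str.replace('A', '', regex=False)
unchanged = df['ID'].tolist()  # df itself was never touched

# A single .loc call with both row mask and column label writes into df directly.
df.loc[mask, 'ID'] = df['ID'].str.replace('A', '', regex=False)
changed = df['ID'].tolist()
```

Here unchanged still holds the original IDs, while changed holds the IDs with the A removed on the matching rows only.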

It is also possible to filter on both sides, so that replace runs only on the matching rows:

mask = df['ID'].str.startswith(('23','26'))
df.loc[mask, 'ID'] = df.loc[mask, 'ID'].str.replace('A','')
print (df)
      ID  Data
0   2620   123
1  12A20   123
2   2320   123
jezrael

There is also the np.where() approach:

import numpy as np
df['ID'] = np.where(df['ID'].str.startswith(('23','26')), df['ID'].str.replace('A', ''), df['ID'])
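
For completeness, here is a self-contained sketch of the np.where() variant, assuming the sample frame from the question; the names mask, stripped and result are only illustrative. Unlike the masked loc assignment, this computes the replacement for every row and then picks, row by row, between the stripped and the original value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['26A20', '12A20', '23A20'], 'Data': [123, 123, 123]})

mask = df['ID'].str.startswith(('23', '26'))                # rows to change
stripped = df['ID'].str.replace('A', '', regex=False)      # replacement computed for all rows

# Take the stripped value where mask is True, otherwise keep the original ID.
df['ID'] = np.where(mask, stripped, df['ID'])
result = df['ID'].tolist()
```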
zipa