How do you remove parts of a string in a column using rstrip in Pandas?

Question

Code before rstrip:

column_names = lh_Area_Base_V2.columns.tolist()
for i, val in enumerate(column_names[1:]):
    column_names[i+1] += '_Base_V2'
column_names[0] = 'Subj_ID'
# Replace the column names with a new name
lh_Area_Base_V2.columns = column_names
lh_Area_Base_V2.head()

Code with rstrip (to drop "_V2" from the end of first column values):

column_names = lh_Area_Base_V2.columns.tolist()
for i, val in enumerate(column_names[1:]):
    column_names[i+1] += '_Base_V2'
column_names[0] = 'Subj_ID'
lh_Area_Base_V2['Subj_ID'] = lh_Area_Base_V2['Subj_ID'].map(lambda x: x.lstrip().rstrip('_V2'))
# Replace the column names with a new name
lh_Area_Base_V2.columns = column_names
lh_Area_Base_V2.head()

Problem: Why does the ID at index 1 lose the trailing 2 from its value? I did not ask for that; I only passed "_V2" to rstrip, expecting just that suffix to be dropped.

I would appreciate any suggestions for a fix.

`rstrip()` removes *any* from the character list, not *all*. For example `"Hello".rstrip('Ao2')` returns `'Hell'`. — chrisaycock, Aug 23 '18 at 16:17
I'm marking this as a duplicate since Pandas `str` method behaviour mimics exactly regular Python string methods. — jpp, Aug 23 '18 at 16:20

score 3 · Accepted Answer · answered Aug 23 '18 at 16:17

This is expected behavior of rstrip:

The chars argument is a string specifying the set of characters to be removed

It is not just stripping the string _V2, it will strip any of the contained characters, including the 2 at the end of your second row.
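To make the behavior concrete, here is a small plain-Python sketch. The IDs are hypothetical values modelled on the question's data, and `drop_suffix` is an illustrative helper name, not something from the original post.

```python
# Hypothetical IDs modelled on the question's Subj_ID column
ids = ["SILVA001_V2", "SILVA002_V2", "SILVA004_V2"]

# rstrip treats '_V2' as the character set {'_', 'V', '2'} and keeps
# removing trailing characters until it meets one outside that set.
stripped = [s.rstrip("_V2") for s in ids]
print(stripped)  # ['SILVA001', 'SILVA00', 'SILVA004']

def drop_suffix(s, suffix="_V2"):
    """Remove suffix only when s actually ends with that exact substring."""
    return s[:-len(suffix)] if s.endswith(suffix) else s

fixed = [drop_suffix(s) for s in ids]
print(fixed)  # ['SILVA001', 'SILVA002', 'SILVA004']
```

Only `SILVA002_V2` is damaged, because it is the only ID whose character before the suffix (`2`) is also in the set being stripped.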

Instead, you may use a regular expression to replace a trailing _V2:

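# regex=True is passed explicitly: newer pandas versions treat the pattern as a literal string by default
df.assign(Subj_ID=df.Subj_ID.str.replace(r'_V2$', '', regex=True))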

    Subj_ID  lh_bankssts_area_base_V2
0  SILVA001                       861
1  SILVA002                      1051
2  SILVA004                      1127
3  SILVA005                      1346
4  SILVA007                      1209
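For a self-contained check, the following sketch builds a small frame and removes the suffix two ways. The frame contents are assumed sample values based on the output above, and the variable names (`by_regex`, `by_slice`, `has_suffix`) are only for illustration.

```python
import pandas as pd

# Hypothetical sample frame; the last ID has no suffix, to show it is left alone
df = pd.DataFrame({
    "Subj_ID": ["SILVA001_V2", "SILVA002_V2", "SILVA004"],
    "lh_bankssts_area_base_V2": [861, 1051, 1127],
})

# Option 1: anchored regular expression, with regex=True stated explicitly
by_regex = df["Subj_ID"].str.replace(r"_V2$", "", regex=True)

# Option 2: no regex; slice off the suffix only on rows that end with it
has_suffix = df["Subj_ID"].str.endswith("_V2")
by_slice = df["Subj_ID"].where(~has_suffix, df["Subj_ID"].str[:-len("_V2")])

df = df.assign(Subj_ID=by_regex)
print(df)
```

Both options should leave `SILVA002` intact. If your pandas version is 1.4 or later, `Series.str.removesuffix('_V2')` is another way to express the same thing.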
