8

Hi everyone! I am trying to debug someone's code and I have found the problem. The program loops through an array of strings and counts certain endings. The problem is that some of these strings end with _, so the counting goes wrong. I would like to use regex, but I am not experienced enough. Could someone help me?

I would like to loop through the array, check whether each string ends with one or more _ characters, trim all of them off, and put the trimmed strings back in the array!

Update

Thanks for the rstrip suggestion! I have tried to write code that works with my data, but no luck yet...

data_trimmed = []
for x in data:
    x.rstrip('_')
    data_trimmed.append(x)

print(data_trimmed)

But this still prints: ['Anna__67_______', 'Dyogo_3__', 'Kiki_P1_', 'BEN_40001__', .... ]

Anna Jeanine
  • You can use `rstrip('_')` to remove trailing underscores, e.g. `some_string.rstrip('_')` – EdChum Nov 22 '16 at 11:27
  • Will this remove all `_`'s in the string or just at the end? – Anna Jeanine Nov 22 '16 at 11:29
  • Just at the end, try it: `'__as_das___'.rstrip('_')` – EdChum Nov 22 '16 at 11:30
  • What Ed said. There's no need to bother testing for trailing underscores, just call [`.rstrip`](https://docs.python.org/3/library/stdtypes.html#str.rstrip) on every line. It can test for the specified chars at C speed, faster than you can do it with an explicit Python test. – PM 2Ring Nov 22 '16 at 11:32
  • You can use a list comprehension to modify all strings in your list, see my updated answer – EdChum Nov 22 '16 at 11:41
  • `x.rstrip('_')` doesn't modify `x`: Python strings are _immutable_, so string methods can't modify the original; they have to return a new string. On a related note, please see [this excellent answer](http://stackoverflow.com/a/29604031/4014959) by abarnert. – PM 2Ring Nov 22 '16 at 11:44

2 Answers

9

You can use rstrip('_') to remove trailing underscores:

In [15]:
'__as_das___'.rstrip('_')

Out[15]:
'__as_das'

So you can see that any leading underscores and any in the middle of the string are unaffected; see the docs: https://docs.python.org/2/library/string.html#string-functions
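
One detail that may be worth knowing here: the argument to `rstrip` is treated as a set of characters, not as a literal suffix, so passing several characters strips any trailing mix of them. A minimal sketch (the sample strings are made up for illustration and are not from the question's data):

```python
# rstrip treats its argument as a set of characters, not a literal suffix.
# The sample strings below are hypothetical, chosen only to show the difference.
samples = ['__as_das___', 'name_- _', 'no_trailing']

# Strip trailing underscores only.
only_underscores = [s.rstrip('_') for s in samples]

# Strip any trailing run made up of underscores, dashes and spaces.
underscores_dashes_spaces = [s.rstrip('_- ') for s in samples]

print(only_underscores)
print(underscores_dashes_spaces)
```

In the first list `'name_- _'` only loses its final underscore, because stripping stops at the first trailing character that is not in the set.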

To answer your updated question you can use a list comprehension to update each string in the list:

In [18]:
a = ['Anna__67_______', 'Dyogo_3__', 'Kiki_P1_', 'BEN_40001__']
a = [x.rstrip('_') for x in a]
a

Out[18]:
['Anna__67', 'Dyogo_3', 'Kiki_P1', 'BEN_40001']
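
If you would rather keep the explicit loop from the question, it should work once the return value of `rstrip` is used instead of discarded. A minimal sketch, assuming `data` is a plain list of strings like the one shown in the question:

```python
# Assumed input: a plain list of strings, as printed in the question.
data = ['Anna__67_______', 'Dyogo_3__', 'Kiki_P1_', 'BEN_40001__']

data_trimmed = []
for x in data:
    # Strings are immutable: rstrip returns a new string and leaves x alone,
    # so the result has to be stored rather than thrown away.
    data_trimmed.append(x.rstrip('_'))

print(data_trimmed)
```

The original `data` list is left untouched; only `data_trimmed` holds the stripped strings.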
EdChum
4

Use the string rstrip method to strip off unwanted trailing _ characters:

s = 'anything__'
s = s.rstrip('_') # s becomes 'anything'

Regex is a bit overkill for this, but it can be done as follows:

import re
s = 'anything__'
s = re.sub('_+$', '', s)  # s becomes 'anything'
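
If you do go the regex route for a whole list, a sketch along these lines may help. It assumes the same sample list as in the question, and it uses `\Z` instead of `$` as a deliberate swap: `$` also matches just before a trailing newline, while `\Z` only matches at the very end of the string, which is closer to what `rstrip('_')` does.

```python
import re

# Precompile the pattern once. \Z anchors at the very end of the string,
# whereas $ would also match just before a trailing newline.
trailing_underscores = re.compile(r'_+\Z')

# Assumed input: the sample list from the question.
data = ['Anna__67_______', 'Dyogo_3__', 'Kiki_P1_', 'BEN_40001__']

# Remove the trailing run of underscores from every string.
data_trimmed = [trailing_underscores.sub('', s) for s in data]
print(data_trimmed)
```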
Skycc