Stripping a string and getting start index and end index

Question

Is there any straightforward way in Python to strip a string and get the start index and the end index?

Example: Given the string '  hello world!   ', I want the stripped string 'hello world!' as well as the start index 2 and the end index 14.

'  hello world!   '.strip() only returns the stripped string.

I could write a function:

def strip(str):
    '''
    Take a string as input.
    Return the stripped string as well as the start index and end index.
    Example: '  hello world!   '  --> ('hello world!', 2, 14)
    The function isn't computationally efficient as it does more than one pass on the string.
    '''
    str_stripped = str.strip()
    index_start = str.find(str_stripped)
    index_end = index_start + len(str_stripped)
    return str_stripped, index_start, index_end

def main():
    str = '  hello world!   '
    str_stripped, index_start, index_end = strip(str)
    print('index_start: {0}\tindex_end: {1}'.format(index_start, index_end))

if __name__ == "__main__":
    main()

but I wonder whether Python or a popular library provides a built-in way to do this.

I don't think there's a built-in way. Your code is really concise, it's really just the three lines of `str_stripped = str.strip()`, `index_start = str.find(str_stripped)`, and `index_end = index_start + len(str_stripped)`. All the rest is superfluous. — Luke Taylor, Mar 27 '16 at 15:57
@LukeTaylor: It's concise but as the comments say he's doing more than one pass on the string. Of course you could code a `strip()` function that returns the desired output while doing just one pass, though. — Chuck, Mar 27 '16 at 16:04
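The single-pass idea from the comment above could look roughly like this. It is only a sketch: the name `strip_with_span` is made up for illustration, and it assumes that `str.isspace()` treats the same characters as whitespace as `str.strip()` does with no arguments.

```python
def strip_with_span(text):
    """Return (stripped, start, end) such that text[start:end] is the stripped string."""
    start, end = 0, len(text)
    # Advance past leading whitespace.
    while start < end and text[start].isspace():
        start += 1
    # Back up over trailing whitespace.
    while end > start and text[end - 1].isspace():
        end -= 1
    return text[start:end], start, end


print(strip_with_span('  hello world!   '))  # ('hello world!', 2, 14)
```

Only the whitespace at the two ends is inspected, and the end index is exclusive, as in the question.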

score 6 · Accepted Answer · edited May 23 '17 at 11:50

One option (probably not the most straightforward) would be to use regular expressions:

>>> import re
>>> s = '  hello world!   '
>>> match = re.search(r"^\s*(\S.*?)\s*$", s)
>>> match.group(1), match.start(1), match.end(1)
('hello world!', 2, 14)

where, in the pattern `^\s*(\S.*?)\s*$`:

`^` matches the beginning of the string
`\s*` matches zero or more whitespace characters
`(\S.*?)` is a capturing group that captures one non-whitespace character followed by as few further characters as possible (non-greedy)
`$` matches the end of the string
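Note that this pattern returns no match for an empty or all-whitespace string (the `\S` needs at least one non-whitespace character) and for strings containing embedded newlines (`.` does not match a newline by default). A possible variant that handles those cases and compiles the pattern once is sketched below. The names `_SPAN_RE` and `strip_with_span_re` are hypothetical, and the sketch assumes that `\s` and `str.strip()` agree on what counts as whitespace for your inputs.

```python
import re

# Compiled once up front; DOTALL lets "." span embedded newlines,
# and \Z anchors at the very end of the string.
_SPAN_RE = re.compile(r"\s*(.*?)\s*\Z", re.DOTALL)


def strip_with_span_re(text):
    """Return (stripped, start, end) using the capture group's span."""
    match = _SPAN_RE.match(text)  # every part is optional, so this always matches
    return match.group(1), match.start(1), match.end(1)


print(strip_with_span_re('  hello world!   '))  # ('hello world!', 2, 14)
```

For an all-whitespace input this returns an empty string with `start == end == len(text)`.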


answered Mar 27 '16 at 16:05

alecxe

This won't work correctly if there is no whitespace to be stripped. Using `'\s*'` instead should help – user2390182 Mar 27 '16 at 16:15
I was going to post something similar. In the tests I did, it actually seems to be the fastest way, but only if you compile the pattern first; otherwise it is quite a bit slower than the OP's own code – Padraic Cunningham Mar 27 '16 at 16:25

score 3 · Answer 2 · edited Mar 27 '16 at 16:12

An efficient way to do this is to call `lstrip` and `rstrip` separately and compute the indices from the string lengths. For example:

s = '  hello world!   '
s2 = s.lstrip()
s3 = s2.rstrip()
ix = len(s) - len(s2)
ix2 = len(s3) + ix

This gives:

>>> s3
'hello world!'
>>> ix
2
>>> ix2
14
>>>
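Wrapped into a function, the same steps might look like the sketch below. The name `strip_with_span_lr` is made up for illustration; the end index is exclusive, matching the 14 in the question.

```python
def strip_with_span_lr(text):
    """Return (stripped, start, end) computed from lstrip/rstrip lengths."""
    left_stripped = text.lstrip()
    stripped = left_stripped.rstrip()
    # Number of leading characters removed is the start index.
    start = len(text) - len(left_stripped)
    end = start + len(stripped)
    return stripped, start, end


print(strip_with_span_lr('  hello world!   '))  # ('hello world!', 2, 14)
```

Because the indices come from length differences, no substring search is needed. For an all-whitespace input this yields `start == end == len(text)`, whereas the `find`-based version in the question yields 0 for both, since `find('')` returns 0.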

answered Mar 27 '16 at 16:05

Tom Karzes

score 0 · Answer 3 · answered Mar 28 '16 at 00:16

In fact, you already have the methods you need for this task: `strip`, `find`, and `len`.

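s = '  hello world!   '
s1 = s.strip()
first_index = s.find(s1)
# Inclusive index of the last kept character (13 here);
# omit the "- 1" to get the exclusive end index 14 from the question.
end_index = first_index + len(s1) - 1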
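To see how an inclusive last index relates to the exclusive end index that Python slicing expects, here is a small self-check. The variable name `last_index` is introduced here only for illustration.

```python
s = '  hello world!   '
s1 = s.strip()
first_index = s.find(s1)
last_index = first_index + len(s1) - 1  # inclusive: position of '!'
end_index = first_index + len(s1)       # exclusive: one past the last character

# Slicing takes the exclusive end, so the inclusive index needs "+ 1".
assert s[first_index:end_index] == s1
assert s[first_index:last_index + 1] == s1
assert s[last_index] == '!'
print(first_index, last_index, end_index)  # 2 13 14
```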

Farhad Maleki
