Checking whether a string starts with XXXX

Question

I would like to know how to check whether a string starts with "hello" in Python.

In Bash I usually do:

if [[ "$string" =~ ^hello ]]; then
 do something here
fi

How do I achieve the same in Python?

score 829 · Accepted Answer · edited Feb 04 '20 at 17:17

829

aString = "hello world"
aString.startswith("hello")

More info about startswith.

edited Feb 04 '20 at 17:17

Georgy

12,464
7
65
73

answered Jan 10 '12 at 11:56

RanRag

48,359
38
114
167

score 135 · Answer 2 · edited May 23 '17 at 12:18

135

RanRag has already answered it for your specific question.

However, more generally, what you are doing with

if [[ "$string" =~ ^hello ]]

is a regex match. To do the same in Python, you would do:

import re
if re.match(r'^hello', somestring):
    # do stuff

Obviously, in this case, somestring.startswith('hello') is better.

edited May 23 '17 at 12:18

Community

1
1

answered Jan 10 '12 at 12:11

Shawabawa

2,499
1
15
17

5

Just wanted to add that for what I was doing, re.match and re.sub was always significantly slower than any other method. – Michał Leon May 07 '17 at 19:15

score 62 · Answer 3 · edited Jun 17 '20 at 00:32

62

In case you want to match multiple words to your magic word, you can pass the words to match as a tuple:

>>> magicWord = 'zzzTest'
>>> magicWord.startswith(('zzz', 'yyy', 'rrr'))
True

startswith takes a string or a tuple of strings.

edited Jun 17 '20 at 00:32

wjandrea

28,235
9
60
81

answered Nov 10 '17 at 19:16

user1767754

23,311
18
141
164

score 31 · Answer 4 · answered Sep 24 '16 at 10:17

31

Can also be done this way..

regex=re.compile('^hello')

## THIS WAY YOU CAN CHECK FOR MULTIPLE STRINGS
## LIKE
## regex=re.compile('^hello|^john|^world')

if re.match(regex, somestring):
    print("Yes")

answered Sep 24 '16 at 10:17

Aseem Yadav

728
7
15

Red · Answer 5 · 2021-11-10T23:29:54.100

I did a little experiment to see which of these methods

string.startswith('hello')
string.rfind('hello') == 0
string.rpartition('hello')[0] == ''
string.rindex('hello') == 0

are most efficient to return whether a certain string begins with another string.

Here is the result of one of the many test runs I've made, where each list is ordered to show the least time it took (in seconds) to parse 5 million of each of the above expressions during each iteration of the while loop I used:

['startswith: 1.37', 'rpartition: 1.38', 'rfind: 1.62', 'rindex: 1.62']
['startswith: 1.28', 'rpartition: 1.44', 'rindex: 1.67', 'rfind: 1.68']
['startswith: 1.29', 'rpartition: 1.42', 'rindex: 1.63', 'rfind: 1.64']
['startswith: 1.28', 'rpartition: 1.43', 'rindex: 1.61', 'rfind: 1.62']
['rpartition: 1.48', 'startswith: 1.48', 'rfind: 1.62', 'rindex: 1.67']
['startswith: 1.34', 'rpartition: 1.43', 'rfind: 1.64', 'rindex: 1.64']
['startswith: 1.36', 'rpartition: 1.44', 'rindex: 1.61', 'rfind: 1.63']
['startswith: 1.29', 'rpartition: 1.37', 'rindex: 1.64', 'rfind: 1.67']
['startswith: 1.34', 'rpartition: 1.44', 'rfind: 1.66', 'rindex: 1.68']
['startswith: 1.44', 'rpartition: 1.41', 'rindex: 1.61', 'rfind: 2.24']
['startswith: 1.34', 'rpartition: 1.45', 'rindex: 1.62', 'rfind: 1.67']
['startswith: 1.34', 'rpartition: 1.38', 'rindex: 1.67', 'rfind: 1.74']
['rpartition: 1.37', 'startswith: 1.38', 'rfind: 1.61', 'rindex: 1.64']
['startswith: 1.32', 'rpartition: 1.39', 'rfind: 1.64', 'rindex: 1.61']
['rpartition: 1.35', 'startswith: 1.36', 'rfind: 1.63', 'rindex: 1.67']
['startswith: 1.29', 'rpartition: 1.36', 'rfind: 1.65', 'rindex: 1.84']
['startswith: 1.41', 'rpartition: 1.44', 'rfind: 1.63', 'rindex: 1.71']
['startswith: 1.34', 'rpartition: 1.46', 'rindex: 1.66', 'rfind: 1.74']
['startswith: 1.32', 'rpartition: 1.46', 'rfind: 1.64', 'rindex: 1.74']
['startswith: 1.38', 'rpartition: 1.48', 'rfind: 1.68', 'rindex: 1.68']
['startswith: 1.35', 'rpartition: 1.42', 'rfind: 1.63', 'rindex: 1.68']
['startswith: 1.32', 'rpartition: 1.46', 'rfind: 1.65', 'rindex: 1.75']
['startswith: 1.37', 'rpartition: 1.46', 'rfind: 1.74', 'rindex: 1.75']
['startswith: 1.31', 'rpartition: 1.48', 'rfind: 1.67', 'rindex: 1.74']
['startswith: 1.44', 'rpartition: 1.46', 'rindex: 1.69', 'rfind: 1.74']
['startswith: 1.44', 'rpartition: 1.42', 'rfind: 1.65', 'rindex: 1.65']
['startswith: 1.36', 'rpartition: 1.44', 'rfind: 1.64', 'rindex: 1.74']
['startswith: 1.34', 'rpartition: 1.46', 'rfind: 1.61', 'rindex: 1.74']
['startswith: 1.35', 'rpartition: 1.56', 'rfind: 1.68', 'rindex: 1.69']
['startswith: 1.32', 'rpartition: 1.48', 'rindex: 1.64', 'rfind: 1.65']
['startswith: 1.28', 'rpartition: 1.43', 'rfind: 1.59', 'rindex: 1.66']

I believe that it is pretty obvious from the start that the startswith method would come out the most efficient, as returning whether a string begins with the specified string is its main purpose.

What surprises me is that the seemingly impractical string.rpartition('hello')[0] == '' method always finds a way to be listed first, before the string.startswith('hello') method, every now and then. The results show that using str.partition to determine if a string starts with another string is more efficient then using both rfind and rindex.

Another thing I've noticed is that string.rfind('hello') == 0 and string.rindex('hello') == 0 have a good battle going on, each rising from fourth to third place, and dropping from third to fourth place, which makes sense, as their main purposes are the same.

Here is the code:

from time import perf_counter

string = 'hello world'
places = dict()

while True:
    start = perf_counter()
    for _ in range(5000000):
        string.startswith('hello')
    end = perf_counter()
    places['startswith'] = round(end - start, 2)

    start = perf_counter()
    for _ in range(5000000):
        string.rfind('hello') == 0
    end = perf_counter()
    places['rfind'] = round(end - start, 2)

    start = perf_counter()
    for _ in range(5000000):
        string.rpartition('hello')[0] == ''
    end = perf_counter()
    places['rpartition'] = round(end - start, 2)

    start = perf_counter()
    for _ in range(5000000):
        string.rindex('hello') == 0
    end = perf_counter()
    places['rindex'] = round(end - start, 2)
    
    print([f'{b}: {str(a).ljust(4, "4")}' for a, b in sorted(i[::-1] for i in places.items())])

Checking whether a string starts with XXXX

5 Answers5

Linked

Related