531

I would like to know how to check whether a string starts with "hello" in Python.

In Bash I usually do:

if [[ "$string" =~ ^hello ]]; then
 do something here
fi

How do I achieve the same in Python?

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
John Marston
  • 11,549
  • 7
  • 23
  • 18

5 Answers5

829
aString = "hello world"
aString.startswith("hello")

More info about startswith.

Georgy
  • 12,464
  • 7
  • 65
  • 73
RanRag
  • 48,359
  • 38
  • 114
  • 167
135

RanRag has already answered it for your specific question.

However, more generally, what you are doing with

if [[ "$string" =~ ^hello ]]

is a regex match. To do the same in Python, you would do:

import re
if re.match(r'^hello', somestring):
    # do stuff

Obviously, in this case, somestring.startswith('hello') is better.

Community
  • 1
  • 1
Shawabawa
  • 2,499
  • 1
  • 15
  • 17
  • 5
    Just wanted to add that for what I was doing, re.match and re.sub was always significantly slower than any other method. – Michał Leon May 07 '17 at 19:15
62

In case you want to match multiple words to your magic word, you can pass the words to match as a tuple:

>>> magicWord = 'zzzTest'
>>> magicWord.startswith(('zzz', 'yyy', 'rrr'))
True

startswith takes a string or a tuple of strings.

wjandrea
  • 28,235
  • 9
  • 60
  • 81
user1767754
  • 23,311
  • 18
  • 141
  • 164
31

Can also be done this way..

regex=re.compile('^hello')

## THIS WAY YOU CAN CHECK FOR MULTIPLE STRINGS
## LIKE
## regex=re.compile('^hello|^john|^world')

if re.match(regex, somestring):
    print("Yes")
Aseem Yadav
  • 728
  • 7
  • 15
12

I did a little experiment to see which of these methods

  • string.startswith('hello')
  • string.rfind('hello') == 0
  • string.rpartition('hello')[0] == ''
  • string.rindex('hello') == 0

are most efficient to return whether a certain string begins with another string.

Here is the result of one of the many test runs I've made, where each list is ordered to show the least time it took (in seconds) to parse 5 million of each of the above expressions during each iteration of the while loop I used:

['startswith: 1.37', 'rpartition: 1.38', 'rfind: 1.62', 'rindex: 1.62']
['startswith: 1.28', 'rpartition: 1.44', 'rindex: 1.67', 'rfind: 1.68']
['startswith: 1.29', 'rpartition: 1.42', 'rindex: 1.63', 'rfind: 1.64']
['startswith: 1.28', 'rpartition: 1.43', 'rindex: 1.61', 'rfind: 1.62']
['rpartition: 1.48', 'startswith: 1.48', 'rfind: 1.62', 'rindex: 1.67']
['startswith: 1.34', 'rpartition: 1.43', 'rfind: 1.64', 'rindex: 1.64']
['startswith: 1.36', 'rpartition: 1.44', 'rindex: 1.61', 'rfind: 1.63']
['startswith: 1.29', 'rpartition: 1.37', 'rindex: 1.64', 'rfind: 1.67']
['startswith: 1.34', 'rpartition: 1.44', 'rfind: 1.66', 'rindex: 1.68']
['startswith: 1.44', 'rpartition: 1.41', 'rindex: 1.61', 'rfind: 2.24']
['startswith: 1.34', 'rpartition: 1.45', 'rindex: 1.62', 'rfind: 1.67']
['startswith: 1.34', 'rpartition: 1.38', 'rindex: 1.67', 'rfind: 1.74']
['rpartition: 1.37', 'startswith: 1.38', 'rfind: 1.61', 'rindex: 1.64']
['startswith: 1.32', 'rpartition: 1.39', 'rfind: 1.64', 'rindex: 1.61']
['rpartition: 1.35', 'startswith: 1.36', 'rfind: 1.63', 'rindex: 1.67']
['startswith: 1.29', 'rpartition: 1.36', 'rfind: 1.65', 'rindex: 1.84']
['startswith: 1.41', 'rpartition: 1.44', 'rfind: 1.63', 'rindex: 1.71']
['startswith: 1.34', 'rpartition: 1.46', 'rindex: 1.66', 'rfind: 1.74']
['startswith: 1.32', 'rpartition: 1.46', 'rfind: 1.64', 'rindex: 1.74']
['startswith: 1.38', 'rpartition: 1.48', 'rfind: 1.68', 'rindex: 1.68']
['startswith: 1.35', 'rpartition: 1.42', 'rfind: 1.63', 'rindex: 1.68']
['startswith: 1.32', 'rpartition: 1.46', 'rfind: 1.65', 'rindex: 1.75']
['startswith: 1.37', 'rpartition: 1.46', 'rfind: 1.74', 'rindex: 1.75']
['startswith: 1.31', 'rpartition: 1.48', 'rfind: 1.67', 'rindex: 1.74']
['startswith: 1.44', 'rpartition: 1.46', 'rindex: 1.69', 'rfind: 1.74']
['startswith: 1.44', 'rpartition: 1.42', 'rfind: 1.65', 'rindex: 1.65']
['startswith: 1.36', 'rpartition: 1.44', 'rfind: 1.64', 'rindex: 1.74']
['startswith: 1.34', 'rpartition: 1.46', 'rfind: 1.61', 'rindex: 1.74']
['startswith: 1.35', 'rpartition: 1.56', 'rfind: 1.68', 'rindex: 1.69']
['startswith: 1.32', 'rpartition: 1.48', 'rindex: 1.64', 'rfind: 1.65']
['startswith: 1.28', 'rpartition: 1.43', 'rfind: 1.59', 'rindex: 1.66']

I believe that it is pretty obvious from the start that the startswith method would come out the most efficient, as returning whether a string begins with the specified string is its main purpose.

What surprises me is that the seemingly impractical string.rpartition('hello')[0] == '' method always finds a way to be listed first, before the string.startswith('hello') method, every now and then. The results show that using str.partition to determine if a string starts with another string is more efficient then using both rfind and rindex.

Another thing I've noticed is that string.rfind('hello') == 0 and string.rindex('hello') == 0 have a good battle going on, each rising from fourth to third place, and dropping from third to fourth place, which makes sense, as their main purposes are the same.

Here is the code:

from time import perf_counter

string = 'hello world'
places = dict()

while True:
    start = perf_counter()
    for _ in range(5000000):
        string.startswith('hello')
    end = perf_counter()
    places['startswith'] = round(end - start, 2)

    start = perf_counter()
    for _ in range(5000000):
        string.rfind('hello') == 0
    end = perf_counter()
    places['rfind'] = round(end - start, 2)

    start = perf_counter()
    for _ in range(5000000):
        string.rpartition('hello')[0] == ''
    end = perf_counter()
    places['rpartition'] = round(end - start, 2)

    start = perf_counter()
    for _ in range(5000000):
        string.rindex('hello') == 0
    end = perf_counter()
    places['rindex'] = round(end - start, 2)
    
    print([f'{b}: {str(a).ljust(4, "4")}' for a, b in sorted(i[::-1] for i in places.items())])
Red
  • 26,798
  • 7
  • 36
  • 58