re.search() or 'in', re.match() or startswith()?

Question

I am learning how to use the re library in Python and a question flashed through my mind. Please forgive me if this sounds stupid. I am new to this stuff. :)

According to this answer:

re.search - find something anywhere in the string
re.match - find something at the beginning of the string

Now I have this code:

from re import search
# named text rather than str, so the built-in str type is not shadowed
text = "Yay, I am on StackOverflow. I am overjoyed!"
if search('am', text):  # not considering regex
    print('True')  # prints True
if 'am' in text:
    print('True')  # prints True

And this:

from re import match
# named text rather than str, so the built-in str type is not shadowed
text = "Yay, I am on Stack Overflow. I am overjoyed!"
if match('Yay', text):  # not considering regex
    print('True')  # prints True
if text.startswith('Yay'):
    print('True')  # prints True

So my question is: which one should I use for similar tasks that do not need regular expressions, such as fetching the contents of a web page and searching within them? Should I use the built-ins as above, or the standard re library? Which one will make the code more optimised/efficient?

Any help will be much appreciated. Thank you!

For web pages, regexes are not advised; they are not robust enough. Use an HTML parser such as BeautifulSoup. - Corentin Limier, Oct 21 '18 at 15:06
For simple searches, regex is overkill. `startswith` and `endswith` also accept a tuple of options, which makes them very flexible. For example, when matching filename extensions, you can use `filename.endswith(('.jpg', '.png', '.svg'))`. - Mario Camilleri, Oct 21 '18 at 15:08
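
To make the tuple form mentioned in the comment above concrete, here is a minimal sketch. The helper name `is_image`, the constant `IMAGE_EXTENSIONS`, and the sample file names are made up for illustration; lowercasing the name first is an extra assumption so that `.JPG` also counts.

```python
# Hypothetical tuple of extensions, taken from the comment's example.
IMAGE_EXTENSIONS = ('.jpg', '.png', '.svg')

def is_image(filename):
    """Return True if filename ends with any of the listed extensions."""
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return filename.lower().endswith(IMAGE_EXTENSIONS)

print(is_image('photo.JPG'))   # True
print(is_image('notes.txt'))   # False

# startswith takes a tuple in the same way.
print('https://example.com'.startswith(('http://', 'https://')))  # True
```

Note that the argument has to be a tuple; passing a list raises a TypeError.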

score 0 · Answer 1 · answered Oct 21 '18 at 15:16

Regex is mostly used for complex match, search and replace operations, while a built-in keyword such as 'in' is mostly used for simple operations like replacing a single word by another. Normally the 'in' keyword is preferred. In terms of performance, 'in' is faster. However, if you could use 'in' but a regex offers a much more elegant solution than writing a lot of 'if' statements, use the regex.
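
One rough way to check the performance claim on your own machine is the standard timeit module. This is only a sketch: the sample string comes from the question, the iteration count is arbitrary, and the absolute numbers will differ between machines and Python versions.

```python
import re
import timeit

text = "Yay, I am on Stack Overflow. I am overjoyed!"
pattern = re.compile('am')  # precompiled so compilation cost is not timed

# Both checks should agree on whether 'am' occurs in the text.
found_with_in = 'am' in text
found_with_re = pattern.search(text) is not None

runs = 100_000  # arbitrary iteration count for this sketch
time_in = timeit.timeit(lambda: 'am' in text, number=runs)
time_re = timeit.timeit(lambda: pattern.search(text) is not None, number=runs)

print(f"in:        {time_in:.4f} s for {runs} runs")
print(f"re.search: {time_re:.4f} s for {runs} runs")
```

The same pattern works for comparing `str.startswith` against `re.match`; just swap the two lambdas.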

When you are fetching content from a web page and searching within it, the same rule applies.
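
One practical difference when searching page text for a fixed string: 'in' treats the string literally, while re.search treats it as a pattern, so characters such as '$' and '.' take on special meaning unless they are escaped. A small sketch, where `page_text` is a made-up stand-in for text taken from a fetched page:

```python
import re

# Hypothetical snippet standing in for text extracted from a fetched page.
page_text = "Special offer: only $9.99 (today)."
needle = "$9.99"

# Plain substring test: the needle is taken literally.
literal_hit = needle in page_text

# Unescaped, '$' is an end-of-string anchor and '.' matches any character,
# so this pattern does not find the literal price.
unescaped_hit = re.search(needle, page_text) is not None

# re.escape() makes the regex treat every character literally.
escaped_hit = re.search(re.escape(needle), page_text) is not None

print(literal_hit, unescaped_hit, escaped_hit)  # True False True
```

So if you do reach for re with a string that is not meant as a pattern, wrapping it in re.escape() keeps the behaviour in line with 'in'.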

Hope this helps.

`in` is not really helpful for replacing a word, since it only gives you a Boolean result; it doesn't capture the actual word. Also, neither is useful for parsing something like HTML from a web page. - roganjosh, Oct 21 '18 at 16:36
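
The point about the Boolean result can be seen in a short sketch. The sample string is the one from the question; the replacement word 'was' and the variable names are chosen only for illustration.

```python
import re

text = "Yay, I am on Stack Overflow. I am overjoyed!"

# 'in' only answers yes or no.
print('am' in text)  # True

# re.search returns a match object (or None) that carries the matched
# text and its position in the string.
m = re.search(r'\bam\b', text)
if m:
    print(m.group(), m.start(), m.end())  # am 7 9

# Replacing is a separate operation either way: str.replace for a fixed
# string, re.sub for a pattern.
replaced_plain = text.replace('am', 'was')
replaced_regex = re.sub(r'\bam\b', 'was', text)
print(replaced_plain)
```

For this particular string both replacements give the same result; the `\b` word boundaries only matter when 'am' also appears inside longer words.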
