-6

Which of the following regular expressions can be used to get the domain name?

I tried the following code, but it doesn't work. Is there something I'm doing wrong?

The other options are shown in the picture.

txt = 'I refer to https://google.com and i never refer http://www.baidu.com'
print(txt.findall(?<=https:\/\/)([A-Za-z0-9.]*))

[Image: the other answer options]

Barmar
Greka

2 Answers

0

Here's a regex that will match your URLs:

http(s?)://(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9]

It matches URLs such as https://stackoverflow.com, http://example.com, and https://example.com.

If you don't want the http or https just use this:

(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9]
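
As a rough check, the scheme-less pattern can be run against the sample string from the question with Python's re module. This is only a sketch: the names DOMAIN_RE and find_domains are illustrative, and re.IGNORECASE is an added assumption, since the pattern itself lists only lowercase letters.

```python
import re

# Scheme-less pattern from the answer, compiled once.
# re.IGNORECASE is an assumption added here so uppercase hosts also match.
DOMAIN_RE = re.compile(
    r'(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9]',
    re.IGNORECASE,
)


def find_domains(text):
    """Return every substring of text that looks like a dotted host name."""
    # All groups in the pattern are non-capturing, so findall returns whole matches.
    return DOMAIN_RE.findall(text)


txt = 'I refer to https://google.com and i never refer http://www.baidu.com'
print(find_domains(txt))  # ['google.com', 'www.baidu.com']
```

Note that the first pattern, with http(s?) in front, contains a capturing group, so re.findall() would return only that group's contents ('s' or '') rather than the full URL; re.finditer() with match.group(0) avoids that.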
Halmon
  • The question was updated with a multiple-choice quiz. He's not looking for a general solution, just the best one of the choices. – Barmar Nov 03 '20 at 02:41
0

You selected the correct regexp; you just have to quote it as a string to use it in Python. You also need to call re.findall(), because findall is a function in the re module, not a string method.

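import re

txt = 'I refer to https://google.com and i never refer http://www.baidu.com'
# The pattern is passed as a raw string; the lookbehind keeps "https://" out of the result
print(re.findall(r'(?<=https:\/\/)([A-Za-z0-9.]*)', txt))
# ['google.com']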
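
The lookbehind in the pattern above is tied to the literal https://, so on the sample string it picks up google.com but not the http:// address. Python's re module requires lookbehinds to be fixed-width, so writing (?<=https?://) is rejected. If both schemes should be covered, one possible variation (not one of the quiz options) is to match the scheme normally and capture only the host. The names HOST_RE and extract_hosts below are hypothetical.

```python
import re

# Hypothetical variation: match the scheme outright and capture only the host part.
HOST_RE = re.compile(r'https?://([A-Za-z0-9.]+)')


def extract_hosts(text):
    """Return the host part of every http:// or https:// URL in text."""
    # With exactly one capturing group, findall returns that group's contents.
    return HOST_RE.findall(text)


txt = 'I refer to https://google.com and i never refer http://www.baidu.com'
print(extract_hosts(txt))  # ['google.com', 'www.baidu.com']
```

Because the character class includes the dot, a URL that ends a sentence would have the trailing period captured as well; the class also excludes hyphens, so hosts containing them would be cut short.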
Barmar