Python regex pattern that searches for a domain name

Question

I have a list of links, and some of them look like

https://www.domainname
or https://domainname

I need a regex pattern that extracts only the domain name from each link. The "www" prefix causes problems with my current pattern:

print(re.findall("//([a-zA-Z]+)", i))

You can create an optional non-capturing group - `re.findall(r"//(?:www\.)?([a-zA-Z]+)", i)` — Wiktor Stribiżew, Sep 02 '22 at 13:22
Maybe https://stackoverflow.com/questions/44021846/extract-domain-name-from-url-in-python can help — User, Sep 03 '22 at 00:05
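If a regex is not strictly required, the standard library can split the URL first. This is a minimal sketch, assuming the links always carry a scheme such as `https://`; `domain_name` and `links` are hypothetical names used only for illustration.

```python
from urllib.parse import urlsplit


def domain_name(url):
    """Return the host part of url without a leading 'www.' (hypothetical helper)."""
    # hostname is None when the URL has no network location, so fall back to "".
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


# The two link shapes from the question.
links = ["https://www.domainname", "https://domainname"]
names = [domain_name(link) for link in links]
print(names)  # ['domainname', 'domainname']
```

Note that this returns the whole host, so a link like `https://www.domainname.com` would give `domainname.com` rather than `domainname`.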

score 0 · Answer 1 · answered Sep 02 '22 at 13:59

You could anchor the match to the end of the string with `$`. This works when the domain name is the last thing in the URL.

import re

url = "https://www.domainname"
url2 = "https://domainname"

for u in [url, url2]:
    print(u)
    # \w+$ matches the last run of word characters in the string
    print(re.findall(r"\w+$", u))

https://www.domainname
['domainname']
https://domainname
['domainname']
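Anchoring at the end only gives the domain name when nothing follows it. A minimal sketch of that limitation, assuming a third, hypothetical URL with a path added to the two from the question (`last_word` is an illustrative helper, not part of the answer):

```python
import re

# The first two URLs come from the question; the third is a hypothetical one with a path.
urls = ["https://www.domainname", "https://domainname", "https://www.domainname/page"]


def last_word(url):
    """Return the trailing run of word characters, or None if there is none."""
    match = re.search(r"\w+$", url)
    return match.group(0) if match else None


results = [last_word(u) for u in urls]
print(results)  # ['domainname', 'domainname', 'page']
```

For the third URL the pattern returns the path segment, so this approach fits only lists where every link ends with the domain name.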

LetzerWille


score 0 · Answer 2 · 2022-09-02T14:20:29.880

import re

with open('testfile.txt', 'r') as file:
    readfile = file.read()

# Optional scheme, one subdomain label (e.g. "www."), then the domain name and the TLD
search = re.finditer(r'(?:\w+://)?(?:\w+\.)(\w+)(\.\w+)', readfile)

for check in search:
    print(check.group(1))  # group 1 holds only the domain name

Result:

domainname
example
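The contents of `testfile.txt` are not shown, so here is a minimal sketch of the same pattern run on in-memory text. The `sample_text` value is an assumption chosen to be consistent with the result above, plus one link without a `www.` prefix.

```python
import re

# Assumed sample text; the real contents of testfile.txt are not shown in the answer.
sample_text = (
    "https://www.domainname.com\n"
    "http://www.example.org\n"
    "https://domainname.com\n"
)

# Same pattern as above: optional scheme, one subdomain label, domain name, TLD.
pattern = re.compile(r"(?:\w+://)?(?:\w+\.)(\w+)(\.\w+)")

names = [m.group(1) for m in pattern.finditer(sample_text)]
print(names)  # ['domainname', 'example']
```

The third line produces no match because the pattern requires a subdomain label before the domain name, so links without `www.` (or another subdomain) are skipped.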

edited Sep 02 '22 at 14:20

answered Sep 02 '22 at 14:05

score 0 · Answer 3 · answered Sep 02 '22 at 14:07

My solution:

import re

l1 = ["https://www.domainname1", "https://domainname2"]
for i in l1:
    # (?:www\.)? skips an optional "www." prefix without capturing it
    print(re.findall(r"/(?:www\.)?(\w+)", i))

Output:

['domainname1']
['domainname2']

Shahab Rahnama
