find all website addresses in the input text (Python)

Question

I need to find all website addresses in the input text and print them in the order they appear in the text, each on a new line. An address starts with "https://", "http://" or "www.".

I used split on the string, but I can't return the words that start with "www". Can someone explain how I can solve this?

Sample Input 1:

WWW.GOOGLE.COM uses 100-percent renewable energy sources and www.ecosia.com plants a tree for every 45 searches!

Sample Output 1:

WWW.GOOGLE.COM

www.ecosia.com

text = input()
text = text.lower()
words = text.split(" ")
for word in words:

You should take a look at [How to ask a good question](https://stackoverflow.com/help/how-to-ask). Your question is not clear. — Nurqm, Sep 02 '20 at 15:24

score 0 · Answer 1 · answered Sep 02 '20 at 04:16

What I would do is look for "www", since every URL in the sample begins with that and ends with a space, put each match in an array, and then print them. Python has a lot of string functions in its library, but I don't know many of them.

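text = "WWW.GOOGLE.COM uses 100-percent renewable energy sources and www.ecosia.com plants a tree for every 45 searches!"
lowered = text.lower()  # search a lowercase copy so "WWW." matches too
all_urls = []
i = 0
while i < len(text):
    # a URL starts with "www." at the beginning of the text or right after a space
    if lowered.startswith("www.", i) and (i == 0 or text[i - 1] == " "):
        k = i
        # advance to the next space or the end of the text
        while k < len(text) and text[k] != " ":
            k += 1
        all_urls.append(text[i:k])  # slice the original text to keep its case
        i = k
    else:
        i += 1
for url in all_urls:
    print(url)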
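Since the question starts from split, the same idea can also be sketched with `split` and `startswith` instead of a manual index loop. This is only a sketch: the function name `find_addresses` is made up for illustration, and it assumes addresses are separated from other words by whitespace, so trailing punctuation attached to an address would be kept.

```python
def find_addresses(text):
    """Return the whitespace-separated words that look like website addresses."""
    prefixes = ("https://", "http://", "www.")
    # str.startswith accepts a tuple of prefixes. Only a lowercase copy of each
    # word is compared, so the original spelling (e.g. "WWW.GOOGLE.COM") is kept.
    return [word for word in text.split() if word.lower().startswith(prefixes)]


sample = ("WWW.GOOGLE.COM uses 100-percent renewable energy sources and "
          "www.ecosia.com plants a tree for every 45 searches!")
for address in find_addresses(sample):
    print(address)
# prints:
# WWW.GOOGLE.COM
# www.ecosia.com
```

Lowercasing only inside the comparison, rather than lowercasing the whole input first, is what keeps the output identical to Sample Output 1.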

score 0 · Accepted Answer · answered Sep 02 '20 at 05:40

A better way is to use a regex.

import re
# Case-insensitive pattern for addresses that start with http://, https://, or www.
url_regex = r"(?i)(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})"
raw_string = "WWW.GOOGLE.COM uses 100-percent renewable energy sources and www.ecosia.com plants a tree for every 45 searches!"
urls = re.findall(url_regex, raw_string)  # matches come back in order of appearance
for url in urls:
    print(url)  # each address on its own line
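If the task only needs the three prefixes named in the question, a much shorter pattern may be enough. The sketch below is an assumption about what counts as an address, not a general URL validator: it treats any run of non-whitespace characters that starts with one of the prefixes as an address, and the names `ADDRESS_RE` and `extract_addresses` are hypothetical.

```python
import re

# Assumed definition of an address: one of the three prefixes, at the start of
# the text or right after whitespace, followed by non-whitespace characters.
ADDRESS_RE = re.compile(r"(?<!\S)(?:https?://|www\.)\S+", re.IGNORECASE)


def extract_addresses(text):
    """Return all matches in the order they appear in the text."""
    return ADDRESS_RE.findall(text)


sample = ("WWW.GOOGLE.COM uses 100-percent renewable energy sources and "
          "www.ecosia.com plants a tree for every 45 searches!")
print("\n".join(extract_addresses(sample)))
# prints:
# WWW.GOOGLE.COM
# www.ecosia.com
```

Because the group is non-capturing, `findall` returns the full matched strings, and `re.IGNORECASE` lets the match succeed on "WWW." without changing the case of the output.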
