A reasonably robust way is to take your chunk of text and search it with a URL-matching regex pattern.
See also:
Using regex...
import re
re.search(pattern, text)
... or
re.findall(pattern, text)
A full example...
>>> p = re.compile(r'(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))')
or
>>> p = re.compile('(?i)\\b((?:https?://|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:\\\'".,<>?\xc2\xab\xc2\xbb\xe2\x80\x9c\xe2\x80\x9d\xe2\x80\x98\xe2\x80\x99]))')
>>> m = p.search("javascript:changeChannel('http://some-server.com/with1234init.also', 20);")
>>> m.group()
'http://some-server.com/with1234init.also'
the pattern used is from the web URL version in the above link
Note the use of the r
prefix and the escaped '
quote towards the end in the first pattern.
using re.compile caches the regex pattern