I'm using the following regex, which I found in this SO answer:
(?:[\w[a-z]-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.-]+[.??][a-z]{2,4}/)(?:[^\s()<>]+|(([^\s()<>]+|(([^\s()<>]+)))))+(?:(([^\s()<>]+|(([^\s()<>]+))))|[^\s`!()[]{};:'".,<>?«»“”‘’])
I am testing it on the following string:
"Quattro Amici in Concert Mar. 3, 2014. Long-time collaborators Lun Jiang, violin; Roberta Zalkind, viola; Pegsoon Whang, cello; and Karlyn Bond, piano, will perform works by Franz Joseph Haydn, Wolfgang Amadeus Mozart, Ludwig van Beethoven and Gabriel Faure. To purchase tickets visit westminstercollege.edu/culturalevents or call 801-832-2457. - See more at: http://entertainment.sltrib.com/events/view/quattro_amici_in_concert#sthash.QRsLXXiA.dpuf"
I'm simply trying to extract URLs from strings, and based on a number of SO answers, regex seems to be the recommended tool for that job. I'm not a regex expert (or even intermediate in my understanding), so I'm baffled by the empty strings that re.findall() keeps returning. I've stepped through the regex using RegexBuddy and still had no luck. Any help would be hugely appreciated.
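A minimal sketch of how re.findall() can end up returning empty strings, assuming a much simpler stand-in pattern rather than the regex above (the pattern, the shortened test string, and the variable names are all illustrative assumptions). When a pattern contains capturing groups, re.findall() returns the groups instead of the whole match, so a group that did not take part in a match appears as an empty string. This may or may not be the full story for the regex in question.

```python
import re

# Shortened version of the test string, keeping both URLs.
text = (
    "visit westminstercollege.edu/culturalevents or see "
    "http://entertainment.sltrib.com/events/view/"
    "quattro_amici_in_concert#sthash.QRsLXXiA.dpuf"
)

# Simplified stand-in pattern (an assumption, not the regex above).
# The optional scheme (https?://)? is a capturing group.
capturing = r"(https?://)?[\w.-]+\.[a-z]{2,4}/\S*"

# With a capturing group present, findall returns the group contents,
# so a URL without a scheme yields an empty string.
groups_only = re.findall(capturing, text)

# The same pattern with a non-capturing group (?:...) returns whole matches.
non_capturing = r"(?:https?://)?[\w.-]+\.[a-z]{2,4}/\S*"
whole_matches = re.findall(non_capturing, text)

# finditer with group(0) gives the whole match regardless of groups.
via_finditer = [m.group(0) for m in re.finditer(capturing, text)]

print(groups_only)
print(whole_matches)
print(via_finditer)
```

If this is what is happening, either converting every group to the non-capturing form or switching to re.finditer() with group(0) should return the full URLs.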