2

In Python, if I have a URL, what is the simplest way to turn something like:

http://stackoverflow.com

into:

<a href="http://stackoverflow.com">http://stackoverflow.com</a>

So far I have tried a lot with regular expressions, but nothing has worked.

ChrisGPT was on strike
Progo
  • I am downvoting because the example information you provided does not match what you actually wanted for an answer. – SethMMorton Feb 21 '15 at 02:13
  • possible duplicate of [replace URLs in text with links to URLs](http://stackoverflow.com/questions/1727535/replace-urls-in-text-with-links-to-urls) – Waylan Feb 21 '15 at 03:42

5 Answers

3

You can use str.format:

>>> link = 'http://stackoverflow.com'
>>> print('<a href="{0}">{0}</a>'.format(link))
<a href="http://stackoverflow.com">http://stackoverflow.com</a>
>>>

Note, however, that you need to number the format fields because the same argument is used twice.
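To illustrate the point about numbering, here is a small sketch comparing numbered fields with two alternatives (named fields and an f-string, neither of which appears in the answer above), and showing what happens when the fields are left unnumbered. The variable names are only for illustration.

```python
link = 'http://stackoverflow.com'

# Numbered fields: {0} refers to the first positional argument both times.
numbered = '<a href="{0}">{0}</a>'.format(link)

# Named fields behave the same way and can be easier to read.
named = '<a href="{url}">{url}</a>'.format(url=link)

# An f-string (Python 3.6+) interpolates the variable directly.
fstring = f'<a href="{link}">{link}</a>'

# Unnumbered fields expect one argument per {}, so a single argument
# is not enough for two fields and str.format raises IndexError.
try:
    '<a href="{}">{}</a>'.format(link)
except IndexError as exc:
    error = exc
else:
    error = None

print(numbered)
```

All three working variants produce the same string; the unnumbered version fails because it asks for a second positional argument that was never passed.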

  • What if I have a string containing multiple urls and non-urls? – Progo Feb 21 '15 at 02:07
  • I'm not sure what you mean by that. Could you give an example input/output? –  Feb 21 '15 at 02:11
  • @Progo This answer meets the criteria as given in the original question. If your criteria is more complicated then you need to give that information as well. – SethMMorton Feb 21 '15 at 02:12
  • input: "Hey bob, go to http://stackexchange.com, they have cool sites like http://stackoverflow.com". The two links would be formatted. – Progo Feb 21 '15 at 02:12
  • @SethMMorton This is true. I will accept the answer. I have to wait 2 minutes though – Progo Feb 21 '15 at 02:13
  • @Progo - That's significantly different than what you said in your question. You might be able to use a Regex: `re.sub('(\w+\.com)', r'<a href="\1">\1</a>', data)` where `data` is your input string. Of course, this is a very simple solution. You will probably want to make the pattern a lot more robust. That, or look into libraries designed to work with HTML or URLs. –  Feb 21 '15 at 02:18
  • @iCodez Sorry... I didn't phrase the question as well as I would like to have. Your solution works very well. Thanks! – Progo Feb 21 '15 at 02:19
  • @iCodez this will not work with links that do not end with `.com` or that specify a sub resource: `www.stackoverflow.com/questions/28641210` EDIT: looks like i read over the second statement of your comment. – syntonym Feb 21 '15 at 02:27
  • not working for urls like `http://my?param1&param2` – em2er Mar 03 '21 at 16:27
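The comments above raise two cases the one-line `str.format` call does not cover: several URLs embedded in ordinary text, and URLs containing `&`. A minimal sketch that handles both is shown below. It assumes a URL is `http://` or `https://` followed by non-whitespace characters and that trailing punctuation such as a comma belongs to the sentence rather than the link. The names `URL_PATTERN`, `TRAILING_PUNCTUATION`, and `linkify_text` are hypothetical, and the pattern is deliberately simple rather than a complete URL grammar.

```python
import html
import re

# Assumption: only http:// and https:// URLs are recognised.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+')

# Assumption: these characters at the end of a match are sentence
# punctuation, not part of the URL.
TRAILING_PUNCTUATION = '.,;:!?)'


def linkify_text(text):
    """Wrap each http(s) URL found in text in an <a> tag."""
    def replace(match):
        url = match.group(0)
        stripped = url.rstrip(TRAILING_PUNCTUATION)
        trailing = url[len(stripped):]
        # Escape &, <, > and quotes so the URL is safe inside HTML.
        escaped = html.escape(stripped, quote=True)
        return '<a href="{0}">{0}</a>{1}'.format(escaped, trailing)

    return URL_PATTERN.sub(replace, text)


data = ("Hey bob, go to http://stackexchange.com, they have cool sites "
        "like http://stackoverflow.com")
print(linkify_text(data))
```

Only the matched URLs are escaped here; the surrounding text is passed through unchanged, so input that may contain other HTML-special characters would need separate handling.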
3

You could use regex.

>>> import re
>>> s = "http://stackoverflow.com www.foo.com"
>>> re.sub(r'\b((?:https?:\/\/)?(?:www\.)?(?:[^\s.]+\.)+\w{2,4})\b', r'<a href="\1">\1</a>', s)
'<a href="http://stackoverflow.com">http://stackoverflow.com</a> <a href="www.foo.com">www.foo.com</a>'
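One thing to watch in the output above: `href="www.foo.com"` has no scheme, so a browser would treat it as a relative path. A possible variation, sketched here under the assumption that bare hosts should default to `http://`, passes a function to `re.sub` instead of a replacement string. The names `URL_RE` and `to_anchor` are made up for this example; the pattern is the one from the answer with the unnecessary `\/` escapes removed.

```python
import re

URL_RE = re.compile(r'\b((?:https?://)?(?:www\.)?(?:[^\s.]+\.)+\w{2,4})\b')


def to_anchor(match):
    """Build the <a> tag, adding a scheme to the href when it is missing."""
    text = match.group(1)
    # Assumption: links without a scheme should point at http://.
    if text.startswith(('http://', 'https://')):
        href = text
    else:
        href = 'http://' + text
    return '<a href="{0}">{1}</a>'.format(href, text)


s = "http://stackoverflow.com www.foo.com"
result = URL_RE.sub(to_anchor, s)
print(result)
```

The visible link text stays exactly as it appeared in the input; only the `href` value changes.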
Avinash Raj
  • Thanks for the answer. It works, but iCodez deserves the accepted answer. Thanks though. – Progo Feb 21 '15 at 02:21
  • seems like you're trying to search and replace links in an input string within a tag. – Avinash Raj Feb 21 '15 at 02:23
  • Correct. As I said, your answer works, but since I didn't phrase my original question very well and iCodez answered the original question **and** answered this one in a comment, he deserves the accept. +1 though. – Progo Feb 21 '15 at 02:25
1

You could use a library to convert the URL to an HTML tag, e.g. bleach:

from bleach import linkify
url = 'https://stackoverflow.com?with=param1&other=param2'
atag = linkify(url)
print(atag)

This will output:

<a href="https://stackoverflow.com?with=param1&amp;other=param2" rel="nofollow">https://stackoverflow.com?with=param1&amp;other=param2</a>
em2er
0
url = 'http://stackoverflow.com'
reference = '<a href="' + url + '">' + url + '</a>'
user14241
0

Assuming you already have that string in a variable, or can get it, I'd suggest you use str.format.

link = '<a href="{0}">{0}</a>'
stackOverflowLink = link.format("http://stackoverflow.com")

stackOverflowLink will then contain
<a href="http://stackoverflow.com">http://stackoverflow.com</a>

David