
I'm trying to build a blog system, so I need to transform '\n' into <br /> and http://example.com into <a href='http://example.com'>http://example.com</a>.

The former is easy: just use the string replace() method.

The latter is more difficult, but I found a solution here: Find Hyperlinks in Text using Python (twitter related)

But now I need to implement an "Edit Article" function, so I have to reverse this transformation.

So, how can I transform <a href='http://example.com'>http://example.com</a> back into http://example.com?

Thanks! And I'm sorry for my poor English.

尤川豪

2 Answers


Sounds like the wrong approach. Making round-trips work correctly is always challenging. Instead, store the source text only, and only format it as HTML when you need to display it. That way, alternate output formats / views (RSS, summaries, etc) are easier to create, too.
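A minimal sketch of that idea, assuming the article's plain-text source is what gets stored and HTML is produced only when the page is displayed. The names render_html and URL_RE are hypothetical, and the URL pattern is a deliberately simple stand-in, not the one from the linked question:

```python
import html
import re

# Simplified URL pattern (an assumption for illustration): http(s) URLs
# running up to the next whitespace, angle bracket, or quote character.
URL_RE = re.compile(r"""https?://[^\s<>"']+""")

def render_html(source):
    """Format stored plain text as HTML for display; the source is never modified."""
    # Escape &, < and > first so user text cannot inject markup.
    escaped = html.escape(source, quote=False)
    # Wrap each URL in an anchor whose text is the URL itself.
    linked = URL_RE.sub(
        lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), escaped
    )
    # Convert newlines last so the URL match still stops at line ends.
    return linked.replace("\n", "<br />")

article = "See http://example.com\nbye"
print(render_html(article))
```

With this arrangement, an "Edit Article" form would simply load the stored source text back into the editor, so no HTML-to-text conversion is needed.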

Separately, we wonder whether this particular wheel needs to be reinvented again ...

tripleee
  • +1. Do not attempt to round-trip between plain text and HTML; that's a nightmare. I assume that the asker is doing this as a learning exercise, rather than because he needs a blog system, and reinventing wheels is often a good learning exercise. – Tom Anderson Aug 06 '11 at 08:25
  • You guys gave me really good feedback. I shouldn't attempt the round trip. Now I know I should solve this problem in a different way. Thanks! – 尤川豪 Aug 09 '11 at 11:23

Since you are using the answer from that other question, your links will always be in the same format, so it should be pretty easy with a regex. I don't know Python, but going by the answer from the other question:

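import re

myString = 'This is my tweet check it out <a href="http://tinyurl.com/blah">http://tinyurl.com/blah</a>'

# Group 1 captures the href URL; the link text is matched but discarded.
# Excluding '"' and '<' keeps one match from spanning two adjacent links.
r = re.compile(r'<a href="(http://[^" ]+)">http://[^< ]+</a>')
print(r.sub(r'\1', myString))  # print() is a function in Python 3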

Should work.
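Note that the question writes its links with single quotes around the href, which a pattern expecting double quotes would not match. A hedged sketch that accepts either quote style and only unwraps anchors whose link text is identical to the href (unlink and ANCHOR_RE are hypothetical names, and the sketch assumes the links were generated in exactly this form):

```python
import re

# Group 1: the quote character; group 2: the URL.
# \1 requires the same closing quote; \2 requires the link text to equal the href.
ANCHOR_RE = re.compile(r'''<a href=(['"])(https?://[^'"<>\s]+)\1>\2</a>''')

def unlink(text):
    """Replace each generated anchor with its bare URL; leave other markup alone."""
    return ANCHOR_RE.sub(r'\2', text)

print(unlink("<a href='http://example.com'>http://example.com</a>"))
```

Anchors whose visible text differs from the href are left untouched, since turning them into a bare URL would lose information.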

Paul