Python Regex 'not' to identify pattern within

Question

I am trying to write a Python regex with a 'not' condition, so that a certain pattern is skipped when it occurs inside href tags.

My aim is to replace all occurrences of DSS[a-z]{2}[0-9]{2} with an href link as shown below, but without replacing the same pattern when it occurs inside href tags.

Present Regex:

replaced = re.sub("[^http://*/s](DSS[a-z]{2}[0-9]{2})", "<a href=\"http://test.com=\\1\">\\1</a>", input)

I need to add this new regex to the existing one using an OR operator.

EDIT:

I am trying to use a regex for just one simple operation: I want to replace occurrences of the pattern anywhere in the HTML, except when they occur within `<a>...</a>`.

possible duplicate of [Python Find & Replace Beautiful Soup](http://stackoverflow.com/questions/6674310/python-find-replace-beautiful-soup) — Jul 13 '11 at 15:37
What exactly are you trying to accomplish with `[^http://*/s]`? This isn't making any sense. — Tim Pietzcker, Jul 13 '11 at 15:41
I am trying not to match the pattern when it is inside a http:// link — c_prog_90, Jul 13 '11 at 15:54
@thinkcool: Regexes cannot reliably do this, even if you think it's a simple operation. People won't tell you how to do it with regex, because regex is not the right tool for the job. It gets asked again and again, which is why e-satis linked a standard answer. If you're handling HTML, use an HTML parser. — Thomas K, Jul 13 '11 at 16:53

score 3 · Accepted Answer · edited May 23 '17 at 10:34

The answer to any question having regexp and HTML in the same sentence is here.

In Python, the best HTML parser is indeed Beautiful Soup.
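To make the parser-based approach concrete, here is a minimal sketch. It is not Beautiful Soup: to stay free of third-party dependencies it swaps in the standard library's `html.parser`, which is enough to tell text nodes apart from tags and attributes. The names `Linker` and `link_codes_outside_anchors` are hypothetical, and the replacement URL is copied from the question. Treat it as an illustration under those assumptions rather than a complete HTML rewriter.

```python
import re
from html.parser import HTMLParser

CODE = re.compile(r"DSS[a-z]{2}[0-9]{2}")


def _to_link(match):
    # Build the link from the question, using the matched code twice.
    code = match.group(0)
    return '<a href="http://test.com=%s">%s</a>' % (code, code)


class Linker(HTMLParser):
    """Re-emit the input, linking codes only in text outside <a> elements."""

    def __init__(self):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # number of <a> elements currently open

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
        # Emit the tag exactly as it appeared, attributes untouched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.depth:
            self.depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Only text nodes reach this method, so href values are never seen here.
        if self.depth == 0:
            data = CODE.sub(_to_link, data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def link_codes_outside_anchors(html):
    parser = Linker()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = 'See DSSab12 and <a href="http://x/DSScd34">DSScd34</a>.'
print(link_codes_outside_anchors(sample))
```

In the sample, only the bare `DSSab12` is wrapped in a link; the code inside the existing `href` and inside the existing link text is left alone. Note that this sketch would also rewrite text inside `<script>` or `<style>` elements, which a real implementation would probably want to skip.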

If you want to persist with regexp, you can try a negative lookbehind to avoid anything preceded by a `"`. At your own risk.
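As a rough illustration of the negative lookbehind idea, a sketch might look like the following. The pattern and replacement URL come from the question; the function name `link_codes_lookbehind` and the sample string are made up for the example.

```python
import re

# (?<!") is a negative lookbehind: the match is rejected when the
# character immediately before it is a double quote.
PATTERN = re.compile(r'(?<!")(DSS[a-z]{2}[0-9]{2})')


def link_codes_lookbehind(html):
    # \1 re-inserts the matched code in both the URL and the link text.
    return PATTERN.sub(r'<a href="http://test.com=\1">\1</a>', html)


sample = 'See DSSab12 and <a href="DSScd34">this</a>.'
print(link_codes_lookbehind(sample))
```

Here `DSSab12` is linked while `DSScd34`, which directly follows the opening quote of the `href`, is skipped. The check only looks at one preceding character, so a code in the middle of a URL (such as `href="http://x/DSScd34"`) or in the text of an existing link (`>DSScd34</a>`) would still be rewritten. That fragility is presumably what "at your own risk" refers to.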

answered Jul 13 '11 at 15:42

Bite code

Well.... http://stackoverflow.com/questions/4231382/regular-expression-pattern-not-matching-anywhere-in-string/4234491#4234491 – Richard H Jul 13 '11 at 17:21
Lol. Funny, but let's save that for regex masters with very special edge cases. – Bite code Jul 13 '11 at 23:12
