Python RegEx for this HTML String

Question

I've got a string that looks like this:

<span class=\"market_listing_price market_listing_price_with_fee\">\r
\t\t\t\t\t&#36;92.53 USD\t\t\t\t<\/span>

I need to find this string via RegEx. My attempt:

(^<span class=\\"market_listing_price market_listing_price_with_fee\\">\\r\\t\\t\\t\\t\\t&)

My problem is that the number of "\t" and "\r" sequences may vary. Also, this regular expression only covers part of the string, not all of it.

So, what's the correct and full RegEx for this string?

Don't do that. Don't parse html with regex. Try some html parsers like beautifulsoup — nu11p01n73R, Jun 09 '15 at 13:53

score 0 · Answer 1 · edited May 23 '17 at 10:26


Since this is an HTML string, I would suggest using an HTML Parser like BeautifulSoup.

Here is an example approach finding the element by class attribute value using a CSS selector:

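from bs4 import BeautifulSoup

data = "my HTML data"  # placeholder for the HTML string

# Name the parser explicitly; without it bs4 picks one itself and warns
soup = BeautifulSoup(data, "html.parser")
# select() returns a list of every <span> that carries both classes
result = soup.select("span.market_listing_price.market_listing_price_with_fee")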
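If installing a third-party package is not an option, the same parser-based idea can be sketched with the standard library's html.parser instead of BeautifulSoup. This is only an illustration under two assumptions: the string in the question is a JSON-escaped fragment (suggested by the \" and <\/span> escapes), and the hypothetical PriceParser class below is not part of any library.

```python
import json
from html.parser import HTMLParser

# Assumption: the snippet from the question is JSON-escaped, so it is decoded
# first to turn \" \r \t and \/ into the real characters.
raw = r'"<span class=\"market_listing_price market_listing_price_with_fee\">\r\t\t\t\t\t&#36;92.53 USD\t\t\t\t<\/span>"'
html_text = json.loads(raw)


class PriceParser(HTMLParser):
    """Hypothetical helper: collects the text of spans carrying both classes."""

    TARGET = {"market_listing_price", "market_listing_price_with_fee"}

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes &#36; to $
        self.depth = 0      # greater than 0 while inside a matching span
        self.chunks = []
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = set((dict(attrs).get("class") or "").split())
        # Count nested spans too, so the matching close tag is found correctly
        if self.depth or self.TARGET <= classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Strip the surrounding \r and \t padding from the collected text
                self.prices.append("".join(self.chunks).strip())
                self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


parser = PriceParser()
parser.feed(html_text)
parser.close()
prices = parser.prices
```

Because the whitespace is stripped after parsing, the number of \r and \t characters no longer matters.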

See also:

RegEx match open tags except XHTML self-contained tags

answered Jun 09 '15 at 13:53 by alecxe · edited May 23 '17 at 10:26 by Community

Thank you! I'll use this function in my next project! – Redfox Jun 09 '15 at 20:16

score 0 · Accepted Answer · answered Jun 09 '15 at 14:03


Answering your question about the Regex:

"market_listing_price market_listing_price_with_fee\\">[\\r]*[\\t]*&

This will match the string you need, even if you add more \t's or \r's. If you need to edit this regex, I advise you to visit this website to test and modify it. It will also help you understand how regular expressions work and build your own complete RegEx.
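As a rough illustration of the same idea in Python, the sketch below assumes the text still contains literal backslash escapes exactly as shown in the question. The text variable, the extra tolerance for \n and real whitespace, and the capture group for the amount are illustrative additions, not part of the pattern above.

```python
import re

# The escaped text as shown in the question (the backslashes are literal characters)
text = r'<span class=\"market_listing_price market_listing_price_with_fee\">\r\t\t\t\t\t&#36;92.53 USD\t\t\t\t<\/span>'

pattern = re.compile(
    r'market_listing_price market_listing_price_with_fee\\">'  # class value, then literal \">
    r'(?:\\[rnt]|\s)*'            # any run of literal \r, \n, \t escapes or real whitespace
    r'&#36;(\d+(?:\.\d+)?) USD'   # dollar-sign entity, then the amount (captured)
)

match = pattern.search(text)
amount = match.group(1) if match else None
```

Using a raw string (r'...') for both the text and the pattern keeps the backslashes readable; without it each one would have to be doubled again for Python.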

answered Jun 09 '15 at 14:03 by BlackM

Thank you! This was exactly what I've been searching for. – Redfox Jun 09 '15 at 20:16
