I am using a program that receives weather information from Yahoo services, in this specific case weather information for Lisbon (Portugal).

It is imperative that my program receives a set of numbers. However, I am receiving the content in HTML:

<![CDATA[
<img src="http://l.yimg.com/a/i/us/we/52/30.gif"/><br />
<b>Current Conditions:</b><br />
Partly Cloudy, 8 C<BR />
<BR /><b>Forecast:</b><BR />
Wed - Sunny. High: 14 Low: 6<br />
Thu - Sunny. High: 12 Low: 8<br />
Fri - AM Showers. High: 14 Low: 6<br />
Sat - Sunny. High: 15 Low: 7<br />
Sun - Sunny. High: 12 Low: 7<br />
<br />
<a href="http://us.rd.yahoo.com/dailynews/rss/weather/Lisbon__PT/*http://weather.yahoo.com/forecast/POXX0016_c.html">Full Forecast at Yahoo! Weather</a><BR/><BR/>
(provided by <a href="http://www.weather.com" >The Weather Channel</a>)<br/>
]]>

Therefore I have the following questions:

  1. Is there a regular expression that selects only the temperature numbers from a line such as Wed - Sunny. High: 14 Low: 6<br />?
  2. If 1 cannot be done, are regular expressions just not powerful enough for this type of work?
  3. If they are not, is there a regular expression that gives me all the numbers in the file? (The numbers are all I care about.)

Thanks in advance, Pedro.

Flame_Phoenix
  • If you use .NET, use [HtmlAgilityPack](http://www.nuget.org/packages/HtmlAgilityPack) (or some other html parser). Forget about regular expressions on HTML. – Alex Filipovici Nov 27 '13 at 12:40
  • If you haven't read [this answer](http://stackoverflow.com/a/1732454/785745), please do. And then get a proper parser. – Kendall Frey Nov 27 '13 at 12:44
  • The numbers are in the tags right below the description. It's easier to extract them from there with an XML parser. – JJJ Nov 27 '13 at 12:45
  • Also, you should mention which programming language you're using. – JJJ Nov 27 '13 at 12:46
  • You can request the API to provide the response in different format by changing the accept header to `application/xml` or something. Tag with a programming language for further help. – Mat J Nov 27 '13 at 12:54
  • I can't use parsers; that is my big problem. I am literally forced to use a regex due to project constraints, which is why I posted this here. – Flame_Phoenix Nov 27 '13 at 14:01
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Jeremy J Starcher Mar 07 '14 at 18:10

1 Answer

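Groups 1 and 2 from this regex contain the two numbers for Wednesday (multiline mode must be on so that ^ matches at the start of each line, not only at the start of the input):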

^Wed.*?High: (\d+) Low: (\d+)

See a live demo of this regex working with your example.
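
Since the question does not name a programming language, here is a minimal sketch in Python of how the regex above could be applied. The sample text is an excerpt of the feed content from the question. The function names and the second pattern, which also captures the day name, are assumptions added for illustration and are not part of the original answer.

```python
import re

# Excerpt of the feed description quoted in the question.
SAMPLE = """\
<b>Current Conditions:</b><br />
Partly Cloudy, 8 C<BR />
<BR /><b>Forecast:</b><BR />
Wed - Sunny. High: 14 Low: 6<br />
Thu - Sunny. High: 12 Low: 8<br />
Fri - AM Showers. High: 14 Low: 6<br />
Sat - Sunny. High: 15 Low: 7<br />
Sun - Sunny. High: 12 Low: 7<br />
"""

# The regex from the answer. re.MULTILINE makes ^ match at the start of
# every line rather than only at the start of the whole string.
WED = re.compile(r"^Wed.*?High: (\d+) Low: (\d+)", re.MULTILINE)

# Hypothetical generalisation (not from the answer): capture the
# three-letter day name too, so one pattern covers every forecast line.
ANY_DAY = re.compile(r"^(\w{3}) - .*?High: (\d+) Low: (\d+)", re.MULTILINE)


def wednesday_temps(text):
    """Return (high, low) for Wednesday as ints, or None if no line matches."""
    match = WED.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def all_forecasts(text):
    """Return {day: (high, low)} for every forecast line found."""
    return {
        day: (int(high), int(low))
        for day, high, low in ANY_DAY.findall(text)
    }


print(wednesday_temps(SAMPLE))
print(all_forecasts(SAMPLE))
```

Anchoring on the "High:" and "Low:" labels keeps unrelated digits, such as those in the image URL or the location code in the forecast link, out of the result. A bare \d+ over the whole response would return those as well.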

Bohemian