9

This is my first post on StackOverflow, so apologies if it's lacking the right information.

Scenario.

I'm in the process of moving away from the Google Weather API to the BOM (Australia) weather service. I've managed to get the weather data from BOM just fine using StreamReaders etc., but what I'm stuck on is the image icon that matches the daily forecast.

What I did with the old Google Weather API was quite brutal, yet it did the trick. The Google Weather API only returned a handful of different forecast types, which I could squash into a string and in turn use in an image URL.

Example of what I did with the Google Weather API...

imageDay1.ImageUrl = "images/weather/" + lbWeatherDay1Cond.Text.Replace(" ", string.Empty) + ".png";

"Mostly sunny" = mostlysunny.png

"Sunny" = sunny.png

"Chance of Rain" = chanceofrain.png

"Showers" = showers.png

"Partly cloudy" = partlycloudy.png

There were only about 15 possible options for the daily forecast.

The problem I have now with BOM (the Australian weather service) is this...

Possible morning shower

Shower or two, clearing later

So many thousands more.... there is no standard.

What I'm hoping is that some of the great minds on here can suggest a way to build the image name from a keyword within the forecast string. Something like "Showers" for "Showers.png", or something a little more complex that recognises "Chance of Showers" as "Chanceshowers.jpg" while keeping "Shower or two" as "Showers.png".

I'm open to any ideas or solutions (hopefully in C#), as long as they're very lightweight (the process has to be repeated for the 5-day forecast) and can capture almost any scenario...

At this point in time, I'm carrying on with one String.Replace after another. It will do for now, but I can't roll it into production like this.

Cheers all!

Trent

Trent Steenholdt
  • 3
    Did you try a **look-up table** (you'll take the first match) where each key is a regular expression? If there is no standard you can't rely on a well-defined algorithm. A cute (and more robust) option may be a **Bayesian** algorithm (if you really can't manage this any other way). – Adriano Repetti Sep 20 '12 at 11:38
  • 1
    Adriano, thanks for the suggestion; however, I'm not the greatest coder in the world and would have no idea how to even begin with a look-up table or a Bayesian algorithm. If you can possibly point me to some MSDN articles etc., I'm sure I could learn it pretty quickly though :). Thanks for the help! – Trent Steenholdt Sep 20 '12 at 11:44
  • @Trent You will need to implement a `Dictionary` where the key string is a **Regular Expression** and the value string is the name of the corresponding image. You will test your input against all the Regular Expressions in this dictionary and for the first one that matches, you will get the corresponding image value. Regex: http://msdn.microsoft.com/en-us/library/ms228595(v=vs.80).aspx – Rotem Sep 20 '12 at 11:50
  • 3
    @Rotem, a dictionary is unordered, so "first one that matches" would be non-deterministic if a string matched more than one regex. It'd be better to use an ordered list, e.g. `List<Tuple<string, string>>`. – Joe White Sep 20 '12 at 11:51
  • 1
    Take a look at [this post on SO](http://stackoverflow.com/questions/3724472/looking-for-open-source-naive-bayesian-classifier-in-c-sharp-for-a-twitter-senti) for Bayesian (do not forget you do not need something very good). A look-up table is just a... list; take a look at any example of the Regex class in .NET. – Adriano Repetti Sep 20 '12 at 11:52
  • @JoeWhite - Is this on the right track? `List<Tuple<string, string>> weatherCollection = new List<Tuple<string, string>>(); weatherCollection.Add(Tuple.Create("(.*)showers(.*)", "Showers.png"));` – Trent Steenholdt Sep 20 '12 at 12:06
  • @TrentSteenholdt, you don't need the parentheses (unless you care about later finding out what was before and after the match), and you don't need the `.*` at the beginning and end anyway, since regexes look for substring matches instead of whole-string matches. But yes, you're on the right track. – Joe White Sep 20 '12 at 12:33
  • You might think about this slightly differently. How many possible images are you going to show? Based on those images, you should be able to categorize specific keywords in order to filter down the appropriate image. – NotMe Sep 24 '12 at 20:23
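
The ordered look-up table suggested in the comments above could be sketched roughly as follows. This is only a minimal illustration: the class name `WeatherIcons`, the method `IconFor`, the patterns, the choice to treat "possible" like "chance of", and the file names (including `Default.png`) are all assumptions made for the example, not anything BOM defines. The one point taken from the discussion is that the list is ordered and the first match wins, so specific phrases must be listed before general ones.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WeatherIcons
{
    // Ordered rules: the first pattern that matches wins, so the more
    // specific phrases come before the general ones.
    // Patterns and image names are illustrative assumptions only.
    private static readonly List<Tuple<Regex, string>> Rules = new List<Tuple<Regex, string>>
    {
        Tuple.Create(new Regex(@"\b(chance of|possible)\b.*\bshowers?\b", RegexOptions.IgnoreCase), "Chanceshowers.png"),
        Tuple.Create(new Regex(@"\bshowers?\b", RegexOptions.IgnoreCase), "Showers.png"),
        Tuple.Create(new Regex(@"\bmostly sunny\b", RegexOptions.IgnoreCase), "Mostlysunny.png"),
        Tuple.Create(new Regex(@"\bpartly cloudy\b", RegexOptions.IgnoreCase), "Partlycloudy.png"),
        Tuple.Create(new Regex(@"\bsunny\b", RegexOptions.IgnoreCase), "Sunny.png")
    };

    // Returns the image name of the first matching rule,
    // or a hypothetical fallback image when nothing matches.
    public static string IconFor(string forecast)
    {
        foreach (var rule in Rules)
        {
            if (rule.Item1.IsMatch(forecast))
            {
                return rule.Item2;
            }
        }
        return "Default.png";
    }
}
```

With this in place, the original one-liner would become something like `imageDay1.ImageUrl = "images/weather/" + WeatherIcons.IconFor(lbWeatherDay1Cond.Text);`. Because the `Regex` objects are built once and reused, running it for each day of a 5-day forecast should stay cheap.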

3 Answers

3

I noticed in the comments you're trying out the regex lookup table, which just might do well enough to solve the problem. However, I'm going to expand on what Adriano mentioned about a more robust Bayesian solution.

This is a problem that's related to machine learning and AI. It involves some Natural Language Processing, like how Google tries to interpret what users ask it, or how mail spam filters work.

A simple and interesting system is described by Sebastian Thrun in the following videos, which were part of an online course. It begins by describing a basic method by which an algorithm can learn to classify a collection of words (such as from an email) as "Spam" or "Not Spam".

(Most of the videos are really short.)

  1. Spam Detection - Quiz Answer
  2. Probability of Spam - Quiz Answer
  3. Maximum Likelihood - Quiz Answer
  4. Relationship to Bayes Networks - Quiz Answer
  5. Classification Quiz - Quiz Answer
  6. Classification 2 Quiz - Quiz Answer
  7. Classification 3 Quiz, a contrived example
  8. Quiz Answer & Laplace Smoothing - Quiz Answer
  9. Smoothed Classification Quiz - Quiz Answer
  10. Final Quiz - Quiz Answer

This Bayesian method is robust against dynamic input and is reasonably quick at learning. Then, after consuming enough training data, you would only need to save a lookup table of probabilities and do a series of arithmetic computations at runtime.

With this foundation, you could apply the same method to work for multiple classifications, e.g. one for each weather image.
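
To make the idea concrete, here is a minimal sketch of a multi-class naive Bayes classifier with Laplace smoothing (k = 1), where each label is an image name. It is not taken from the course material. The class name `NaiveBayesIconClassifier`, the crude tokenizer, and the decision to ignore words never seen in training are all assumptions made for illustration, and you would have to supply your own hand-labelled forecasts as training data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class NaiveBayesIconClassifier
{
    // label -> (word -> number of times the word appeared under that label)
    private readonly Dictionary<string, Dictionary<string, int>> wordCounts =
        new Dictionary<string, Dictionary<string, int>>();
    // label -> number of training forecasts with that label
    private readonly Dictionary<string, int> docCounts = new Dictionary<string, int>();
    private readonly HashSet<string> vocabulary = new HashSet<string>();
    private int totalDocs;

    // Very simple tokenizer (an assumption): lower-case, split on spaces and punctuation.
    private static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ', ',', '.', ';' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public void Train(string forecast, string label)
    {
        if (!wordCounts.ContainsKey(label))
        {
            wordCounts[label] = new Dictionary<string, int>();
            docCounts[label] = 0;
        }
        docCounts[label]++;
        totalDocs++;
        foreach (var word in Tokenize(forecast))
        {
            int n;
            wordCounts[label].TryGetValue(word, out n); // n stays 0 if the word is new
            wordCounts[label][word] = n + 1;
            vocabulary.Add(word);
        }
    }

    // Returns the label with the highest log-probability, or null if untrained.
    public string Classify(string forecast)
    {
        string best = null;
        double bestScore = double.NegativeInfinity;
        foreach (var label in docCounts.Keys)
        {
            var counts = wordCounts[label];
            int totalWords = counts.Values.Sum();
            // Smoothed prior: (docs with label + 1) / (all docs + number of labels)
            double score = Math.Log((docCounts[label] + 1.0) / (totalDocs + docCounts.Count));
            foreach (var word in Tokenize(forecast))
            {
                if (!vocabulary.Contains(word)) continue; // skip words never seen in training
                int n;
                counts.TryGetValue(word, out n);
                // Smoothed likelihood: (count + 1) / (words under label + vocabulary size)
                score += Math.Log((n + 1.0) / (totalWords + vocabulary.Count));
            }
            if (score > bestScore)
            {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

Usage would be a series of calls such as `classifier.Train("Shower or two", "Showers.png")` over your labelled examples, followed by `classifier.Classify(forecastText)` at runtime. Summing logarithms instead of multiplying raw probabilities avoids numeric underflow on longer forecasts.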

Kache
1

If you're already capturing the webpage, couldn't you just capture the segment where they put the picture in and get the image that way? If there's plaintext of "partly sunny", you could just capture that division as well and just use your own pictures. A Bayesian net just to scrape weather sounds incredibly painful.

user1701047
-3
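// Forecast text from the API, and a fallback image for unlisted strings
$api_string = "Mostly sunny";
$image = "default.png";

// Exact-match lookup: only the strings listed here get their own icon
switch ($api_string)
{
    case "Mostly sunny":
        $image = "mostlysunny.png";
        break;
    case "showers":
        $image = "showers.png";
        break;
}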

etc

Ben