1571

This was bugging me over the weekend: What is a good way to solve those Where's Waldo? ['Wally' outside of North America] puzzles, using Mathematica (image-processing and other functionality)?

Here is what I have so far, a function which reduces the visual complexity a little bit by dimming some of the non-red colors:

whereIsWaldo[url_] := Module[{waldo, waldo2, waldoMask},
    waldo = Import[url];
    (* black out every pixel that is not strongly red; turn strongly red pixels white *)
    waldo2 = Image[ImageData[waldo] /. {
        {r_, g_, b_} /; Not[r > .7 && g < .3 && b < .3] :> {0, 0, 0},
        {r_, g_, b_} /; (r > .7 && g < .3 && b < .3) :> {1, 1, 1}}];
    waldoMask = Closing[waldo2, 4];
    ImageCompose[waldo, {waldoMask, .5}]
]
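
For readers without Mathematica, here is a minimal plain-Python sketch of the same dimming idea. It assumes the image is a nested list of (r, g, b) tuples with values in [0, 1]; the function names are made up for illustration, and the morphological `Closing` step is left out, so this is only the threshold-and-blend part.

```python
def is_waldo_red(pixel):
    """Same threshold as the Mathematica rule: strong red, little green or blue."""
    r, g, b = pixel
    return r > 0.7 and g < 0.3 and b < 0.3


def dim_non_red(image, alpha=0.5):
    """Blend each pixel with a white (red pixel) or black (other pixel) mask."""
    result = []
    for row in image:
        out_row = []
        for pixel in row:
            mask = 1.0 if is_waldo_red(pixel) else 0.0
            # alpha plays the role of the .5 opacity passed to ImageCompose
            out_row.append(tuple((1 - alpha) * c + alpha * mask for c in pixel))
        result.append(out_row)
    return result


# One strongly red pixel and one bluish pixel.
dimmed = dim_non_red([[(0.9, 0.1, 0.1), (0.2, 0.6, 0.8)]])
```

The red pixel is lightened towards white and the bluish one is darkened towards black, which is what makes the red regions stand out.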

And an example of a URL where this 'works':

whereIsWaldo["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"]

(Waldo is by the cash register):

Original image

Mathematica graphic

Boann
Arnoud Buzing
  • 1
    Check out this Meta post: http://meta.stackexchange.com/questions/116401/stack-overflow-mentioned-on-nprs-wait-wait-dont-tell-me-and-ny-times – Bill the Lizard Dec 18 '11 at 15:06
  • 8
    As a PhD student in computer vision I am sooo tempted to give this a shot... but I must resist. For what it's worth I'd go for Histogram of Oriented Gradients + sliding window SVM, as in [this](http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf) very influential work (warning: pdf). – dimatura Dec 19 '11 at 03:48
  • 2
    Can we change the question to support other languages as well? I thought about doing it with Matlab – Andrey Rubshtein Jan 09 '12 at 13:51
  • 2
    @ArnoudBuzing: In your question, you could find Waldo by looking at the selection which has the most white in it. :/ – Tamara Wijsman Mar 22 '12 at 13:17
  • 1
    Just FYI Waldo can't be seen in your image due to compression and low resolution. – alecail Jul 01 '13 at 12:24
  • See http://www.smbc-comics.com/?id=3222 (in the Saturday Morning Breakfast Cereal webcomic). In that strip, the where is Waldo/Wally puzzle is solved by "facial recognition software". – b_jonas Jan 08 '16 at 11:30

5 Answers

1655

I've found Waldo!

waldo had been found

How I've done it

First, I'm filtering out all colours that aren't red

waldo = Import["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"];
red = Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[waldo];

Next, I'm calculating the correlation of this image with a simple black and white pattern to find the red and white transitions in the shirt.

corr = ImageCorrelate[red, 
   Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]], 
   NormalizedSquaredEuclideanDistance];

I use Binarize to pick out the pixels in the image with a sufficiently high correlation, and Dilation to draw a white circle around them for emphasis.

pos = Dilation[ColorNegate[Binarize[corr, .12]], DiskMatrix[30]];

I had to play around a little with the level. If the level is too high, too many false positives are picked out.

Finally I'm combining this result with the original image to get the result above

found = ImageMultiply[waldo, ImageAdd[ColorConvert[pos, "GrayLevel"], .5]]
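
For anyone who wants to follow the steps without Mathematica, below is a rough plain-Python sketch of the same pipeline on a tiny synthetic image. It is not a port of the code above: it assumes the distance is the mean-centred, normalized squared Euclidean distance, it reports the top-left corner of each matching window instead of a padded correlation image, and it skips the dilation step. All function names are hypothetical.

```python
def red_only(image):
    """Subtract green and blue from red, clipping at zero."""
    return [[max(0.0, r - g - b) for (r, g, b) in row] for row in image]


def normalized_sq_distance(u, v):
    """Mean-centred, normalized squared Euclidean distance (assumed definition)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [x - mean_u for x in u]
    cv = [x - mean_v for x in v]
    denom = sum(x * x for x in cu) + sum(x * x for x in cv)
    if denom == 0:
        return 0.0  # both patches are flat
    return 0.5 * sum((a - b) ** 2 for a, b in zip(cu, cv)) / denom


# Two bright rows above two dark rows, like the 4x4 kernel in the answer.
STRIPE_KERNEL = [[1.0] * 4] * 2 + [[0.0] * 4] * 2


def stripe_matches(red, kernel=STRIPE_KERNEL, level=0.12):
    """Return the (row, col) corner of every window closer to the kernel than level."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [x for row in kernel for x in row]
    matches = []
    for top in range(len(red) - kh + 1):
        for left in range(len(red[0]) - kw + 1):
            window = [red[top + i][left + j] for i in range(kh) for j in range(kw)]
            if normalized_sq_distance(window, flat_kernel) < level:
                matches.append((top, left))
    return matches


# Synthetic 8x8 white scene with one red band (rows 2-3) above white (rows 4-5).
WHITE, RED = (1.0, 1.0, 1.0), (1.0, 0.0, 0.0)
scene = [[WHITE] * 8 for _ in range(8)]
for row in (2, 3):
    for col in range(2, 6):
        scene[row][col] = RED

matches = stripe_matches(red_only(scene))
```

The exactly aligned window at (2, 2) is among the matches. As with the original, the `level` threshold decides how many near misses are reported alongside it.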
Heike
  • 6
    Have you considered using Waldo as the template instead of red and white stripes? Since many times, the red-white stripes are used to confuse the search :) – mevatron Dec 12 '11 at 19:48
  • 1
    @mevatron I'm sure that there are more sophisticated ways to find him like using his head. The black and white pattern was the simplest I could think of and it worked straight away so I didn't try any others. – Heike Dec 12 '11 at 19:50
  • 1
    @Heike: If you don't mind, I posted it in the "What is in your Mathematica tool bag?" question (http://stackoverflow.com/a/8480026/312124) – Mike Bailey Dec 12 '11 at 19:52
  • 53
    @MikeBantegui While Heike's solution is great, I wouldn't be so quick to package it into a `WhereIsWaldo` function, as it is not a general solution. Heike herself has pointed out that the levels need to be played around with before you can get a positive. To see what I mean, try your packaged function as is on `"http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/AtTheBeach.jpg"` It's harder with this one. – abcd Dec 12 '11 at 19:57
  • @yoda: Damn, you're right. Probably would help to take into account his general features. – Mike Bailey Dec 12 '11 at 19:59
  • 18
    This image is trickier: [Waldo](http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/TheGobblingGluttons.jpg). I think though, that having something that can highlight potential Waldos is still useful (for some definition of 'useful'.) (This reminds me of some of the things iPhoto will sometimes identify as a face in our photo collection...) – Brett Champion Dec 12 '11 at 20:21
  • 2
    @Brett I shudder imagining those things... ;-) – Sjoerd C. de Vries Dec 17 '11 at 00:21
  • 34
    Please see this Meta post: http://meta.stackexchange.com/questions/116401/stack-overflow-mentioned-on-nprs-wait-wait-dont-tell-me-and-ny-times – Bill the Lizard Dec 18 '11 at 15:06
  • 156
    You seem to have misunderstood the rules of Where's Waldo. This is *clearly* cheating. – Stefan Kendall Dec 19 '11 at 00:37
  • 2
    And then you see your "algorithmic solving" hell: http://trezoid.com/gubbins/WallyHell.jpg (The real one is actually bottom right in that image) – Trezoid Dec 19 '11 at 01:25
  • @Trezoid what's the difference? – Ricardo Tomasi Dec 19 '11 at 01:30
  • @RicardoTomasi The fact that all those methods will essentially just circle the whole image, not finding anything useful... – Trezoid Dec 19 '11 at 01:57
  • 1
    How fast is Mathematica for this? Compared to something you would write on your own from scratch. – Rob Fox Dec 19 '11 at 07:16
  • 1
    That's a pretty simple image because I can only notice two other people wearing white-red striped shirts. Awesome none the less! – Tudor Dec 19 '11 at 09:37
  • 3
    I don't have a Mathematica license and 300 euros just to play a bit with the image processing toolkit seems to be a bit expensive.. maybe I'll try later using opencv. – Nils Dec 19 '11 at 09:40
  • 1
    @Trezoid I mean what's the difference from Waldo to the other Waldos in that image? – Ricardo Tomasi Dec 19 '11 at 11:16
  • 91
    While this is a nice hack, it just doesn't work. It requires manual tuning and only works on one image. I don't understand why this is upvoted and even chosen as an answer. It discourages anyone else from even trying to answer with better working methods. – sam hocevar Dec 19 '11 at 16:18
  • 1
    Truly inspiring, But the Only problem is you need to find where waldo is and then adjust levels to circle it !!! – Prashant Bhate Jan 02 '12 at 14:17
  • 6
    Hm . . . that's not Waldo. Waldo is dressed as a vacuum salesman (you have to look closely, but that's him). – iND Jan 10 '12 at 08:24
  • 16
    As a Waldo, myself, I approve of this answer – gWaldo Mar 21 '12 at 15:59
  • 1
    Thought this would be of interest: http://articles.nydailynews.com/2012-03-24/news/31235128_1_waldo-books-algorithm-programmer – Daniel Lichtblau Mar 30 '12 at 18:00
  • Wouldn't the correlation be improved (and be more general) if it was done using a picture with red and white stripes like a cut-out of Waldo's shirt or a cut-out of a typical image of Waldo? Unfortunately I do not know Mathematica so I can not try this out – Yet Another Geek Apr 01 '12 at 14:38
  • 1
    Damn it, I thought we had a while before the computers beat us in an eye test – James Oct 31 '12 at 11:09
147

My guess at a "bulletproof way to do this" (think CIA finding Waldo in any satellite image any time, not just a single image without competing elements, like striped shirts)... I would train a Boltzmann machine on many images of Waldo - all variations of him sitting, standing, occluded, etc.; shirt, hat, camera, and all the works. You don't need a large corpus of Waldos (maybe 3-5 will be enough), but the more the better.

This will assign clouds of probabilities to the various elements occurring in the correct arrangement. Then establish (via segmentation) what an average object size is, and fragment the source image into cells of objects that most resemble individual people (considering possible occlusions and pose changes). Since Waldo pictures usually include a LOT of people at about the same scale, this should be a very easy task. Then feed these segments to the pre-trained Boltzmann machine. It will give you the probability of each one being Waldo. Take the one with the highest probability.

This is how OCR, ZIP code readers, and strokeless handwriting recognition work today. Basically, you know the answer is there and you know more or less what it should look like. Everything else may share common elements, but is definitely "not it", so you don't bother with the "not it"s; you just look at the likelihood of "it" among all the possible "it"s you've seen before. (For ZIP codes, for example, you'd train one BM for just 1s, one for just 2s, one for just 3s, etc., then feed each digit to each machine and pick the one with the most confidence.) This works a lot better than a single neural network learning the features of all numbers.
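
The "one machine per class, pick the most confident" step can be sketched in a few lines of Python. This only shows the selection logic: `template_scorer` is a hypothetical stand-in for a trained model (it just counts matching cells) and is in no way a Boltzmann machine.

```python
def pick_most_confident(sample, scorers):
    """Score one sample with every per-class scorer; return the best (label, score)."""
    return max(((label, score(sample)) for label, score in scorers.items()),
               key=lambda pair: pair[1])


def template_scorer(template):
    """Hypothetical stand-in for a trained model: fraction of matching cells."""
    def score(sample):
        hits = sum(1 for a, b in zip(sample, template) if a == b)
        return hits / len(template)
    return score


# Toy 3x3 binary glyphs, flattened row by row.
scorers = {
    "1": template_scorer([0, 1, 0, 0, 1, 0, 0, 1, 0]),
    "7": template_scorer([1, 1, 1, 0, 0, 1, 0, 0, 1]),
}

noisy_one = [0, 1, 0, 0, 1, 0, 0, 1, 1]  # a "1" with one flipped cell
label, confidence = pick_most_confident(noisy_one, scorers)
```

For Waldo the same `max` would run the other way round: one scorer, many image segments, keep the segment with the highest score.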

Peter Mortensen
Gregory Klopper
  • 14
    Aren't just plain neural networks enough for that? Besides, the wikipedia article claims that Boltzmann machines are not practical. – GClaramunt Dec 19 '11 at 14:44
  • 2
    Without trying I'm not sure, but if large enough and complex enough a neural network ought to be sufficient for ANYTHING. Especially with recurrencies. Boltzmann machines do VERY VERY VERY well for recognizing a fairly simplistic set of data with high amount of noise in a sea of data unlike itself. – Gregory Klopper Dec 20 '11 at 05:00
  • 14
    ZIP codes are read with Boltzmann machines all the time, and accuracy of mail delivery has gone through the roof. – Gregory Klopper Dec 20 '11 at 05:00
47

I agree with @GregoryKlopper that the right way to solve the general problem of finding Waldo (or any object of interest) in an arbitrary image would be to train a supervised machine learning classifier. Using many positive and negative labeled examples, an algorithm such as Support Vector Machine, Boosted Decision Stump or Boltzmann Machine could likely be trained to achieve high accuracy on this problem. Mathematica even includes these algorithms in its Machine Learning Framework.

The two challenges with training a Waldo classifier would be:

  1. Determining the right image feature transform. This is where @Heike's answer would be useful: a red filter and a striped pattern detector (e.g., wavelet or DCT decomposition) would be a good way to turn raw pixels into a format that the classification algorithm could learn from. A block-based decomposition that assesses all subsections of the image would also be required ... but this is made easier by the fact that Waldo is a) always roughly the same size and b) always present exactly once in each image.
  2. Obtaining enough training examples. SVMs work best with at least 100 examples of each class. Commercial applications of boosting (e.g., the face-focusing in digital cameras) are trained on millions of positive and negative examples.
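
As a rough illustration of the block-based decomposition in point 1, here is a plain-Python sketch that slides a fixed-size window over the image, turns each window into two crude features, and keeps the single best-scoring window (Waldo appears exactly once). The two features, the red threshold borrowed from the question, and the hand-written `score` function are all assumptions; a trained classifier would take the place of `score`.

```python
def window_features(image, top, left, size):
    """Return (red_fraction, stripe_transitions) for one square window."""
    red = [[1 if (r > 0.7 and g < 0.3 and b < 0.3) else 0
            for (r, g, b) in row[left:left + size]]
           for row in image[top:top + size]]
    red_fraction = sum(map(sum, red)) / (size * size)
    # Count vertical red/non-red flips, a crude proxy for horizontal stripes.
    transitions = sum(red[i][j] != red[i + 1][j]
                      for i in range(size - 1) for j in range(size))
    return red_fraction, transitions


def best_window(image, size, score):
    """Evaluate every window position and return (score, top, left) of the best one."""
    best = None
    for top in range(len(image) - size + 1):
        for left in range(len(image[0]) - size + 1):
            s = score(window_features(image, top, left, size))
            if best is None or s > best[0]:
                best = (s, top, left)
    return best


# Synthetic 7x8 white scene with red stripes on rows 1, 3 and 5, columns 2-6.
WHITE, RED = (1.0, 1.0, 1.0), (1.0, 0.0, 0.0)
scene = [[WHITE] * 8 for _ in range(7)]
for row in (1, 3, 5):
    for col in range(2, 7):
        scene[row][col] = RED

# Hypothetical scoring rule standing in for a trained classifier.
best = best_window(scene, 5, lambda f: f[0] * f[1])
```

Because Waldo is roughly the same size everywhere in a scene, a single window size is enough here; a general detector would also have to search over scales.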

A quick Google image search turns up some good data -- I'm going to have a go at collecting some training examples and coding this up right now!

However, even a machine learning approach (or the rule-based approach suggested by @iND) will struggle for an image like the Land of Waldos!

lubar
  • A machine learning-based computer vision system that tries to solve the "Where's Waldo" problem in the real world (i.e., finding a particular person in crowd photos on Flickr) was presented at Computer Vision and Pattern Recognition conference last year. They cheat a little though by adding some 3D location info by using multiple photos of the same scene. – lubar Apr 01 '12 at 01:24
41

I don't know Mathematica . . . too bad. But I like the answer above, for the most part.

Still, there is a major flaw in relying on the stripes alone to glean the answer (I personally don't have a problem with one manual adjustment). There is an example (listed by Brett Champion, here) which shows that the illustrators, at times, break up the shirt pattern. So then it becomes a more complex pattern.

I would try an approach of shape id and colors, along with spatial relations. Much like face recognition, you could look for geometric patterns at certain ratios from each other. The caveat is that usually one or more of those shapes is occluded.

Get a white balance on the image, and read a red balance from the image. I believe Waldo is always the same value/hue, but the image may be from a scan, or a bad copy. Then always refer to an array of the colors that Waldo actually is: red, white, dark brown, blue, peach, {shoe color}.

There is a shirt pattern, and also the pants, glasses, hair, face, shoes and hat that define Waldo. Also, relative to other people in the image, Waldo is on the skinny side.

So, find random people to obtain the height of people in this pic. Measure the average height of a bunch of things at random points in the image (a simple outline will produce quite a few individual people). If the things are not within some standard deviation of each other, they are ignored for now. Compare the average height to the image's height. If the ratio is too great (e.g., 1:2, 1:4, or similarly close), then try again. Run it 10(?) times to make sure that the samples are all pretty close together, excluding any average that is outside some standard deviation. Possible in Mathematica?
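
A minimal Python sketch of that size estimate, assuming the outline step has already produced a list of object heights in pixels. The function names, the one-standard-deviation cut-off and the 1:4 limit are illustrative choices, not tested values.

```python
from statistics import mean, pstdev


def typical_height(heights, max_sigma=1.0):
    """Average the measured heights after discarding outliers."""
    centre, spread = mean(heights), pstdev(heights)
    kept = [h for h in heights if abs(h - centre) <= max_sigma * spread]
    return mean(kept) if kept else centre


def plausible_scale(person_height, image_height, max_ratio=0.25):
    """Reject estimates that are too large relative to the image (1:4 or closer)."""
    return person_height / image_height < max_ratio


# Five people of similar size and one large non-person object (say, a tent).
heights = [40, 42, 38, 41, 39, 120]
waldo_size = typical_height(heights)
```

Here the 120-pixel object is dropped as an outlier before averaging. Repeating the measurement on several random samples, as described above, would amount to calling `typical_height` on each sample and comparing the results.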

This is your Waldo size. Waldo is skinny, so you are looking for something 5:1 or 6:1 (or whatever) ht:wd. However, this is not sufficient. If Waldo is partially hidden, the height could change. So, you are looking for a block of red-white that is ~2:1. But there have to be more indicators.

  1. Waldo has glasses. Search for two circles 0.5:1 above the red-white.
  2. Blue pants. Any amount of blue at the same width within any distance between the end of the red-white and the distance to his feet. Note that he wears his shirt short, so the feet are not too close.
  3. The hat. Red-white any distance up to twice the top of his head. Note that it must have dark hair below, and probably glasses.
  4. Long sleeves. red-white at some angle from the main red-white.
  5. Dark hair.
  6. Shoe color. I don't know the color.

Any of those could apply. These are also negative checks against similar people in the pic -- e.g., #2 negates wearing a red-white apron (too close to shoes), #5 eliminates light colored hair. Also, shape is only one indicator for each of these tests . . . color alone within the specified distance can give good results.

This will narrow down the areas to process.

Storing these results will produce a set of areas that should have Waldo in it. Exclude all other areas (e.g., for each area, select a circle twice as big as the average person size), and then run the process that @Heike laid out with removing all but red, and so on.

Any thoughts on how to code this?


Edit:

Thoughts on how to code this . . . exclude all areas but Waldo red, skeletonize the red areas, and prune them down to a single point. Do the same for Waldo hair brown, Waldo pants blue, Waldo shoe color. For Waldo skin color, exclude, then find the outline.

Next, exclude non-red, dilate (a lot) all the red areas, then skeletonize and prune. This part will give a list of possible Waldo center points. This will be the marker to compare all other Waldo color sections to.

From here, using the skeletonized red areas (not the dilated ones), count the lines in each area. If there is the correct number (four, right?), this is certainly a possible area. If not, I guess just exclude it (as being a Waldo center . . . it may still be his hat).

Then check if there is a face shape above, a hair point above, pants point below, shoe points below, and so on.
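
A rough Python sketch of that last check, assuming each colour mask has already been reduced to a list of (row, col) points with rows growing downwards. `centroid` is a simple stand-in for "skeletonize and prune down to a single point", and the tolerances are hypothetical.

```python
def centroid(points):
    """Collapse a blob (list of (row, col) pixels) to a single point."""
    rows = [p[0] for p in points]
    cols = [p[1] for p in points]
    return (sum(rows) / len(rows), sum(cols) / len(cols))


def has_point(center, candidates, direction, max_rows, max_cols):
    """True if a candidate lies above (direction=-1) or below (+1) the center."""
    row, col = center
    return any(0 < (r - row) * direction <= max_rows and abs(c - col) <= max_cols
               for r, c in candidates)


def waldo_centers(red, hair, pants, person_height):
    """Keep red centers that have a hair point above and a pants point below."""
    half_width = person_height / 4  # hypothetical horizontal tolerance
    return [c for c in red
            if has_point(c, hair, -1, person_height, half_width)
            and has_point(c, pants, +1, person_height, half_width)]


red = [(50, 20), (50, 80)]     # two red shirt candidates
hair = [(40, 21)]              # dark hair point above the first one
pants = [(62, 19), (45, 80)]   # blue below the first, blue above the second
candidates = waldo_centers(red, hair, pants, person_height=30)
```

Only the first red center survives, because the second has no hair point above it and its blue point is on the wrong side. Face outline and shoe checks would be further `has_point` calls of the same shape.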

No code yet -- still reading the docs.

iND
  • 9
    Perhaps you can show a proof of concept in whichever system/language you are familiar with. This will also give you a feel for where difficulties might come in. – Szabolcs Jan 11 '12 at 08:55
  • 1
    Oh, I'm just enjoying the challenge as it stands. It gives me something to do in between walks on the beach and dressing for dinner. – iND Jan 11 '12 at 18:09
  • 1
    So. . . why the downvotes? How is this different than the other speculative answer here? Is this a suggestion that this question should be taken more seriously? Or just that I should seem more serious in my investigation? Is my approach actually wrong? – iND Jan 12 '12 at 05:54
  • 3
    I did not downvote you and I do not think downvotes are appropriate for honest attempts to answer (unless they give misinformation). The most probable reason for the downvotes is that you did not seem to have tried out the (quite complicated sounding) approach, and finding a good solution would probably take a good amount of practical experimentation and ruling out many ideas. The other speculative answer suggests a *general* method (as a starting point) that has been used in the past for similar problems, and there's a good amount of literature on it. Just trying to explain what happened. – Szabolcs Jan 12 '12 at 10:49
  • Thanks for the explanation. I guess I am not focusing on the history of the ideas. – iND Jan 12 '12 at 18:15
  • your idea fails if waldo is doing a handstand. – Jason Mar 22 '12 at 23:59
  • Not really. . . "above" is a direction, so the position/spacing should determine which direction the rest are expected to be in. Probably. – iND Mar 29 '12 at 04:34
7

I have a quick solution for finding Waldo using OpenCV.

I used the template matching function available in OpenCV to find Waldo.

To do this a template is needed. So I cropped Waldo from the original image and used it as a template.

Cropped Waldo template

Next I called the cv2.matchTemplate() function with the normalized correlation coefficient as the matching method. It returned a high probability at a single region, shown in white below (somewhere in the top left region):

Template matching result

The position of the most probable region was found using the cv2.minMaxLoc() function, which I then used to draw the rectangle highlighting Waldo:

Waldo highlighted with a rectangle
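
To show what such a template match computes, here is a small pure-Python sketch of zero-mean normalized cross-correlation on a grayscale grid. It is not OpenCV's implementation, and the helper names are made up; it mirrors what cv2.matchTemplate() with cv2.TM_CCOEFF_NORMED (presumably the method used above) followed by cv2.minMaxLoc() does, and it is only practical for tiny inputs.

```python
from math import sqrt


def ncc(window, template):
    """Zero-mean normalized cross-correlation of two equal-length lists."""
    mean_w = sum(window) / len(window)
    mean_t = sum(template) / len(template)
    cw = [x - mean_w for x in window]
    ct = [x - mean_t for x in template]
    denom = sqrt(sum(x * x for x in cw) * sum(x * x for x in ct))
    return sum(a * b for a, b in zip(cw, ct)) / denom if denom else 0.0


def match_template(image, template):
    """Return (best_score, (row, col)) for the best-matching top-left corner."""
    th, tw = len(template), len(template[0])
    flat_t = [x for row in template for x in row]
    best = (-2.0, (0, 0))  # scores lie in [-1, 1], so any window beats this
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            window = [image[top + i][left + j] for i in range(th) for j in range(tw)]
            score = ncc(window, flat_t)
            if score > best[0]:
                best = (score, (top, left))
    return best


# Tiny grayscale scene; the template is cropped from rows 1-2, columns 2-4.
image = [
    [0.1, 0.2, 0.1, 0.3, 0.2, 0.1],
    [0.2, 0.1, 0.9, 0.1, 0.8, 0.2],
    [0.1, 0.3, 0.2, 0.7, 0.1, 0.3],
    [0.3, 0.2, 0.1, 0.2, 0.3, 0.1],
]
template = [row[2:5] for row in image[1:3]]
score, location = match_template(image, template)
```

Because the template was cut out of the same image, the best location is exactly where it was cropped, with a score of essentially 1. That is also why the approach does not carry over to a scene the template was not taken from.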

Jeru Luke
  • 9
    Trying to tackle SO's most famous image processing questions ? ;) Your solution is nice and easy but a/ only works for this specific image and b/ needs the exact image of Waldo you want to find beforehand, while I think the question was about finding any Waldo in any "Where's Waldo image" like you would play the normal game : without knowing what he looks like beforehand. This question is a lot of fun anyhow – Soltius Apr 11 '17 at 11:16
  • 1
    @Solitus ha exactly !!! I worked it only for this image in particular. Working it for different images would be a challenge though !! – Jeru Luke Apr 11 '17 at 11:24