I am trying to figure out a way of getting Sikuli's image recognition to use within C#. I don't want to use Sikuli itself because its scripting language is a little slow, and because I really don't want to introduce a java bridge in the middle of my .NET C# app.
So, I have a bitmap which represents an area of my screen (I will call this region BUTTON1). The screen layout may have changed slightly, or the screen may have been moved on the desktop -- so I can't use a direct position. I have to first find where the current position of BUTTON1 is within the live screen. (I tried to post pictures of this, but I guess I can't because I am a new user... I hope the description makes it clear...)
I think that Sikuli is using OpenCV under the covers. Since it is open source, I guess I could reverse engineer it, and figure out how to do what they are doing in OpenCV, implementing it in Emgu.CV instead -- but my Java isn't very strong.
I looked for examples showing this, but all of the examples are either extremely simple (ie, how to recognize a stop sign) or very complex (ie how to do facial recognition)... and maybe I am just dense, but I can't seem to make the jump in logic of how to do this.
Also I worry that all of the various image manipulation routines are actually processor intensive, and I really want this as lightweight as possible (in reality I might have lots of buttons and fields I am trying to find on a screen...)
So, the way I am thinking about doing this instead is:
A) Convert the bitmaps to byte arrays and do brute force search. (I know how to do that part). And then
B) Use the byte array position that I found to calculate its screen position (I'm really not completely sure how I do this) instead of using the image processing stuff.
Is that completely crazy? Does anyone have a simple example of how one could use Aforge.Net or Emgu.CV to do this? (Or how to flesh out step B above...?)
Thanks!