I'm looking for a way to programmatically identify whether an image is likely to be a photograph rather than an illustration, logo, or diagram. The images are always JPEGs, so I can't use the format metadata on its own to differentiate. I've also looked at using the dimensions, but that hasn't helped in the scenario I'm working with, where the images all have a similar aspect ratio. They are also typically already stripped of camera metadata.
Specifically, I want a way to screen out the <10% of images I come across that are not photographs. The approach doesn't need to be foolproof: if it works ~9 times out of 10, that would be a significant improvement over doing nothing.
I don't mind what programming language or platform a solution uses. Ideally I'd use an existing high-level library, or an easily implementable low-level approach (i.e. as few LoC as possible) that could be replicated in multiple languages. I'd also appreciate being pointed at examples of open-source projects that do this, even if what they do is hacky.
I haven't had a great deal of luck searching for techniques for doing this. I note that a number of search engines offer this kind of filter in their image search, with varying degrees of success.
NB: I'm getting existing images from a range of sources. This is being done for R&D purposes and is in compliance with local copyright laws (before anyone asks).
If there aren't any libraries to do this, I might end up writing one (maybe estimating a probability from the number of unique colours, solid blocks of colour, etc.), but I'm hoping someone has already published something usable for this and I just haven't found it!
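To make the fallback idea concrete, here is a rough sketch of the kind of heuristic I have in mind. It is not taken from any existing library. The function names (`colour_stats`, `looks_like_graphic`), the 4-bit quantisation, and both thresholds are my own untested guesses and would need tuning against real images. Colours are quantised first because JPEG compression noise turns a "solid" colour into many near-identical ones.

```python
from collections import Counter


def colour_stats(pixels, width, bits=4, top_n=8):
    """Return (distinct colours, share of top_n colours, share of flat pairs).

    pixels: flat row-major sequence of (r, g, b) tuples with 0-255 channels.
    width:  image width in pixels, used to avoid comparing across row breaks.
    bits and top_n are illustrative defaults, not tuned values.
    """
    shift = 8 - bits
    # Coarsen each channel so JPEG noise collapses into the same bucket.
    quantised = [(r >> shift, g >> shift, b >> shift) for r, g, b in pixels]
    total = len(quantised)
    if total == 0:
        raise ValueError("image has no pixels")

    counts = Counter(quantised)
    # Share of the image covered by the few most common colours.
    top_share = sum(c for _, c in counts.most_common(top_n)) / total

    # Share of horizontally adjacent pixel pairs with the same quantised
    # colour, as a crude stand-in for "solid blocks of colour".
    same = pairs = 0
    for i in range(1, total):
        if i % width == 0:
            continue  # first pixel of a row has no left-hand neighbour
        pairs += 1
        if quantised[i] == quantised[i - 1]:
            same += 1
    flat_share = same / pairs if pairs else 1.0

    return len(counts), top_share, flat_share


def looks_like_graphic(pixels, width, top_share_min=0.8, flat_share_min=0.9):
    """Guess whether an image is an illustration/logo rather than a photo.

    Both thresholds are assumptions for illustration only.
    """
    _, top_share, flat_share = colour_stats(pixels, width)
    return top_share >= top_share_min and flat_share >= flat_share_min
```

Assuming Pillow is available, the pixels could come from something like `img = Image.open(path).convert("RGB")` followed by `looks_like_graphic(list(img.getdata()), img.width)`. Downscaling the image first would keep this cheap. I'd expect it to misfire on photos with large flat areas (e.g. studio shots on a plain background) and on illustrations with heavy gradients or texture.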