Since you would like to apply this to a large number of images, and you already suggested it, let's discuss how to solve this problem by selecting different tiles.
The first step could be to define what "similar" means, so a similarity metric is needed. You already mentioned the tiles' histograms as one basis for such a metric, but there are many other candidates, for example:
- mean intensity,
- 90th percentile of intensity,
- 10th percentile of intensity,
- mode of intensity, as in peak of the histogram,
- variance of pixel intensity in the whole tile,
- granularity, which you could quickly approximate by the difference between the raw and the Gaussian-filtered image, or by calculating the average variance in small sub-tiles.
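As a rough sketch of how the metrics listed above could be computed for one tile, assuming NumPy and a single-channel tile given as a 2-D array: the function name `tile_features`, the sub-tile size `sub` and the histogram bin count `bins` are my own choices, not anything prescribed. Granularity is approximated here by the average variance in small sub-tiles, and the tile is assumed to be at least `sub` x `sub` pixels.

```python
import numpy as np

def tile_features(tile, sub=8, bins=64):
    """Return six summary statistics for one single-channel tile (2-D array)."""
    tile = np.asarray(tile, dtype=float)

    # Mode: centre of the tallest histogram bin (depends on the chosen bin count).
    counts, edges = np.histogram(tile, bins=bins)
    peak = np.argmax(counts)
    mode = 0.5 * (edges[peak] + edges[peak + 1])

    # Granularity: mean variance over non-overlapping sub x sub blocks.
    # Rows and columns that do not fill a whole block are cropped.
    h = (tile.shape[0] // sub) * sub
    w = (tile.shape[1] // sub) * sub
    blocks = tile[:h, :w].reshape(h // sub, sub, w // sub, sub)
    granularity = blocks.var(axis=(1, 3)).mean()

    return np.array([
        tile.mean(),
        np.percentile(tile, 90),
        np.percentile(tile, 10),
        mode,
        tile.var(),
        granularity,
    ])
```

Calling this once per tile and per channel, and stacking the results, gives one row of metric components per tile.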
If your image has two channels, the above list already gives you 12 metric components (6 metrics × 2 channels). Moreover, there are characteristics that you can obtain from the combination of channels, for example the correlation of pixel intensities between channels. With two channels that's only one characteristic, but with three channels it's already three.
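The between-channel correlation could look something like this, assuming NumPy and a tile stored as an array of shape (channels, height, width). That layout and the name `channel_correlations` are assumptions for illustration. Note that a channel with constant intensity has no defined correlation, so `np.corrcoef` returns NaN for it.

```python
import numpy as np
from itertools import combinations

def channel_correlations(tile):
    """Pearson correlation of pixel intensities for every pair of channels.

    tile: array of shape (channels, height, width).
    Returns a dict mapping (i, j) channel index pairs to r.
    """
    tile = np.asarray(tile, dtype=float)
    flat = tile.reshape(tile.shape[0], -1)  # one row of pixels per channel
    r = np.corrcoef(flat)                   # channels x channels matrix
    return {(i, j): r[i, j] for i, j in combinations(range(flat.shape[0]), 2)}
```

Two channels yield one pair and three channels yield three, matching the counts above.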
To pick different tiles from this high-dimensional cloud, you could consider that some if not many of these metrics will be correlated, so a principal component analysis (PCA) would be a good first step. http://en.wikipedia.org/wiki/Principal_component_analysis
Then, depending on how many sample tiles you would like to choose, you could look at the projection. For seven tiles, for example, I would look at the first three principal components, and choose from the two extremes of each, and then also pick the one tile closest to the center (3 * 2 + 1 = 7).
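A minimal sketch of that selection, assuming NumPy and a feature matrix with one row per tile and one column per metric component. PCA is done here through an SVD of the standardised data. Standardising first is my own assumption, so that metrics with large units do not dominate, and `pick_tiles` is a hypothetical name.

```python
import numpy as np

def pick_tiles(features, n_components=3):
    """Return indices of 2 * n_components + 1 tiles.

    features: array of shape (n_tiles, n_metrics).
    """
    X = np.asarray(features, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # leave constant metrics unscaled
    Z = (X - X.mean(axis=0)) / std           # standardise each metric

    # PCA via SVD: rows of Vt are the principal axes, ordered by variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T         # projection onto the first components

    picks = []
    for k in range(n_components):
        # The two extremes along component k.
        picks += [int(np.argmin(scores[:, k])), int(np.argmax(scores[:, k]))]
    # The tile closest to the center of the projected cloud.
    picks.append(int(np.argmin(np.linalg.norm(scores, axis=1))))
    return picks
```

The same tile can be extreme on more than one component, so the returned list may contain duplicates that you would want to replace with the next most extreme tile.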
If you are concerned that choosing from the very extremes of each principal component may not be robust, the 10th and 90th percentiles may be. Alternatively, you could use a clustering algorithm to find separated examples, but this would depend on what your cloud looks like. Good luck.
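For the percentile variant, one possible sketch, assuming NumPy and the scores of all tiles along a single principal component, is to take the tiles whose scores lie nearest the 10th and 90th percentiles. The name `pick_near_percentiles` is made up for illustration.

```python
import numpy as np

def pick_near_percentiles(scores, low=10, high=90):
    """Return indices of the tiles nearest the low and high percentiles.

    scores: 1-D array with one principal-component score per tile.
    """
    scores = np.asarray(scores, dtype=float)
    targets = np.percentile(scores, [low, high])
    # For each target value, take the tile whose score is closest to it.
    return [int(np.argmin(np.abs(scores - t))) for t in targets]
```

You would call this once per principal component in place of taking the minimum and maximum.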