The way I understand your question, you want to find boxes that enclose clouds of data points.
You define your granularity criterion as the number of boxes used to describe your data set.
I think what you are looking for is agglomerative hierarchical clustering. The algorithm is quite straight forward. Let n be the number of data points you have in the set. Basically, the algorithm starts by considering n groups, each one being populated by a single point. Then, it is an iterative process :
- Merge the two closest groups according to a distance criterion
- Since the groups set has changed, update the distances between the groups
- Back to the merge step until either you reached a specific number of clusters or a specific distance threshold
You can also build the dendogram. It is a tree-like structure that will store the history of all the merging process, allowing you to retrieve any level of granularity between 1 cluster and n clusters.
There is a set of functions in Scipy that are dedicated to this algorithm. It is covered by the question Tutorial for scipy.cluster.hierarchy.
Getting the clusters is the first step, now you can build your boxes. Lets cover this in a so-called mathematical point of view. Let C be a cluster and P1, ... Pn the points of the cluster. If a rectangular box is fine, then it can be defined by the two points of coordinates (xmin, ymin) and (xmax, ymax), with :
- xmin = min (P.x P ∈ C)
- ymin = min (P.y P ∈ C )
- xmax = max (P.x P ∈ C )
- xmax = max (P.y P ∈ C )
EDIT :
This way of building the boxes is the dumbest possible. If you want something that really fits, you'll have to look on building the convex hull of each cluster.