Imagine I have a dataset as follows:
[{"x":20, "y":50, "attributeA":90, "attributeB":3849},
{"x":34, "y":20, "attributeA":86, "attributeB":5000},
etc.
There could be many more attributes besides these; this is just an example. What I am wondering is how I can cluster these points on all of these factors while controlling, for each variable, the maximum separation allowed between two points for them to be considered linked (e.g. Euclidean distance within 10 points, attributeA within 5 points, and attributeB within 1000 points).
Any ideas on how to do this in Python? As implied above, I would like to use the Euclidean distance between the two points' (x, y) positions if possible, rather than comparing x and y as separate attributes. Each of the remaining attributes would be compared on its own, in one dimension, if that makes sense.
Edit: To add some clarity in case this doesn't make sense: I am looking for an algorithm that compares all objects with each other (or does something more efficient). If the Euclidean distance and every attribute difference between object A and object B are within the specified thresholds, the two are considered similar and linked. This continues until all the linked clusters can be returned. Clusters end up separated because no point in one cluster satisfies the conditions to be similar to any point in another.
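
To make the procedure concrete, here is a rough brute-force sketch of what I have in mind, using only the standard library. It compares every pair of points, links the pairs that pass all thresholds, and returns the connected groups via a small union-find. The names `MAX_DIST`, `MAX_DIFF`, `is_linked` and `cluster` are made up for illustration, the last two sample points are invented so that there is something to link, and I am assuming "within" means less than or equal to the threshold.

```python
import math
from itertools import combinations

# Thresholds from the example above (names are hypothetical).
MAX_DIST = 10                                      # Euclidean distance on (x, y)
MAX_DIFF = {"attributeA": 5, "attributeB": 1000}   # per-attribute limits


def is_linked(p, q):
    """Return True if p and q are within every threshold."""
    if math.hypot(p["x"] - q["x"], p["y"] - q["y"]) > MAX_DIST:
        return False
    return all(abs(p[k] - q[k]) <= limit for k, limit in MAX_DIFF.items())


def cluster(points):
    """Group point indices into connected components of the link relation."""
    parent = list(range(len(points)))  # union-find forest, one root per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Brute force: O(n^2) pairwise comparisons.
    for i, j in combinations(range(len(points)), 2):
        if is_linked(points[i], points[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


points = [
    {"x": 20, "y": 50, "attributeA": 90, "attributeB": 3849},
    {"x": 34, "y": 20, "attributeA": 86, "attributeB": 5000},
    {"x": 24, "y": 53, "attributeA": 88, "attributeB": 4200},  # invented point
    {"x": 30, "y": 58, "attributeA": 85, "attributeB": 4900},  # invented point
]
print(cluster(points))  # [[0, 2, 3], [1]]
```

Note that points 0 and 3 are not directly linked (their distance is about 12.8), but they end up in the same cluster because both are linked to point 2. That chaining is the behaviour I am after. What I don't know is whether there is an existing library routine or a more efficient approach than comparing every pair.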