How would you go about finding the average inter-cluster distance using dbscan in NetLogo, assuming you have the following code (courtesy of @Nicolas-Payette):

extensions [ dbscan ]

to setup
  clear-all
  ask patches [ set pcolor white ]
  create-turtles 1000 [
    set color black
    set label-color blue
    setxy random-xcor random-ycor
  ]
  ask n-of 5 turtles [
    ask turtles in-radius 3 [
      set color one-of [red grey]
    ]
  ]
end

to find-clusters
  let red-grey-turtles turtles with [ member? color [red grey] ]
  let clusters dbscan:cluster-by-location red-grey-turtles 3 3
  (foreach clusters range length clusters [ [c i] ->
    foreach c [ t ->
      ask t [ set label i ]
    ]
  ])
end

Let us take the endpoints of the distances being measured to be the centers of the clusters of turtles.

nigus21

1 Answer

This depends on how you define inter-cluster distance. There are a number of ways to do it.

Let us take the endpoints of the distances being measured to be the centers of the clusters of turtles.

While this is a good technique for, say, K-means clustering, it doesn't work as well for DBSCAN because the clusters can be concave. Thus, the center may lie outside of the cluster! Regardless, I'll include it as an option.

First, let's define our distance measure:

Mean distance between points in clusters:

to-report cluster-distance [ cluster1 cluster2 ]
  report mean [ mean [ distance myself ] of cluster2 ] of cluster1
end

Minimum distance between points in clusters:

to-report cluster-distance [ cluster1 cluster2 ]
  report min [ min [ distance myself ] of cluster2 ] of cluster1
end

Distance between centroids

Assuming world wrapping is off:

to-report cluster-distance [ cluster1 cluster2 ]
  let x1 mean [ xcor ] of cluster1
  let y1 mean [ ycor ] of cluster1
  let x2 mean [ xcor ] of cluster2
  let y2 mean [ ycor ] of cluster2
  report sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2)
end

If world wrapping is on:

; This is super complicated because, with wrapping on, xcor and ycor are
; more like angles rather than cartesian coordinates. So, this converts
; them to angles, gets the mean of those angles, and converts them back.
; Related SO question: https://stackoverflow.com/questions/24786908/get-mean-heading-of-neighboring-turtles
to-report xcor-mean [ xcors ]
  let angles map [ x -> 360 * (x - (min-pxcor - 0.5)) / world-width ] xcors
  let mean-x mean map cos angles
  let mean-y mean map sin angles
  report (atan mean-y mean-x) / 360 * world-width + (min-pxcor - 0.5)
end

to-report ycor-mean [ ycors ]
  let angles map [ y -> 360 * (y - (min-pycor - 0.5)) / world-height ] ycors
  let mean-x mean map cos angles
  let mean-y mean map sin angles
  report (atan mean-y mean-x) / 360 * world-height + (min-pycor - 0.5)
end

to-report cluster-distance [ cluster1 cluster2 ]
  let x1 xcor-mean [ xcor ] of cluster1
  let y1 ycor-mean [ ycor ] of cluster1
  let x2 xcor-mean [ xcor ] of cluster2
  let y2 ycor-mean [ ycor ] of cluster2
  report sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2)
end
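
One caveat: the reporter above wraps the centroids but still measures a straight-line distance between them, so two clusters sitting on opposite edges of a wrapping world will look far apart. If that matters for your model, a rough sketch of a wrapped version could look like the following. It assumes the world wraps both horizontally and vertically, reuses `xcor-mean` and `ycor-mean` from above, and the name `wrapped-centroid-distance` is just a placeholder of mine:

```netlogo
; Hypothetical variant: centroid distance that also wraps around the world.
; Assumes horizontal AND vertical wrapping are both on.
to-report wrapped-centroid-distance [ cluster1 cluster2 ]
  let x1 xcor-mean [ xcor ] of cluster1
  let y1 ycor-mean [ ycor ] of cluster1
  let x2 xcor-mean [ xcor ] of cluster2
  let y2 ycor-mean [ ycor ] of cluster2
  let x-gap abs (x1 - x2)
  let y-gap abs (y1 - y2)
  ; Going the other way around the torus may be shorter.
  if x-gap > world-width / 2 [ set x-gap world-width - x-gap ]
  if y-gap > world-height / 2 [ set y-gap world-height - y-gap ]
  report sqrt (x-gap ^ 2 + y-gap ^ 2)
end
```

If the world wraps in only one direction, drop the corresponding `if` line and use a plain `mean` for the coordinate that does not wrap.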

Averaging the distances

Once we have a distance measure, getting the average distance is relatively simple using `map`. Note that the `remove` below is necessary because we don't want to include a cluster's distance to itself in the mean. Note also that this code is a little inefficient in that it calculates every distance twice (once in each direction), but that also greatly simplifies it:

...
; This line is modified so that we get a list of turtle sets rather than
; a list of lists. 
let clusters map turtle-set dbscan:cluster-by-location red-grey-turtles 3 3
let avg-distance mean map [ c1 ->
  mean map [ c2 ->
    cluster-distance c1 c2
  ] remove c1 clusters ; Get distance to all other clusters but c1
] clusters
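
If computing every distance twice becomes a problem, one possible alternative is to visit each unordered pair of clusters only once. The following is a sketch, not a drop-in replacement: `mean-pairwise-distance` is a hypothetical name, and it assumes `clusters` is a list of turtle sets with at least two entries and that `cluster-distance` is symmetric (true for all of the measures above):

```netlogo
; Hypothetical helper: averages cluster-distance over each unordered
; pair of clusters exactly once.
; Assumes `clusters` is a list of turtle sets with at least two entries.
to-report mean-pairwise-distance [ clusters ]
  let distances []
  foreach range length clusters [ i ->
    foreach range i [ j ->   ; j < i, so each pair is visited once
      set distances lput (cluster-distance (item i clusters) (item j clusters)) distances
    ]
  ]
  report mean distances
end
```

Because every cluster has the same number of partners, the mean over unordered pairs equals the mean of the per-cluster means computed above. Swapping the final `mean` for `max` or `min` would give the largest or smallest inter-cluster distance instead.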
Bryan Head
  • Will the max and min (even the average) functions work for more than two clusters distributed in a map? – nigus21 Oct 11 '17 at 17:37
  • Not entirely sure what you're asking. Each of the distance functions measures the distance between two clusters. The averaging code then takes the distances between each pair of clusters. So, unless I'm misunderstanding, yes. – Bryan Head Oct 12 '17 at 01:53
  • Thanks for the brilliant answer, very thorough and helpful. The averaging response is exactly what I needed help understanding. I was wondering if this code only finds the average distance between two cluster centroids, or if it can find the average distance between multiple clusters on the netlogo interface as I hoped? – nigus21 Oct 14 '17 at 17:46
  • How do you find the max and min distance between more than two clusters iteratively? – nigus21 Nov 03 '17 at 15:27
  • The `avg-distance` part is the average distance between *all* clusters, not two. To find max and min, just change the `mean`s in that code to either `max` or `min`. – Bryan Head Nov 05 '17 at 02:45
  • When I implement the code for the distance between cluster centroids with world wrapping on, the reporter you graciously gave me: `to-report cluster-distance [ cluster1 cluster2 ] let x1 xcor-mean [ xcor ] of cluster1 let y1 ycor-mean [ ycor ] of cluster1 let x2 xcor-mean [ xcor ] of cluster2 let y2 ycor-mean [ ycor ] of cluster2 report sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2) end` gives me the following error: `OF expected input to be a turtle agentset or turtle but got the list [(turtle 86) (turtle 243)] instead.` Any thoughts on what the issue might be and how to fix it? – nigus21 Nov 20 '17 at 22:49
  • The distance procedures expect the clusters to be agentsets, but it looks like the dbscan extension returns lists. Just do `turtle-set cluster` to convert `cluster` from a list to an agentset. – Bryan Head Nov 22 '17 at 18:48
  • So where would I put `turtle-set cluster`? Would it be when I call the `cluster-distance` reporter? – nigus21 Nov 22 '17 at 18:54
  • You could, but I would put it before that. The idea is the dbscan extension is giving you a list of lists of turtles. But you want a list of turtlesets. So you could do, for instance, `let clusters map turtle-set dbscan:cluster-by-location red-grey-turtles 3 3`. – Bryan Head Nov 22 '17 at 19:01
  • Thanks Bryan. Given how my code is structured, I would like to use `turtle-set` together with `cluster-distance`. If I want to do this, do I include it inside the reporter, or where the reporter is called in the code? What might the first option you suggested look like? Would an example of using it in the reporter be `let x1 xcor-mean [ xcor ] of turtle-set cluster 1 ]`? – nigus21 Nov 22 '17 at 19:42