I had a set of coordinates and plotted them as a scatter graph. I then applied DBSCAN so that points that are close together form a cluster (note: the black dots are classified as noise).

[Image: scatter plot of the clustered points]

Now I want to find the centroid of each cluster (by summing the x and y coordinates separately and averaging them). I am able to obtain the coordinates grouped into their respective clusters, and they look like this:

[[10. 75.]
 [11. 74.]
 [11. 75.]
 [12. 73.]
 [12. 74.]
 [12. 75.]]
[[34. 49.]
 [34. 50.]
 [35. 48.]
 [35. 49.]
 [35. 50.]
 [36. 48.]
 [36. 49.]
 [36. 50.]]
[[43. 78.]
 [43. 79.]
 [43. 80.]
 [43. 81.]
 [44. 78.]
 [44. 79.]
 [44. 80.]
 [44. 81.]
 [45. 78.]
 [45. 79.]
 [45. 80.]
 [45. 81.]
 [46. 78.]
 [46. 79.]
 [46. 80.]
 [46. 81.]
 [47. 78.]
 [47. 79.]
 [47. 80.]
 [47. 81.]
 [48. 79.]
 [48. 80.]]
[[53. 63.]
 [53. 64.]
 [53. 65.]
 [54. 63.]
 [54. 64.]
 [54. 65.]
 [54. 66.]
 [55. 63.]
 [55. 64.]
 [55. 65.]
 [55. 66.]
 [56. 63.]
 [56. 64.]
 [56. 65.]
 [56. 66.]]
[[ 72. 115.]
 [ 73. 114.]
 [ 73. 115.]
 [ 73. 116.]
 [ 73. 117.]
 [ 73. 118.]
 [ 74. 113.]
 [ 74. 114.]
 [ 74. 115.]
 [ 74. 116.]
 [ 75. 113.]
 [ 75. 114.]
 [ 75. 115.]
 [ 75. 116.]
 [ 75. 117.]
 [ 76. 113.]
 [ 76. 114.]
 [ 76. 115.]
 [ 76. 116.]
 [ 76. 117.]
 [ 77. 115.]
 [ 77. 116.]]
[[79. 56.]
 [79. 57.]
 [79. 58.]
 [79. 59.]
 [80. 57.]
 [80. 58.]
 [80. 59.]]

...where each [[ and ]] marks the start and end of a cluster. Is there a way to save the cluster coordinates so that each cluster is written to its own separate txt file (to make the averaging calculation easier later)?

Yu.L

If these are numpy arrays (as opposed to python lists) you might look at [`np.save`/`np.load`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html) (see: https://stackoverflow.com/questions/51837627/why-does-pickle-take-so-much-longer-than-np-save/51845434#51845434) – wbadart Apr 17 '19 at 14:59

1 Answer

If you have the following 2 clusters:

clusters_2d = [
    [
        [10, 75,],
        [11, 74,],
        [11, 75,],
        [12, 73,],
        [12, 74,],
        [12, 75,],
    ],
    [
        [34, 49,],
        [34, 50,],
        [35, 48,],
        [35, 49,],
        [35, 50,],
        [36, 48,],
        [36, 49,],
        [36, 50,],
    ]
]

In order to save them as separate files you can iterate over the coordinates:

Using pickle

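import pickle

# Write each cluster to its own file: cluster_0.pkl, cluster_1.pkl, ...
for i, x in enumerate(clusters_2d):
    with open(f"cluster_{i}.pkl", "wb") as f:  # 'with' closes the file for us
        pickle.dump(x, f)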

Then in order to read one of those files:

with open("cluster_0.pkl", "rb") as f:
    cluster_0 = pickle.load(f)
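
Since the goal is to average the coordinates afterwards, the round trip could be sketched roughly as follows. This is only an illustration under some assumptions: the `centroid` helper and the temporary `out_dir` folder are hypothetical names introduced here, and the two clusters are the same sample data as above.

```python
import pickle
import tempfile
from pathlib import Path

# Same two sample clusters as above, written compactly.
clusters_2d = [
    [[10, 75], [11, 74], [11, 75], [12, 73], [12, 74], [12, 75]],
    [[34, 49], [34, 50], [35, 48], [35, 49], [35, 50], [36, 48], [36, 49], [36, 50]],
]


def centroid(points):
    """Hypothetical helper: return (mean x, mean y) of a list of [x, y] pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


# Assumption: a throwaway temporary directory stands in for the real output folder.
out_dir = Path(tempfile.mkdtemp())

# Save each cluster to its own pickle file.
for i, cluster in enumerate(clusters_2d):
    with open(out_dir / f"cluster_{i}.pkl", "wb") as f:
        pickle.dump(cluster, f)

# Load every file back and compute one centroid per cluster.
centroids = []
for i in range(len(clusters_2d)):
    with open(out_dir / f"cluster_{i}.pkl", "rb") as f:
        centroids.append(centroid(pickle.load(f)))

for i, (cx, cy) in enumerate(centroids):
    print(f"cluster {i}: centroid = ({cx:.3f}, {cy:.3f})")
```

Each centroid here is just the mean of the x values paired with the mean of the y values, which matches the averaging described in the question.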

Using txt

# Write each cluster to its own text file, one "x y" pair per line
for i, x in enumerate(clusters_2d):
    with open(f"cluster_{i}.txt", "w") as f:
        for item in x:
            f.write(f"{item[0]} {item[1]}\n")

You can save them as pickle files, txt files, or any other format you like. Pickle restores the Python objects exactly as they were, while txt files stay human-readable.
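
If the clusters are NumPy arrays, as the printed output in the question suggests, a possible alternative is `np.savetxt` / `np.loadtxt`, with the centroid taken as the column-wise mean. The sketch below assumes NumPy is installed and uses a temporary `out_dir` folder as a hypothetical output location; the sample data is the same as above.

```python
import tempfile
from pathlib import Path

import numpy as np

# Same two sample clusters as above, as float arrays.
clusters_2d = [
    np.array([[10, 75], [11, 74], [11, 75], [12, 73], [12, 74], [12, 75]], dtype=float),
    np.array([[34, 49], [34, 50], [35, 48], [35, 49],
              [35, 50], [36, 48], [36, 49], [36, 50]], dtype=float),
]

# Assumption: a throwaway temporary directory stands in for the real output folder.
out_dir = Path(tempfile.mkdtemp())

# One text file per cluster, one "x y" pair per line.
for i, cluster in enumerate(clusters_2d):
    np.savetxt(out_dir / f"cluster_{i}.txt", cluster, fmt="%g")

# Read one cluster back and average over the rows to get its centroid.
loaded = np.loadtxt(out_dir / "cluster_0.txt")
centroid_0 = loaded.mean(axis=0)

# If the files are only a stepping stone, the centroids can also be
# computed directly from the in-memory arrays.
centroids_direct = [c.mean(axis=0) for c in clusters_2d]

print(centroid_0)
print(centroids_direct[1])
```

Taking the mean along `axis=0` averages each column separately, so the result is a length-2 array holding the mean x and mean y.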

VnC