Can someone please help modify the working example below to create clusters from the shared data?
The example uses Mean Shift clustering from Scikit-Learn to identify patches of similar/co-located plant species in an agronomical facility.
Similar questions about using categorical values alongside numeric values in this kind of problem have been asked before, but I think this example is different: the non-numeric values here cannot simply be encoded as one/zero dummy values. For example, we can't one-hot encode values like 'Aristolochia macrophylla' and 'Aristolochia durior', because species whose names are this similar need to be clustered together based on their family, in addition to their geographic proximity as given by the X and Y values. The similarity of the names is just as important as the location when creating the clusters.
I've tried two things. The first was assigning arbitrary numeric values to the letters in each species name, so that names with similar spelling would sit closer together on a number line. I was going to apply auto-scaling to those values and plug them into the script alongside the X and Y coordinates. This doesn't work because different names ended up very similar numerically.
My other attempt to incorporate the categorical values used the Levenshtein distance. But that distance is computed by comparing only two values at a time. And if you build an output showing the distance from each string to all the others, how can you use that result as an input for the MeanShift algorithm?
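To make the one-hot problem concrete, here is a minimal sketch (the three names are taken from the data; the variable names are only for illustration). After one-hot encoding, every pair of distinct species ends up exactly the same distance apart, so the shared genus is lost:

```python
import numpy as np

names = ["Aristolochia macrophylla", "Aristolochia durior", "Cornus alba"]

# One-hot encode: one column per distinct name, a single 1 per row.
one_hot = np.eye(len(names))

# Pairwise Euclidean distances between the encoded rows.
diff = one_hot[:, None, :] - one_hot[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Every off-diagonal entry is identical, whatever the spelling.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]!r} vs {names[j]!r}: {dist[i, j]:.3f}")
```

The two Aristolochia species are no closer to each other than either is to Cornus alba, which is why dummy variables don't help here.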
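For reference, the "distance of each string to all the others" output can be sketched as a square matrix like the one below. This is only an illustration: `levenshtein` and `name_distance_matrix` are hypothetical helpers written here in plain Python (not a library API), and dividing by the longer name's length is an assumption made to keep the values between 0 and 1.

```python
import numpy as np


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def name_distance_matrix(names):
    """Symmetric matrix of length-normalised edit distances."""
    n = len(names)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = levenshtein(names[i], names[j])
            # Assumption: normalise by the longer name so values lie in [0, 1].
            dist[i, j] = dist[j, i] = d / max(len(names[i]), len(names[j]))
    return dist


species = ["Aristolochia macrophylla", "Aristolochia durior",
           "Cornus alba", "Cornus canadensis"]
D = name_distance_matrix(species)
print(np.round(D, 2))
```

Only the distinct species names need to go into the matrix, since every plant with the same name shares the same row.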
Anyway, here is the data and working script that uses just the numeric values for now. I would really appreciate any examples of how to cluster this data using the similarity of the categorical values as well.
Thank you
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle
from sklearn.cluster import MeanShift, estimate_bandwidth
df=pd.DataFrame()
df["POINT_X"]=[-75.933169765,-75.932900302,-75.933060039,-75.932456135,-75.932334122,-75.933383845,-75.933378563,-75.933290334,-75.933302506,-75.932024669,-75.931803297,-75.931777655,-75.9317845,-75.931807731,
-75.931794839,-75.932045113,-75.932165473,-75.932763574,-75.93216276,-75.932066326,-75.931934871,-75.932294115,-75.931852284,-75.93187799,-75.932063549,-75.932377939,-75.932466697,-75.9324484,-75.932523695,
-75.932484492,-75.931882652,-75.932006344,-75.932228988,-75.932702486,-75.933245229,-75.933165385,-75.932990797,-75.932741398,-75.932519195,-75.932336262,-75.932264764,-75.932953569,-75.932938167,-75.933098289,
-75.932503985,-75.932597591,-75.932551382,-75.932541384,-75.932575066,-75.932751274,-75.932869969,-75.932086405,-75.932125915,-75.932089623,-75.932229816,-75.932356252,-75.93221234,-75.932505964,-75.932455199,
-75.932672148,-75.932823439,-75.93266258,-75.932722695,-75.93262497,-75.932613958,-75.932726832,-75.933179618,-75.933413275,-75.932911947,-75.93293013,-75.933129681,-75.933348106,-75.933328068,-75.9333501,
-75.933133529,-75.93306104,-75.933020824,-75.933056158,-75.933261164,-75.933157803,-75.933320158,-75.93306193,-75.932935915,-75.933125758,-75.933088069,-75.933158642,-75.9331282,-75.933096121,-75.933250109,
-75.933325084,-75.933336448,-75.934785616,-75.934843128,-75.93387422,-75.933996517,-75.934114484,-75.934560855,-75.935138185,-75.935228902,-75.935550248,-75.935326059,-75.935167468,-75.935038326,-75.934937151,
-75.934476218,-75.934576771,-75.934556169,-75.934324709,-75.934215059,-75.934185509,-75.933996183,-75.938853557,-75.937435702,-75.93755249,-75.93709863,-75.937584727,-75.937080786,-75.93717527,-75.937158245,
-75.937153622,-75.937255458,-75.937291351,-75.937463492,-75.937508635,-75.937568922,-75.937604,-75.937643152,-75.937538299,-75.936224493,-75.936538213,-75.936653234,-75.936672687,-75.936781092,-75.936765158,
-75.936775048,-75.93680606,-75.936808197,-75.936753824,-75.936637658,-75.936923553,-75.936872045,-75.936871187,-75.936735385,-75.936800934,-75.936504657,-75.936528774,-75.936462867,-75.936301988,-75.936248282,
-75.936192436,-75.935933385,-75.93679036,-75.936984567,-75.937178376,-75.937072594,-75.936212479,-75.937100912,-75.937075027,-75.93703418,-75.936553923,-75.936563813,-75.936750108,-75.935328068,-75.93329076,
-75.933274837,-75.932816577,-75.932958943,-75.932872736,-75.933039998,-75.932930987,-75.932975423,-75.932987859,-75.932944342,-75.932984985,-75.933102016,-75.933042959,-75.935432474,-75.93539475,-75.935456177,
-75.935413297,-75.935564812,-75.936518316,-75.935680005,-75.936558194,-75.935736741,-75.935754977,-75.935809,-75.935866569,-75.936134435,-75.936272398,-75.936252114,-75.936497277,-75.936178069,-75.933545359,
-75.933462287,-75.933528848,-75.933456247,-75.933508043,-75.933443108,-75.933436682,-75.933293086,-75.933458306,-75.932948828,-75.933541322,-75.933719067,-75.933560447,-75.934586709,-75.934531055,-75.93416494,
-75.933882234,-75.934830229,-75.934978045,-75.934357619,-75.934605828,-75.934754661,-75.934743056,-75.934130125,-75.935928887,-75.936286533,-75.936425628,-75.936477105,-75.935622798,-75.935607342,-75.936576534,
-75.936823941,-75.936664385,-75.936985859,-75.936927641,-75.937655315,-75.93754798,-75.937409554,-75.937780814,-75.936920843,-75.93724831,-75.937473965,-75.937712006,-75.935331673,-75.936250622,-75.934986449,
-75.938144151,-75.938287148,-75.938572438,-75.938677207,-75.938737192,-75.936696505,-75.9379094,-75.937601482,-75.931082221,-75.931152233,-75.931929379,-75.931886037,-75.931539305,-75.93145414,-75.931517537,
-75.93206476,-75.931104594,-75.930886831,-75.930796839,-75.930770692,-75.934395391,-75.933485857,-75.935094793,-75.935243938,-75.934978751,-75.935325475,-75.935361712,-75.933975927,-75.933883586,-75.936299827,
-75.934936738,-75.935015301,-75.934930658,-75.935287011,-75.935294894,-75.937784172,-75.937770775,-75.938253481,-75.93826076,-75.937784726,-75.93717805,-75.938872368,-75.938875092,-75.939336652,-75.940266037,
-75.940331239,-75.940421181,-75.940331999,-75.940177713,-75.939332917,-75.938994759,-75.939607395,-75.939598636,-75.939560673,-75.939534037,-75.939555948,-75.939015855,-75.939243491,-75.938789939,-75.933198497,
-75.93296926,-75.933132717,-75.932772368,-75.932419051,-75.93293841,-75.932798596,-75.932208745,-75.93206523,-75.931983351,-75.932410373,-75.931891975,-75.931568921,-75.931771254,-75.932397243,-75.931396196,
-75.931519619,-75.932093909,-75.931942073,-75.934429867,-75.934438719,-75.93453334,-75.934266886,-75.934183909,-75.93452075,-75.933856314,-75.933881074,-75.933901224,-75.933751983,-75.933594864,-75.93358154,
-75.93347677,-75.933895768,-75.933917682,-75.933687372,-75.933927415,-75.933739282,-75.933891053,-75.933712267,-75.93361711,-75.933901067,-75.934161321,-75.934305249,-75.934239461,-75.934211658,-75.933980238,
-75.934018133,-75.93397582,-75.933918536,-75.933971179,-75.933877169]
df["POINT_Y"]=[38.95259201,38.952468493,38.952585964,38.952220643,38.952172451,38.952978948,38.952611101,38.952620123,38.952527583,38.952013642,38.951971095,38.951950598,38.951878617,38.951867573,38.952051039,38.952319899,
38.952751776,38.952261808,38.951645828,38.951591344,38.951583443,38.951660428,38.951750197,38.951752666,38.951776696,38.951792968,38.951787078,38.951862848,38.951800999,38.951744805,38.951870508,38.951889649,
38.951936158,38.95170948,38.951751749,38.951735386,38.951742727,38.951588575,38.951528477,38.951520106,38.951519453,38.951936698,38.952010261,38.952013956,38.952102079,38.952165877,38.952146088,38.952089106,
38.952117254,38.952151545,38.949969545,38.951201998,38.951159228,38.951123753,38.950778391,38.950531943,38.950989092,38.950097211,38.950208568,38.950065183,38.950071356,38.949923603,38.9498474,38.949809668,
38.949757376,38.949571133,38.951447294,38.95147755,38.950581745,38.950733667,38.951069352,38.951237478,38.95107276,38.95096753,38.9508122,38.950734862,38.950688169,38.950514372,38.950075351,38.950010511,38.949960875,
38.949992064,38.95007398,38.950101272,38.950295815,38.950227769,38.950211517,38.950441255,38.950335632,38.95024686,38.950307666,38.950528546,38.950513096,38.950187972,38.950217841,38.950263645,38.950510523,
38.950755399,38.950708302,38.950286311,38.950229957,38.950164615,38.950045229,38.949970825,38.949877169,38.949993101,38.949660647,38.949543522,38.949625589,38.949412861,38.949487811,38.949880172,38.951839048,
38.952063455,38.949880835,38.951913953,38.949897842,38.949754481,38.949913573,38.951052934,38.951134326,38.951215119,38.951281057,38.951294341,38.951397886,38.951533389,38.951672146,38.949658462,38.950068808,
38.949883166,38.949852263,38.949919533,38.950057898,38.950028999,38.950188832,38.950304129,38.950435138,38.950514515,38.950622084,38.950381874,38.949994828,38.950052327,38.949830647,38.949824853,38.949732702,
38.949761675,38.949791427,38.949879419,38.949914074,38.949955099,38.951691376,38.951766177,38.951785811,38.951832242,38.951733008,38.950873805,38.951440038,38.951405074,38.951254936,38.951212584,38.951201821,
38.951198089,38.951901959,38.94884403,38.948941748,38.949353979,38.949035993,38.949016785,38.94887402,38.948802413,38.948722997,38.94868013,38.948698153,38.948609493,38.948407937,38.948413538,38.94884251,
38.948821237,38.948818421,38.948795076,38.949678178,38.949281509,38.949751466,38.949261269,38.949715525,38.949652229,38.949566304,38.949532396,38.949542936,38.949567821,38.94953658,38.949563742,38.948735942,
38.952147575,38.952155751,38.951912912,38.951985954,38.952728799,38.952622921,38.952451597,38.952436249,38.95231594,38.952313127,38.951745893,38.952390373,38.952286187,38.952708734,38.951839413,38.952030386,
38.951616852,38.951420298,38.951608998,38.952554863,38.9520134,38.951292914,38.951667791,38.952112184,38.954031241,38.953799626,38.953837241,38.953853864,38.953692287,38.953686947,38.953751245,38.953616457,
38.95369262,38.953694331,38.953744736,38.953742862,38.953858308,38.953767308,38.953659111,38.953499777,38.953494864,38.953676808,38.953570088,38.953574927,38.953146008,38.953138966,38.953219752,38.953218684,
38.953196026,38.953217491,38.953260642,38.953365184,38.953343071,38.953392347,38.95584336,38.955799692,38.956182326,38.95621302,38.956049617,38.957470088,38.957171152,38.956453402,38.956649954,38.956791692,
38.957180989,38.957521592,38.955754158,38.95553646,38.955953035,38.956405511,38.956660878,38.957086511,38.957423389,38.957793854,38.957835976,38.955448024,38.955021013,38.954934154,38.954927544,38.954598007,
38.954570833,38.954367294,38.954343,38.954497793,38.954471,38.954821256,38.954369125,38.955348715,38.955333171,38.955343991,38.955489753,38.955493927,38.955516735,38.955049181,38.955110383,38.954724398,38.954521524,
38.954517463,38.954512208,38.954493542,38.954434212,38.954117479,38.95435162,38.954310712,38.954277052,38.954161078,38.954580606,38.954197375,38.955451505,38.955596079,38.955045523,38.955097295,38.955970146,
38.954232335,38.95411988,38.953505553,38.955288869,38.955759644,38.955647996,38.955040953,38.954949777,38.95485026,38.954643337,38.954546745,38.953547289,38.953542137,38.953995634,38.954146947,38.954862356,
38.953287566,38.954523419,38.954915863,38.955002144,38.954945777,38.955006524,38.95507815,38.955120243,38.953067979,38.953073084,38.953453648,38.953640022,38.953641026,38.954062633,38.954027667,38.954110137,
38.954249401,38.953874232,38.953529725,38.953628972,38.953476826,38.95351151,38.953498365,38.953491846,38.953767787,38.953843351,38.953849161]
#Must incorporate these identifiers and cluster by similarity of species in addition to their proximity.
df["Category"]=['Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla',
'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla',
'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior',
'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior',
'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa',
'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa',
'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii',
'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii',
'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana',
'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa',
'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyfoccia gracilis',
'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyparis pisifera',
'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera',
'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba',
'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia',
'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia',
'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia',
'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis',
'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata',
'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima',
'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima',
'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis',
'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalisodoratum',
'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum',
'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra',
'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra',
'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra',
'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Ilex crenata Hetzii',
'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii',
'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens',
'Iberis sempervirens', 'Iberis sempervirens', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum',
'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica',
'Mertensia virginica', 'Mertensia virginica', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus',
'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus',
'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis crenata Hetzii',
'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis',
'Chamaecyparis', 'Chamaecyparis', 'Chamaecyparis', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum',
'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum',
'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris',
'Veronicastrum vulgaris', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra',
'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra']
#Get clusters with MeanShift
X = df[["POINT_X", "POINT_Y"]].to_numpy() # Only using numeric values for now
bandwidth = estimate_bandwidth(X, quantile=0.0595, n_samples=15000)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
ms.fit(X)
labels = ms.labels_
cluster_centers = ms.cluster_centers_
labels_unique = np.unique(labels)
n_clusters_ = len(labels_unique)
print("Estimated number of clusters: %d" % n_clusters_)
#Make plot
plt.figure(1)
plt.clf()
colors = cycle('bgrcmykbgrcmykbgrcmykbgrcmyk')
for k, col in zip(range(n_clusters_), colors):
    my_members = labels == k
    cluster_center = cluster_centers[k]
    plt.plot(X[my_members, 0], X[my_members, 1], col + '.')
    plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
             markeredgecolor='k', markersize=14)
plt.title('Clusters found by X/Y proximity (before using categorical values): %d' % n_clusters_)
plt.show()
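One direction that might work, sketched here under stated assumptions rather than as a tested solution: MeanShift needs coordinates, not a distance matrix, so the name distances could first be turned into coordinates and then stacked next to the scaled X/Y values. The sketch below uses classical (Torgerson) multidimensional scaling for that step, which is a technique swapped in for illustration and not part of the script above. The toy data, the helper functions, and the `name_weight` knob are all hypothetical.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.preprocessing import StandardScaler


def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def classical_mds(dist, n_components=2):
    """Turn a distance matrix into coordinates (classical MDS)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Edit distances need not be Euclidean, so negative eigenvalues are clipped.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))


# Hypothetical toy data: three species planted at roughly the same spot.
rng = np.random.default_rng(0)
toy_names = np.array(["Cornus alba"] * 10 + ["Cornus canadensis"] * 10
                     + ["Hakonechloa macra"] * 10)
toy_xy = rng.normal(0.0, 0.05, size=(30, 2))

# Distance matrix over the distinct names only.
unique_names, inverse = np.unique(toy_names, return_inverse=True)
n = len(unique_names)
name_dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        a, b = unique_names[i], unique_names[j]
        name_dist[i, j] = name_dist[j, i] = levenshtein(a, b) / max(len(a), len(b))

# One embedded coordinate pair per plant, rescaled by a single factor
# so the relative name distances are preserved.
name_coords = classical_mds(name_dist)[inverse]
spread = name_coords.std(axis=0).max()
name_feats = name_coords / spread if spread > 0 else name_coords

name_weight = 1.0  # assumption: relative importance of names vs. location
features = np.hstack([StandardScaler().fit_transform(toy_xy),
                      name_weight * name_feats])

bandwidth = estimate_bandwidth(features, quantile=0.2)
labels = MeanShift(bandwidth=bandwidth).fit_predict(features)
print("Estimated number of clusters:", len(np.unique(labels)))
```

With the real data, `toy_names` and `toy_xy` would presumably be replaced by `df["Category"]` and the `POINT_X`/`POINT_Y` columns, and both `name_weight` and the `quantile` value would need tuning.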