How can I perform clustering by groups? For example, take this Pokemon dataset on Kaggle.
A sample of this dataset looks like this (I changed some fields to mimic my data):
Name                       Type I  Type II
Bulbasaur                  Grass   Poison
Bulbasaur 2                Grass   Poison
Venusaur                   Grass   Not Null
VenusaurMega Venusaur      Grass   Not Null
...
Charizard                  Fire    Flying
CharizardMega Charizard X  Fire    Dragon
Assuming there are no nulls in my dataset, how can I group the rows by the combination of the Type I and Type II columns, and then cluster the names within each group by string similarity?
The output should look like this:
Name                       Type I  Type II   Cluster
Bulbasaur                  Grass   Poison    1
Bulbasaur 2                Grass   Poison    1
Venusaur                   Grass   Not Null  2
VenusaurMega Venusaur      Grass   Not Null  2
...
Charizard                  Fire    Flying    3
CharizardMega Charizard X  Fire    Dragon    4
I tried a method similar to the one shown here, but it doesn't work with the NbClust function I am using:
clust <- NbClust(data, diss = string_dist, distance = NULL, min.nc = 2, max.nc = 125, method = "ward.D2", index = "ch")
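To make the goal concrete, here is a minimal sketch of the per-group clustering described above, using only base R rather than NbClust. Several things in it are assumptions, not part of the original attempt: the helper name `cluster_names`, the normalized Levenshtein distance from `adist`, average linkage in place of `ward.D2` (so the cut height stays on the same 0 to 1 scale as the distances), and the `cutoff = 0.8` threshold, which is arbitrary and would need tuning on real data.

```r
# Sample rows from the question; check.names = FALSE keeps the spaces in column names
pokemon <- data.frame(
  Name = c("Bulbasaur", "Bulbasaur 2", "Venusaur", "VenusaurMega Venusaur",
           "Charizard", "CharizardMega Charizard X"),
  `Type I`  = c("Grass", "Grass", "Grass", "Grass", "Fire", "Fire"),
  `Type II` = c("Poison", "Poison", "Not Null", "Not Null", "Flying", "Dragon"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: cluster the names of ONE group by normalized edit distance.
# cutoff is an assumed threshold in [0, 1]; names closer than this are merged.
# Assumes no empty strings (the normalization divides by the longer name's length).
cluster_names <- function(x, cutoff = 0.8) {
  if (length(x) < 2) return(rep(1L, length(x)))    # hclust needs at least 2 items
  d <- adist(x)                                    # Levenshtein distance matrix
  d <- d / outer(nchar(x), nchar(x), pmax)         # scale by the longer name
  tree <- hclust(as.dist(d), method = "average")   # assumption: average linkage
  cutree(tree, h = cutoff)
}

# Group key built from both type columns
group <- paste(pokemon$`Type I`, pokemon$`Type II`, sep = " / ")

# Cluster ids that are local to each group
local_id <- ave(seq_along(pokemon$Name), group,
                FUN = function(i) cluster_names(pokemon$Name[i]))

# Turn (group, local id) pairs into global ids, numbered by first appearance
key <- paste(group, local_id, sep = "_")
pokemon$Cluster <- match(key, unique(key))

print(pokemon)
```

A fixed cut height sidesteps choosing the number of clusters per group, which matters here because many groups hold only one or two names and an index-based search over a range of cluster counts has nothing to work with in groups that small.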