My dataset consists of music streaming records from users. I have around 100 different music genres, and I would like to cluster them according to the age distribution of their listeners.
To be clearer, users' ages are divided into "blocks" (A1: 0-10 years; A2: 11-20 years; ...; A6: 61+), so an example of the data I would like to cluster is the following:
Pop: 0.05 A1; 0.3 A2; 0.35 A3; 0.2 A4; 0.05 A5; 0.05 A6
Rock: 0.05 A1; 0.2 A2; 0.2 A3; 0.1 A4; 0.15 A5; 0.1 A6
I would like to obtain clusters of genres with similar distributions. How can I do this in Python? Can I just treat each genre as a data point in a 6-dimensional space, or should I use something more refined? For example, can I use a customized distance between distributions in a clustering algorithm?
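For concreteness, here is a minimal sketch of the kind of thing I have in mind, not something I know to be the right approach. It assumes SciPy is acceptable, uses the Jensen-Shannon distance as one possible distance between distributions, and feeds it to hierarchical clustering. The "Kids" and "Oldies" rows are made-up placeholders (not real data), and the choice of average linkage and of 2 clusters is arbitrary.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# One row per genre, one column per age block (six blocks).
# Pop and Rock are the example values above; Kids and Oldies are
# hypothetical rows added only so the sketch has something to separate.
genres = ["Pop", "Rock", "Kids", "Oldies"]
X = np.array([
    [0.05, 0.30, 0.35, 0.20, 0.05, 0.05],  # Pop
    [0.05, 0.20, 0.20, 0.10, 0.15, 0.10],  # Rock
    [0.60, 0.25, 0.05, 0.05, 0.03, 0.02],  # Kids (hypothetical)
    [0.01, 0.02, 0.05, 0.12, 0.30, 0.50],  # Oldies (hypothetical)
])

# Renormalise so that every row sums to 1 and is a proper distribution.
X = X / X.sum(axis=1, keepdims=True)

# Pairwise Jensen-Shannon distances, in SciPy's condensed format.
D = pdist(X, metric="jensenshannon")

# Agglomerative clustering on the precomputed distances.
Z = linkage(D, method="average")

# Cut the tree into at most 2 clusters (arbitrary choice for this sketch).
labels = fcluster(Z, t=2, criterion="maxclust")

for genre, label in zip(genres, labels):
    print(genre, label)
```

With the real data, `X` would have about 100 rows, and the number of clusters would need to be chosen some other way, for example by inspecting the dendrogram.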
Thank you