
This is my second question on Stack Overflow. I don't have much experience with Python, but I had excellent results with my first question and was able to implement the code from the answer, so I will try again with this new problem:

I am trying to classify syllable types from a canary song so that I can use each type as a template to find and classify similar patterns in large data sets. I use the envelope of the song. My data is a sampled array of time and amplitude (a plot of the data is posted at http://ceciliajarne.web.unq.edu.ar/envelope-problem/ ). I am trying to use the singular value decomposition algorithm from NumPy:

from numpy import linalg
U, s, V = linalg.svd(A)  # SVD decomposition of A

I'm not sure how to build a meaningful matrix A from the time series data in order to follow this approach. How should I cut the time series to obtain a matrix that I can analyze?
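One possible way to build A, sketched under assumptions rather than as a definitive method: treat every stretch where the envelope stays above a threshold as one syllable, resample each stretch to a common length, and stack the results as rows. The threshold value, the fixed length `n_points`, the helper names `split_syllables` and `build_matrix`, and the synthetic envelope are all hypothetical; a real recording would probably need smoothing and a tuned threshold.

```python
import numpy as np

def split_syllables(envelope, threshold):
    """Return (start, stop) index pairs of runs where envelope > threshold."""
    above = envelope > threshold
    # Pad with False so runs touching the array edges are also detected.
    padded = np.concatenate(([False], above, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    # Changes alternate between run starts and (exclusive) run stops.
    return list(zip(changes[0::2], changes[1::2]))

def build_matrix(envelope, segments, n_points=50):
    """Resample every segment to n_points samples and stack them as rows."""
    new_x = np.linspace(0.0, 1.0, n_points)
    rows = []
    for start, stop in segments:
        seg = envelope[start:stop]
        old_x = np.linspace(0.0, 1.0, len(seg))
        rows.append(np.interp(new_x, old_x, seg))
    return np.vstack(rows)

# Synthetic envelope (made-up data): three bumps separated by silence.
t = np.linspace(0.0, 1.0, 200)
bump = np.sin(np.pi * t) ** 2
silence = np.zeros(50)
envelope = np.concatenate(
    [silence, bump, silence, 0.8 * bump[::2], silence, bump, silence]
)

segments = split_syllables(envelope, threshold=0.05)
A = build_matrix(envelope, segments)  # one row per syllable
U, s, Vt = np.linalg.svd(A, full_matrices=False)
```

Resampling to a common length discards the syllable duration, so if the duration matters it would have to be kept as a separate feature.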

I thought of a possible second approach: hierarchical clustering. It may be a better solution, but I don't know which clustering criterion to use. What I know is that:

  • There are around 10 different syllable types.
  • The distance between the minima and the relative maxima differs from type to type.
  • The length of each syllable also differs.
  • Similar syllables have similar frequency behavior.

What information can I feed to the scipy.cluster.hierarchy functions? I want to group the common syllable types into clusters.
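As a rough sketch of a first test, the quantities listed above could be collected into one feature row per syllable and passed to `linkage` and `fcluster` from `scipy.cluster.hierarchy`. The feature table below is invented for illustration (random points around three made-up centers), and the choice of Ward linkage and of three clusters are assumptions, not recommendations derived from the real data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical feature table: one row per syllable, columns are
# [duration in ms, time between the minimum and the relative maximum in ms].
centers = np.array([[40.0, 10.0], [80.0, 30.0], [120.0, 15.0]])
features = np.vstack(
    [c + rng.normal(scale=1.0, size=(20, 2)) for c in centers]
)

# Put both columns on a comparable scale before measuring distances.
scaled = (features - features.mean(axis=0)) / features.std(axis=0)

# Build the hierarchy, then cut it into at most three flat clusters.
Z = linkage(scaled, method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")
```

When the number of syllable types is not known in advance, `fcluster` can instead cut the tree at a distance threshold with `criterion="distance"`; picking that threshold usually takes some inspection of the dendrogram.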

I was inspired by: Unsupervised clustering with unknown number of clusters

But I don't know how to implement a first test. Any idea would be very useful; this is my first time working with patterns and time series.

Cecilia
  • SVD is typically used for dimensionality reduction rather than finding clusters per se, although reducing the dimensionality of your data can be a useful preprocessing step for many clustering methods. It would be helpful if you could tell us something about the format of your data. – ali_m Oct 31 '15 at 21:28
  • Thank you very much. My data is a 1D array representing the sampled amplitude of the sound envelope. The link shows an example of what the data looks like when plotted. – Cecilia Nov 01 '15 at 16:57
  • Presumably you have multiple of those 1D arrays. Do you consider each one of those to be a single syllable, or does each array consist of a sequence of syllables? – ali_m Nov 01 '15 at 17:22
  • I have to cut out the single syllables. There are around 15 different types, but I have to extract them from the sound file, and they have different sizes, so I don't know how to choose a criterion for chopping them. – Cecilia Nov 02 '15 at 19:06
  • I found a way to compare the syllables using the correlation function from scipy. – Cecilia Dec 30 '15 at 18:39
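The last comment does not say which correlation function was used. A minimal sketch, assuming it was something like `scipy.signal.correlate`, is to standardize two envelopes and take the peak of their cross-correlation as a similarity score. The helper name `peak_similarity` and the two test signals are hypothetical.

```python
import numpy as np
from scipy.signal import correlate

def peak_similarity(a, b):
    """Peak of the normalized cross-correlation of two 1D signals.

    The score is at most 1, and equals 1 when the standardized
    signals are identical.
    """
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    xcorr = correlate(a, b, mode="full") / np.sqrt(len(a) * len(b))
    return xcorr.max()

# Made-up envelopes: a single bump as the template, three bumps as another type.
t = np.linspace(0.0, 1.0, 100)
template = np.sin(np.pi * t) ** 2
other = np.sin(3 * np.pi * t) ** 2

same_score = peak_similarity(template, template)
other_score = peak_similarity(template, other)
```

Taking the maximum over all lags makes the score tolerant of small misalignments between a template and a candidate syllable, which matters when the cut points are only approximate.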

0 Answers