I am coding every function of my application myself, so I am not using tools that do everything for you.
I have been looking for a way to decide where to cut my agglomerative hierarchical clustering.
Here is how I cluster at the moment.
The application is written in C# (.NET Framework 4.5.2).
So far I am using standard hierarchical clustering, with Euclidean distance between document pairs.
It uses UPGMA (average linkage) to calculate the distance between clusters and decide which ones to merge.
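To make the setup concrete, here is a minimal sketch of those two steps. It is not my exact code: it assumes each document is already a dense `double[]` feature vector, and the names `ClusterDistances`, `Euclidean` and `Upgma` are only illustrative.

```csharp
using System;
using System.Collections.Generic;

public static class ClusterDistances
{
    // Euclidean distance between two document vectors of equal length.
    public static double Euclidean(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double sum = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }

    // UPGMA (average linkage): the mean of all pairwise distances
    // between members of cluster x and members of cluster y.
    public static double Upgma(IList<double[]> x, IList<double[]> y)
    {
        if (x.Count == 0 || y.Count == 0)
            throw new ArgumentException("Clusters must not be empty.");

        double total = 0.0;
        foreach (double[] p in x)
            foreach (double[] q in y)
                total += Euclidean(p, q);

        return total / (x.Count * y.Count);
    }
}
```

At each step the pair of clusters with the smallest `Upgma` value is merged, so the merge distances form a sequence that a stopping rule could inspect.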
I have also coded the Rand index and F-measure to measure success against my manually labeled data set.
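For reference, this is roughly what I mean by the Rand index. It is a simplified sketch that assumes both labelings are given as `int[]` arrays of cluster ids in the same document order. The class and method names are made up for this post.

```csharp
using System;

public static class ClusterScores
{
    // Rand index: the fraction of document pairs on which the two labelings
    // agree (both put the pair in the same cluster, or both separate it).
    public static double RandIndex(int[] predicted, int[] truth)
    {
        if (predicted.Length != truth.Length)
            throw new ArgumentException("Labelings must have the same length.");
        if (predicted.Length < 2)
            throw new ArgumentException("At least two documents are required.");

        long agree = 0;
        long pairs = 0;
        for (int i = 0; i < predicted.Length; i++)
        {
            for (int j = i + 1; j < predicted.Length; j++)
            {
                bool samePredicted = predicted[i] == predicted[j];
                bool sameTruth = truth[i] == truth[j];
                if (samePredicted == sameTruth) agree++;
                pairs++;
            }
        }
        return (double)agree / pairs;
    }
}
```

This only scores a finished clustering against the manual labels, so it cannot be used as the stopping rule on unlabeled data.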
However, the problem is deciding when to stop merging clusters.
I am really bad at understanding mathematical equations without an example on real data or well-explained pseudocode.
There are mathematical equations everywhere, but no real-life examples.
So I am looking for your answers. For example, many places say that the Bayesian information criterion (BIC) is a good solution, but I can't figure out how to apply it in my software.
I also have other distance and similarity metrics available, such as cosine similarity and Sørensen-Dice distance.
There are many questions on Stack Exchange and Stack Overflow about this, but all the answers use tools
like MATLAB or R.