We are working on a text categorization task using an unsupervised machine learning model.
Before clustering the text, the data set has to go through several preprocessing steps: removing stop words, stemming the remaining words, and then performing feature selection.
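To make the preprocessing steps concrete, here is a minimal sketch in plain Python. The stop-word list and the `crude_stem` helper are hypothetical stand-ins chosen for illustration; a real pipeline would likely use a full stop-word list and a proper stemmer such as NLTK's `PorterStemmer`.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}


def crude_stem(word):
    """Strip a few common suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("The cats are jumping on the tables"))
```

The resulting token lists would then be turned into a term-document matrix, which is the input to feature selection.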
From what I have read, there are several methods I can apply for feature selection, such as Information Gain, Gini Index, and Mutual Information.
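As far as I understand, these measures score a term by how much knowing whether it appears in a document tells you about the document's class, so they assume class labels are available. The sketch below is only my reading of the definitions, using hypothetical labels and a hypothetical term-presence vector. Computed this way, information gain equals the mutual information between the binary term-presence variable and the class. The Gini function shown is the Gini impurity of a label set, which is one common reading of "Gini Index".

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(term_present, labels):
    """H(class) minus H(class | term present or absent)."""
    n = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for x, y in zip(term_present, labels) if x == value]
        if subset:
            gain -= (len(subset) / n) * entropy(subset)
    return gain


def gini_impurity(labels):
    """1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


# Hypothetical toy data: four documents, two classes.
labels = ["sport", "sport", "politics", "politics"]
informative = [True, True, False, False]    # term appears only in "sport" docs
uninformative = [True, False, True, False]  # term appears equally in both

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
print(gini_impurity(labels))
```

A term with a higher information gain separates the classes better, so one would keep the top-scoring terms as features.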
I would like to understand how these methods work and how I can implement them in code. Is there a library I can use to perform these tasks?