I am attempting to evaluate my feature results by performing a chi-squared test using sklearn's chi2 function: http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html.

I used the method from this thread, "Python sklearn - how to calculate p-values", to obtain my p-values. An excerpt of my p-values looks like:

nan, 1.0, 1.0, 0.9999999999999872, nan, nan, nan, nan, nan

A very large number of my p-values are nan. Why is this? Does this mean my results are insignificant at a 0.05 significance level?
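To make the setup reproducible, here is a minimal sketch of the kind of call described above, run on a made-up count matrix rather than my real data. The arrays `X` and `y` are hypothetical. One thing worth checking, and this is only an assumption about the cause, is whether the nan entries line up with feature columns that are zero in every row, since the expected frequency for such a column is zero. Note also that chi2 treats `y` as class labels, so continuous scores would need to be binned first.

```python
import numpy as np
from sklearn.feature_selection import chi2

# Toy document-term count matrix (hypothetical data, not the real corpus).
# The last column is zero in every row, i.e. a term that never occurs.
X = np.array([
    [1, 0, 0],
    [2, 1, 0],
    [0, 3, 0],
    [1, 2, 0],
])
y = np.array([0, 0, 1, 1])  # hypothetical class labels

# chi2 returns one statistic and one p-value per feature column.
scores, pvalues = chi2(X, y)

# Compare the columns that never occur with the columns whose p-value is nan.
empty_columns = np.flatnonzero(X.sum(axis=0) == 0)
nan_columns = np.flatnonzero(np.isnan(pvalues))

print("p-values:", pvalues)
print("all-zero columns:", empty_columns)
print("columns with nan p-value:", nan_columns)
```

If the two index lists match on the real data, dropping the empty columns before the test (or masking them afterwards with `np.isnan`) would be one way to separate undefined results from genuinely large p-values.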

jeffrey
  • What kind of data are you fitting in? – Fred Foo Dec 12 '14 at 21:26
  • My data is a vectorized version of text features (words or phrases), taken from a large text corpus. My target data are scores. For example, there would be a score (0 - 100) associated with every data entry, which would be a large number of sentences. – jeffrey Dec 12 '14 at 21:58

0 Answers