
I'm trying to classify some EEG data using a logistic regression model (this seems to give the best classification of my data). The data comes from a multichannel EEG setup, so in essence I have a 63 x 116 x 50 matrix (channels x time points x trials; there are two trial types with 50 trials each). I have reshaped this into one long feature vector per trial.
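For concreteness, here is a minimal sketch of that reshaping in NumPy; the variable names and the random stand-in data are assumptions of mine, not the actual recordings:

```python
import numpy as np

n_channels, n_times, n_trials = 63, 116, 50
# Random stand-in for the real recordings, shaped (channels, time points, trials).
eeg = np.random.randn(n_channels, n_times, n_trials)

# Move trials to the first axis, then flatten each trial into one long
# feature vector: X has shape (n_trials, n_channels * n_times).
X = eeg.transpose(2, 0, 1).reshape(n_trials, -1)

# Feature j then maps back to channel j // n_times and time point j % n_times,
# which is what lets you interpret selected features afterwards.
```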

What I would like to do after the classification is to see which features were most useful in classifying the trials. How can I do that, and is it possible to test the significance of these features? E.g. to say that the classification was driven mainly by N features, and that these are features x to z. I could then say, for instance, that channel 10 at time points 90-95 was significant or important for the classification.

So is this possible or am I asking the wrong question?

Any comments or paper references are much appreciated.

Mads Jensen

1 Answer


Scikit-learn includes quite a few methods for feature ranking, among them univariate feature selection, recursive feature elimination (RFE), L1-based feature selection, tree-based feature selection, and randomized sparse models (see more at http://scikit-learn.org/stable/modules/feature_selection.html).

Among those, I definitely recommend giving Randomized Logistic Regression a shot. In my experience, it consistently outperforms other methods and is very stable. Paper on this: http://arxiv.org/pdf/0809.2932v2.pdf
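For reference, a minimal sketch of how this looks in scikit-learn; `X` and `y` below are random stand-ins for the trial-by-feature matrix and the binary trial labels (note that `RandomizedLogisticRegression` was later deprecated, as a comment below points out):

```python
import numpy as np
from sklearn.linear_model import RandomizedLogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 63 * 116)     # stand-in data: 100 trials, one row per trial
y = rng.randint(0, 2, size=100)  # stand-in binary trial labels

rlr = RandomizedLogisticRegression(n_resampling=200, random_state=0)
rlr.fit(X, y)

# scores_ holds, per feature, the fraction of resampled fits in which that
# feature received a nonzero coefficient; higher means more stable/useful.
top = np.argsort(rlr.scores_)[::-1][:10]
print(top, rlr.scores_[top])
```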

Edit: I have written a series of blog posts on different feature selection methods and their pros and cons, which are probably useful for answering this question in more detail.

Ando Saabas
  • The non-randomized L1-penalized models are also nice (i.e. L1-penalized logistic regression and LinearSVC; see the sketch after these comments). I don't have much experience with the randomized versions yet. – Andreas Mueller Apr 04 '13 at 12:39
  • Second @AndreasMueller's suggestion, L1-penalty SVM is a surprisingly good feature selection algorithm for some tasks (that look nothing like EEG reading, so YMMV). The [document classification example](http://scikit-learn.org/stable/auto_examples/document_classification_20newsgroups.html#example-document-classification-20newsgroups-py) does this, see `L1LinearSVC` there. – Fred Foo Apr 04 '13 at 13:01
  • In my experience, the case where the non-randomized methods can fail is when you have strongly multicollinear features: some features can then be among the top ones on one subset of the data while being regularized out on another subset. – Ando Saabas Apr 04 '13 at 13:35
  • You're right. I just think it is worth a shot. It won't do worse than univariate ;) – Andreas Mueller Apr 05 '13 at 13:10
  • @snarly the document classification example has been moved to http://scikit-learn.org/stable/auto_examples/text/document_classification_20newsgroups.html#example-document-classification-20newsgroups-py – Marco Bonzanini Feb 18 '16 at 14:41
  • RandomizedLogisticRegression is being deprecated: "The class RandomizedLogisticRegression is deprecated in 0.19 and will be removed in 0.21." :( – asimo Mar 02 '18 at 10:52
  • I think this post could be improved/updated by also considering the library [eli5](http://eli5.readthedocs.io/en/latest/overview.html). [Here](https://stackoverflow.com/a/49937090/7127519) is a post with examples in a similar discussion. They mention both [eli5](http://eli5.readthedocs.io/en/latest/overview.html) and [treeinterpreter](https://github.com/andosa/treeinterpreter), as in this answer. – Rafael Valero Apr 20 '18 at 08:46
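As a follow-up to the L1-penalty suggestion in the comments above, here is a minimal sketch of the L1-penalized `LinearSVC` approach; the stand-in data, labels, and the `C` value are assumptions for illustration:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(100, 63 * 116)     # stand-in trial-by-feature matrix
y = rng.randint(0, 2, size=100)  # stand-in binary trial labels

# The L1 penalty drives most coefficients to exactly zero; the features
# with surviving nonzero coefficients are the selected ones.
svc = LinearSVC(C=0.01, penalty="l1", dual=False)
svc.fit(X, y)

selected = np.flatnonzero(svc.coef_.ravel())
print("kept %d of %d features" % (selected.size, X.shape[1]))
```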