
I'm getting a new deprecation warning in an IPython notebook I wrote that I've not seen before. What I'm seeing is the following:

    X, y = load_svmlight_file('./GasSensorArray/batch2.dat')
    /Users/cpd/.virtualenvs/py27-ipython+pandas/lib/python2.7/site-packages/sklearn/datasets/svmlight_format.py:137: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
      return _load_svmlight_file(f, dtype, multilabel, zero_based, query_id)
    /Users/cpd/.virtualenvs/py27-ipython+pandas/lib/python2.7/site-packages/sklearn/datasets/svmlight_format.py:137: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
      return _load_svmlight_file(f, dtype, multilabel, zero_based, query_id)
    ...

Any thoughts on what might be the issue here? I took another look at my data file and, at first glance, I don't see any obvious problem. I'm not sure what I changed in my system setup that would have caused this. I have scikit-learn version 0.14.1 installed.

Chris

2 Answers


You probably upgraded numpy: this is a deprecation warning introduced in numpy 1.8.0. It is explained in this pull request, with a continuation in this PR.

After briefly browsing the sklearn issue tracker, I haven't found any related issues. You may have better luck searching; if nothing turns up, file a bug report.

BrechtDeMan
alko
  • My hunch was that it was a numpy upgrade causing this. I actually tried reverting to numpy 1.7.1 which caused some API error. – Chris Nov 19 '13 at 23:40
  • @Chris IMHO, as this is just a warning, i.e. a promise of errors in future versions of numpy, you can stay on your current numpy version until sklearn is properly patched. – alko Nov 19 '13 at 23:42
  • FYI, it was fixed in this `scikit-learn` pull request (though unfortunately as of version 0.14.1 this hasn't made its way into a release yet): https://github.com/scikit-learn/scikit-learn/pull/2794 – mjjohnson Apr 24 '14 at 02:00
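
If you decide to stay on the current numpy version and just want a quieter notebook, one option is to suppress the warning around the offending call. This is only a sketch using the standard-library `warnings` module; `noisy_loader` is a hypothetical stand-in for the real `load_svmlight_file` call so that the snippet runs without the data file.

```python
import warnings


def noisy_loader():
    # Hypothetical stand-in for a library call that emits the warning.
    warnings.warn(
        "using a non-integer number instead of an integer "
        "will result in an error in the future",
        DeprecationWarning,
    )
    return "data"


# The filter only applies inside the with-block; the previous warning
# settings are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=DeprecationWarning)
    result = noisy_loader()
```

Keeping the filter inside `catch_warnings()` means other deprecation warnings elsewhere in the notebook still show up.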

After you upgrade numpy, it emits this deprecation warning whenever you index an array with a non-integer number. In sklearn there are many places where an index is stored as a floating point number even though its computed value is always a whole number.
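
To illustrate the situation described above, here is a minimal sketch. The names `data`, `float_index` and `int_index` are made up for the example and are not taken from the sklearn source.

```python
import numpy as np

data = np.arange(10)

# An index that is a whole number in value but float in type, e.g. the
# result of a computation carried out in floating point.
float_index = np.float64(len(data)) / 2  # 5.0

# data[float_index] is the kind of expression that triggers the
# DeprecationWarning on numpy 1.8 (later releases raise an error instead),
# so convert to an integer before indexing.
int_index = int(float_index)
middle = data[int_index]
```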

So whenever you index an array in numpy, you need to make sure the indices are integer typed, and this is not the case in many places in sklearn. The fix is sometimes trivial (for example, use // instead of / when computing an index by dividing integers), sometimes not. For now there is nothing to worry about: it's just a warning.
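
A minimal sketch of the `//` fix, assuming true-division semantics for `/` (Python 3, or Python 2 with `from __future__ import division`). The variable names are illustrative only.

```python
import numpy as np

data = np.arange(10)
n = len(data)

half_true = n / 2    # true division gives a float, not a valid index
half_floor = n // 2  # floor division of two ints gives an int

# Slicing with the integer result does not trigger the warning.
first_half = data[:half_floor]
```

Note that `//` only helps when both operands are integers: `7.0 // 2` is still a float, so in that case an explicit `int(...)` conversion is needed.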

BrechtDeMan
adrin