48

I am trying to split my dataset into training and testing sets, but I am getting this error:

X_train,X_test,Y_train,Y_test = sklearn.cross_validation.train_test_split(X,df1['ENTRIESn_hourly'])

AttributeError                            Traceback (most recent call last)
<ipython-input-53-5445dab94861> in <module>()
----> 1 X_train,X_test,Y_train,Y_test = sklearn.cross_validation.train_test_split(X,df1['ENTRIESn_hourly'])

AttributeError: module 'sklearn' has no attribute 'cross_validation'

How can I handle this?

alkasm
Naren
  • How are you importing `sklearn`? Did you try the [many](https://stackoverflow.com/questions/16743889/cant-use-scikit-learn-attributeerror-module-object-has-no-attribute) [solutions](https://stackoverflow.com/questions/40496969/attributeerror-module-sklearn-metrics-has-no-attribute-calinski-harabaz-scor) found online? – Antimony Oct 04 '17 at 19:17
  • This is so annoying that we can't just use `import sklearn as sk` and simply start using any submodules from `sk.metrics.etc` and we have to manually add hundreds of import statements cluttering the code base and making it extremely difficult to follow any logic in the code. I hope people at `sklearn` take notice and fix this at some point. – Kirk Walla Apr 24 '22 at 10:26

6 Answers

127

`sklearn` does not automatically import its subpackages, so `import sklearn` on its own is not enough to make `sklearn.cross_validation` available. Import the subpackage explicitly instead, e.g. `import sklearn.cross_validation`.

Further, `sklearn.cross_validation` was deprecated in version 0.18 and removed in version 0.20. Use `sklearn.model_selection.train_test_split` instead.
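As a minimal sketch of the two points above (explicit subpackage import, and `model_selection` in place of `cross_validation`), the following uses synthetic NumPy arrays as stand-ins for the question's `X` and `df1['ENTRIESn_hourly']`. The array shapes, `test_size=0.25`, and `random_state=0` are assumptions chosen for illustration, not values from the question.

```python
import numpy as np
import sklearn.model_selection  # import the subpackage explicitly; `import sklearn` alone is not enough

# Synthetic stand-ins (assumed for illustration) for the question's features and target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)

# Same call as in the question, but through model_selection instead of cross_validation.
X_train, X_test, Y_train, Y_test = sklearn.model_selection.train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(X_train.shape, X_test.shape)
```

With 100 rows and `test_size=0.25`, 25 rows go to the test set and 75 to the training set. Passing a pandas Series such as `df1['ENTRIESn_hourly']` as the second argument works the same way.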

Brenden Petersen
  • 4
    Beat me to the punch. Welcome to Stack Overflow! This answer would be even better with some [linked sources](http://scikit-learn.org/0.19/modules/generated/sklearn.cross_validation.train_test_split.html) :) – alkasm Oct 04 '17 at 19:19
8

Try this:

from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=101)
gogasca
2

You can try this:

import sklearn.model_selection
X_train, X_test, Y_train, Y_test = sklearn.model_selection.train_test_split(X, boston_df.price)
Nic3500
2

The replacement for `cross_validation` in current versions of sklearn is:

  sklearn.model_selection
Enrique Benito Casado
2

The `cross_validation` module is deprecated and has been replaced by `model_selection` in newer scikit-learn releases, including those bundled with recent Anaconda versions. So you can use:

from sklearn.model_selection import train_test_split
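
If the same script has to run on both old and new scikit-learn installs, one common pattern is to try the new import first and fall back to the old one. This is only a sketch: it assumes the old `sklearn.cross_validation` module is still present on the older install, which is not the case from version 0.20 onward.

```python
try:
    # Newer scikit-learn releases: train_test_split lives in model_selection.
    from sklearn.model_selection import train_test_split
except ImportError:
    # Older releases that predate model_selection (assumed to still ship cross_validation).
    from sklearn.cross_validation import train_test_split

# Tiny illustrative split; the lists and test_size here are made up for the example.
X_train, X_test, Y_train, Y_test = train_test_split(
    [[0], [1], [2], [3]], [0, 1, 0, 1], test_size=0.5, random_state=0
)
```

On any reasonably current install only the first import runs, so the fallback is harmless there.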
Joel K Thomas
0

Thanks! Successful with this in Colab:

    from sklearn.model_selection import train_test_split