
I have the following code:

train_X, test_X, train_y, test_y = train_test_split(X.as_matrix(), y.as_matrix(), test_size=0.25)

where X is a DataFrame and y is a Series. When I run the line above, I get the following warning:

/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:1: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.

"""Entry point for launching an IPython kernel.

I then tried switching to .values, as the warning suggests:

train_X, test_X, train_y, test_y = train_test_split(X.values(), y.values(), test_size=0.25)

But I get the following error:

TypeError Traceback (most recent call last)
in ()
----> 1 train_X, test_X, train_y, test_y = train_test_split(X.values(), y.values(), test_size=0.25)

TypeError: 'numpy.ndarray' object is not callable

How do I solve this?

rcs
  • That should be as simple as removing `()` from `values()`. – norok2 Sep 18 '18 at 08:44
  • you are right, my bad.. – rcs Sep 18 '18 at 08:55
  • From pandas 0.24, use `df.to_numpy()`, not `.values` or `as_matrix()` either. – cs95 Nov 14 '19 at 23:12
  • https://pandas.pydata.org/pandas-docs/version/0.23.1/generated/pandas.DataFrame.as_matrix.html Though they suggest using `.values`, as @cs95 mentioned, `.to_numpy()` was the one that worked for me – Suraj Sep 02 '20 at 10:17

3 Answers


It should be:

train_X, test_X, train_y, test_y = train_test_split(X.values, y.values, test_size=0.25)

See this.
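To see why the parentheses matter, here is a minimal sketch with toy data (the small `X` and `y` below are made up for illustration and stand in for the asker's objects). It assumes only that pandas and NumPy are installed. `.values` is a property, so accessing it already yields the array; calling it tries to call that array.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the asker's X (DataFrame) and y (Series).
X = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]})
y = pd.Series([0, 1, 0, 1])

# .values is a property: plain attribute access returns the NumPy array.
X_arr = X.values
y_arr = y.values

# Adding () tries to call the returned ndarray, which reproduces the TypeError.
try:
    X.values()
    call_error = None
except TypeError as exc:
    call_error = str(exc)

print(type(X_arr).__name__, X_arr.shape)
print(call_error)
```

The arrays `X_arr` and `y_arr` can then be passed to `train_test_split` exactly as in the corrected line above.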

Deepak Saini

According to the pandas 0.25.1 documentation, DataFrame.to_numpy() is recommended over DataFrame.values (note that .values is an attribute, not a method):

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.values.html#pandas.DataFrame.values

So I'd suggest updating the call like this:

train_X, test_X, train_y, test_y = train_test_split(X.to_numpy(), y.to_numpy(), test_size=0.25)
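As a quick check, the sketch below compares `to_numpy()` with `.values` on made-up toy data (the `X` and `y` here are illustrative, not the asker's). It assumes pandas 0.24 or later, where `to_numpy()` is available. For simple numeric data like this the two should give the same array; `to_numpy()` additionally accepts a `dtype` argument.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; requires pandas >= 0.24 for to_numpy().
X = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [5.0, 6.0, 7.0, 8.0]})
y = pd.Series([0, 1, 0, 1])

# to_numpy() is a method, so here the parentheses are required.
X_arr = X.to_numpy()
y_arr = y.to_numpy()

# For this homogeneous numeric data, the result matches .values.
same_X = np.array_equal(X_arr, X.values)
same_y = np.array_equal(y_arr, y.values)

# Optionally request a specific dtype for the returned array.
X_f32 = X.to_numpy(dtype="float32")

print(same_X, same_y, X_f32.dtype)
```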
Bongsang

Here's some additional info regarding the versioning behind the warning. I hope it helps.

It occurred for me because of the pandas version (0.23.4) that now ships with SQL Server 2019 alongside Anaconda Python 3.7.1. SQL Server 2017 shipped with pandas 0.19.2, part of Anaconda Python 3.5.2, where this FutureWarning did not occur.

pandas.DataFrame.as_matrix has been deprecated since version 0.23.0. See PR.

Examples of how to use pandas.DataFrame.values.
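If the same code has to run against several pandas versions, as in the SQL Server setups described above, one possible approach is a small compatibility helper. The `to_array` function below is a hypothetical name introduced only for this sketch; it is not part of pandas. It prefers `to_numpy()` when the object has it and falls back to `.values` otherwise.

```python
import pandas as pd


def to_array(obj):
    """Hypothetical helper: return the NumPy array behind a DataFrame or Series.

    Uses to_numpy() when available (pandas >= 0.24) and falls back to the
    .values attribute on older versions.
    """
    if hasattr(obj, "to_numpy"):
        return obj.to_numpy()
    return obj.values


# Toy data for illustration only.
X = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
y = pd.Series([0, 1, 0])

X_arr = to_array(X)
y_arr = to_array(y)
print(X_arr.shape, y_arr.shape)
```

Neither branch calls `as_matrix()`, so the FutureWarning should not appear on either version.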

Hiram