
How do I load an Excel file from my system into scikit-learn for training and testing?

I tried importing the file with pandas, and that worked:

data = pd.read_excel("IRIS.xlsx")
print(data.head())

But the next line, X, y = data(return_X_y=True), fails with a DataFrame error. What should I do or import before this line so that I can carry out the train/test split successfully?

2 Answers


data is a pandas DataFrame, and a DataFrame has no return_X_y option. That argument exists only on most of the sklearn dataset loaders, e.g. `sklearn.datasets.load_XXX()`.

In your case you load the data manually, so you just need to slice out the columns you want. Assuming the label column is named "target":

y = data["target"].values

# select all columns except the target
X = data.loc[:, data.columns != "target"].values
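
Once X and y are separated, the split itself can be done with train_test_split. The following is only a minimal sketch: the small in-memory DataFrame stands in for the result of pd.read_excel("IRIS.xlsx"), and its column names, as well as the test_size and random_state values, are assumptions chosen for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for: data = pd.read_excel("IRIS.xlsx")
# The column names here are assumed, not taken from the real file.
data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8, 5.0, 6.9, 6.5, 5.4],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1, 1.5, 4.9, 5.8, 1.7],
    "target": [0, 0, 1, 1, 2, 2, 0, 1, 2, 0],
})

# Labels, then every column except the label column as features
y = data["target"].values
X = data.loc[:, data.columns != "target"].values

# Hold out 20% of the rows for testing (illustrative values)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```

With the real file, only the DataFrame construction would change; the slicing and the split stay the same as long as the label column really is called "target".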
seralouk

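Use .iloc, assuming the target is the last column:

import pandas as pd

data = pd.read_excel('IRIS.xlsx')
# all columns but the last are features; the last column is the target
X, y = data.iloc[:, :-1], data.iloc[:, -1]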

If you just want the iris dataset, load it from sklearn directly:

from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris['data'], iris['target']
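
For reference, the return_X_y=True argument from the question belongs to loader functions such as load_iris, not to a DataFrame. A short sketch of that usage followed by a split; the test_size, random_state, and stratify settings are arbitrary choices for illustration, not something the question specifies.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# return_X_y=True makes the loader return (features, labels) directly
X, y = load_iris(return_X_y=True)

# Stratify so each class keeps roughly the same share in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```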
Corralien