Saving prediction results to CSV

Question

I am storing the results from a sklearn regression model in the variable prediction.

prediction = regressor.predict(data[['X']])
print(prediction)

The values in the prediction output look like this:

[ 266.77832991  201.06347505  446.00066136  499.76736079  295.15519906
  214.50514991  422.1043505   531.13126879  287.68760191  201.06347505
  402.68859792  478.85808879  286.19408248  192.10235848]

I am then trying to use the to_csv function to save the results to a local CSV file:

prediction.to_csv('C:/localpath/test.csv')

But the error I get back is:

AttributeError: 'numpy.ndarray' object has no attribute 'to_csv'

I am using Pandas/Numpy/SKlearn. Any idea on the basic fix?

DavidK · Answer 1 · 2016-01-18T22:07:17.503

36

You can use pandas. As the error says, NumPy arrays don't have a to_csv method, so wrap the array in a DataFrame first.

import pandas as pd

# Wrap the NumPy array from the question in a DataFrame, then write it out.
pd.DataFrame(prediction, columns=['predictions']).to_csv('prediction.csv')

Add .T to transpose the DataFrame if you want the values on a single row instead of in a column.
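To make the column versus row layout concrete, here is a minimal sketch. The array values and the temporary output paths are made-up stand-ins, not the asker's data, and index=False / header=False are optional choices for a cleaner file.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Stand-in for the array returned by regressor.predict(...); values are made up.
prediction = np.array([266.78, 201.06, 446.00])

out_dir = tempfile.mkdtemp()  # hypothetical output location
column_path = os.path.join(out_dir, "prediction_column.csv")
row_path = os.path.join(out_dir, "prediction_row.csv")

# One value per line under a "prediction" header; index=False omits the row labels.
frame = pd.DataFrame(prediction, columns=["prediction"])
frame.to_csv(column_path, index=False)

# Transposed: all values on a single line, written without a header row.
frame.T.to_csv(row_path, index=False, header=False)

# Read both files back as text to inspect the layout.
with open(column_path) as fh:
    column_text = fh.read()
with open(row_path) as fh:
    row_text = fh.read()
```

Without index=False, pandas also writes the row labels as a first column, which is usually not wanted for a plain list of predictions.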

edited Jan 18 '16 at 22:07

answered Jan 18 '16 at 21:54

DavidK


8

If I want to merge with a unique identifier from `X_test` ("id" column, not the index), will the prediction results correctly match every row? as in: `output=pd.DataFrame(data={"id":X_test["id"],"Prediction":y_pred})` `output.to_csv(path_or_buf="..\\output\\results.csv",index=False,quoting=3,sep=';')` – mrbTT May 27 '18 at 15:03
If X_test has the same length as y_pred, the answer is yes. – DavidK Oct 08 '19 at 11:40
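A small sketch of why this works, assuming y_pred is the plain NumPy array returned by predict. The frame contents and index values below are invented for illustration. The caveat is that a pandas Series with a different index would be aligned by label, not by position.

```python
import numpy as np
import pandas as pd

# Hypothetical test set with a non-default index, as a train/test split would leave it.
X_test = pd.DataFrame({"id": [101, 102, 103], "X": [1.0, 2.0, 3.0]}, index=[10, 11, 12])
y_pred = np.array([10.0, 20.0, 30.0])  # stand-in for regressor.predict(X_test[["X"]])

# A NumPy array is matched to the "id" Series by position, so row i gets prediction i.
output = pd.DataFrame({"id": X_test["id"], "Prediction": y_pred})

# A Series with its own index (0, 1, 2) is aligned by label instead.
# The labels do not overlap here, so every row ends up with a NaN in one column.
y_series = pd.Series(y_pred)
misaligned = pd.DataFrame({"id": X_test["id"], "Prediction": y_series})
```

If the predictions ever arrive as a Series, converting with .to_numpy() before building the frame restores the positional behaviour.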

Ali · Answer 2 · 2016-01-18T22:25:40.023

17

You can use the numpy.savetxt function:

numpy.savetxt('C:/localpath/test.csv', prediction, delimiter=',')

and to load a CSV file you can use numpy.genfromtxt function:

numpy.genfromtxt('C:/localpath/test.csv', delimiter=',')

edited Jan 18 '16 at 22:25

answered Jan 18 '16 at 22:09

Ali


I had to reshape my data after loading, i.e. "pred_train = np.genfromtxt('encoded1.csv', delimiter=" ").reshape(-1, 1)". Isn't there a way to save and load the data without thinking about reshaping it? – Saber Feb 06 '19 at 21:49
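One possible way around the reshape, sketched with made-up values and a temporary path: numpy.loadtxt accepts an ndmin argument, so a file written from a 1-D array can be read back either flat or as a single column. This swaps in loadtxt for genfromtxt and is only one option.

```python
import os
import tempfile

import numpy as np

prediction = np.array([266.78, 201.06, 446.00])  # made-up stand-in values
path = os.path.join(tempfile.mkdtemp(), "prediction.csv")  # hypothetical path

# A 1-D array is written one value per line.
np.savetxt(path, prediction, delimiter=",")

# Default load: the single column is squeezed back to a 1-D array.
flat = np.loadtxt(path, delimiter=",")

# ndmin=2 keeps the column dimension, so no reshape(-1, 1) is needed afterwards.
column = np.loadtxt(path, delimiter=",", ndmin=2)
```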

score 5 · Answer 3 · answered Sep 23 '19 at 14:26

This is a very detailed solution for a case like this one, but you can use it even in production.

First Save the Model

import joblib
import numpy as np
import pandas as pd

joblib.dump(regressor, "regressor.sav")

Save columns in order

pd.DataFrame(X_train.columns).to_csv("feature_list.csv", index = None)

Save data types of train set

pd.DataFrame(X_train.dtypes).reset_index().to_csv("data_types.csv", index = None)

Using it again:

feature_list = pd.read_csv("feature_list.csv")
feature_list = pd.Index(list(feature_list["0"]))

add_cols = list(feature_list.difference(X_test.columns))

drop_cols = list(X_test.columns.difference(feature_list))

for col in add_cols:
    X_test[col] = np.nan

for col in drop_cols:
    X_test = X_test.drop(col, axis = 1)

# reorder columns
X_test = X_test[feature_list]

types = pd.read_csv("data_types.csv")
for i in range(len(types)):
    X_test[types.iloc[i,0]] = X_test[types.iloc[i,0]].astype(types.iloc[i,1])

Make Predictions

regressor = joblib.load("regressor.sav")
predictions = regressor.predict(X_test)

Save Prediction Results

res = pd.DataFrame(predictions)
res.index = X_test.index # it's important for comparison
res.columns = ["prediction"]
res.to_csv("prediction_results.csv")

Enjoy end to end model/prediction saver code!
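As a side note, the add/drop/reorder loops above can probably be collapsed into a single reindex call. This is a sketch of that alternative, not the code from the answer. The column names and values are hypothetical.

```python
import pandas as pd

feature_list = ["a", "b", "c"]  # hypothetical training column order

# Hypothetical test frame: wrong order, one unknown column, one missing column.
X_test = pd.DataFrame({"b": [1.0, 2.0], "extra": [0, 0], "a": [3.0, 4.0]})

# reindex drops columns not in feature_list, adds missing ones filled with NaN,
# and returns the columns in the requested order.
aligned = X_test.reindex(columns=feature_list)
```

The dtype restoration step would still be needed afterwards, and casting a column that is entirely NaN to an integer dtype will fail, so missing features need a fill value first.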

score 0 · Answer 4 · answered Aug 18 '23 at 10:10


predictions=regressor.predict(send_to_model)
#print(predictions)
output=pd.DataFrame({"Survived":predictions})
output.to_csv('C:/Users/<username>/Downloads/predictions.csv',index=False)


sasi bhushan


Avoid code-only answers and provide an explanation. – Itération 122442 Aug 18 '23 at 12:54
