How to plot a scatter plot for the following data in Python?

Question

I split my data into training and testing sets using the function train_test_split() and got the following.

print(X_train)
+--------------------+
| fre       loc      |
+--------------------+
| 1.208531  0.010000 |
| 0.169742  0.010000 |
| 0.119691  0.010000 |
| 0.151515  0.010000 |
| 0.632653  0.010000 |
| 0.104000  1.125000 |
| 3.313433  1.076923 |
| 0.323899  0.010000 |
| 3.513011  1.100000 |
| 0.184971  0.010000 |
| 0.158470  0.010000 |
| 0.175258  0.010000 |
| 0.149038  0.010000 |
| 0.158879  0.010000 |
+--------------------+

print(X_test)
+--------------------+
| fre       loc      |
+--------------------+
| 1.208531  0.010000 |
| 0.169742  0.010000 |
| 0.119691  0.010000 |
| 0.151515  0.010000 |
| 0.632653  0.010000 |
| 0.104000  1.125000 |
| 3.313433  1.076923 |
+--------------------+

print(y_train)
+----------------+
| Critical Value |
+----------------+
| 1.208531       |
| 0.000000       |
| 0.000000       |
| 0.000000       |
| 0.632653       |
| 1.125000       |
| 4.390356       |
| 0.000000       |
| 4.613011       |
| 0.000000       |
| 0.000000       |
| 0.000000       |
| 0.000000       |
| 0.000000       |
+----------------+

print(y_test)
+----------------+
| Critical Value |
+----------------+
| 1.208531       |
| 0.000000       |
| 0.000000       |
| 0.000000       |
| 0.632653       |
| 1.125000       |
| 4.390356       |
+----------------+

Then I fitted a Gradient Boosting Regressor in the following manner:

est_knc = GradientBoostingRegressor()
est_knc.fit(X_train, y_train)
pred = est_knc.score(X_test, y_test)
print(pred)

and got the output, 0.8879530974429752

It's OK up to here. Now I want to plot this, but I find it confusing which parameters I have to pass, and how, in order to draw a scatter plot from the above data. I'm new to visualisation. :(

score 1 · Answer 1 · answered Aug 15 '19 at 06:27


Try out scatter plots for the different data sets you created and the different results you obtained. Then of course you will see the patterns.

Here is a code snippet I used for creating scatter plots. Hope it helps if you are new to visualization. Here I take the inputs for x and y from two separate files, xdata.txt and ydata.txt. They should be simple text files with the values you want to plot, one per line.

For example:

xdata.txt file
1.208531 
0.169742 
0.119691
0.151515 
0.632653
0.104000
3.313433

ydata.txt file
0.010000
0.010000
0.010000
0.010000
0.010000
1.125000
1.076923

But of course you can change this and create your own NumPy arrays to get the data to plot in whatever way is convenient.

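import numpy as np
import matplotlib.pyplot as plt

# sep=" " reads the files as text; it matches any whitespace, including newlines
x = np.fromfile("xdata.txt", dtype=float, count=-1, sep=" ")
y = np.fromfile("ydata.txt", dtype=float, count=-1, sep=" ")

# alpha=0.5 makes overlapping points easier to see
plt.scatter(x, y, alpha=0.5)
plt.show()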

If the imports do not work, you will have to install the required packages (numpy and matplotlib) using pip.
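If you would rather skip the text files, here is a minimal sketch that builds the arrays directly. The values are the fre and loc columns of X_test as printed in the question; the variable names and axis labels are my own choice, not something from the original code.

```python
import numpy as np
import matplotlib.pyplot as plt

# fre and loc columns of X_test, copied from the question
fre = np.array([1.208531, 0.169742, 0.119691, 0.151515,
                0.632653, 0.104000, 3.313433])
loc = np.array([0.010000, 0.010000, 0.010000, 0.010000,
                0.010000, 1.125000, 1.076923])

fig, ax = plt.subplots()
# One point per row: fre on the x axis, loc on the y axis
points = ax.scatter(fre, loc, alpha=0.5)
ax.set_xlabel("fre")
ax.set_ylabel("loc")
plt.show()
```

If X_test is a pandas DataFrame (an assumption, since the question does not say), you could pass X_test["fre"] and X_test["loc"] to ax.scatter in the same way.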

answered Aug 15 '19 at 06:27

Himesh Sameera


Hi. Thanks for your input. But is it possible that the accuracy score (from X_test and y_test) could provide some points to plot on a scatter chart? – sodmzs Aug 15 '19 at 06:46
I get your question now. You seem to have some 3D data, so to do 2D plots you need to select two of the variables. First try to plot fre vs critical_value, then loc vs critical_value. Plot all three, "Training Data", "Testing Data" and "Output values from Regressor", in the same plot (probably in different colors). Then of course try 3D plotting like surface plots and you will see how accurate the regressor is. I have a feeling that you will need some more data points to do a good job here, but you can first try with these 14 data points to understand the thing. – Himesh Sameera Aug 15 '19 at 07:22
Actually I think OP is not understanding that accuracy_score returns a scalar which, in a plot, would just be a point on an axis between 0 and 1. This is the correct answer to the question – pythonic833 Aug 15 '19 at 07:37
Since you have all the points, the code above can plot them in 2D. For 3D plots, look at something like https://stackoverflow.com/questions/1985856/how-to-make-a-3d-scatter-plot-in-python – Himesh Sameera Aug 15 '19 at 14:26
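
To make the suggestion in the comments concrete, here is a rough sketch that plots fre against the critical value for the training data, the testing data and the regressor's predictions in one figure. The numbers are copied from the question; random_state=0, the markers and the labels are assumptions of mine, added only so the sketch is repeatable and readable. Note that score() on a regressor returns a single number (the R² of the predictions), so it cannot supply points for a scatter plot; predict() is what gives one value per test row.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingRegressor

# Values copied from the question; the two columns of X are fre and loc.
X_train = np.array([
    [1.208531, 0.010000], [0.169742, 0.010000], [0.119691, 0.010000],
    [0.151515, 0.010000], [0.632653, 0.010000], [0.104000, 1.125000],
    [3.313433, 1.076923], [0.323899, 0.010000], [3.513011, 1.100000],
    [0.184971, 0.010000], [0.158470, 0.010000], [0.175258, 0.010000],
    [0.149038, 0.010000], [0.158879, 0.010000],
])
y_train = np.array([1.208531, 0.0, 0.0, 0.0, 0.632653, 1.125000, 4.390356,
                    0.0, 4.613011, 0.0, 0.0, 0.0, 0.0, 0.0])

# In the question's printout, the test set equals the first 7 training rows.
X_test = X_train[:7]
y_test = y_train[:7]

# random_state is an assumption, added only to make runs repeatable.
est = GradientBoostingRegressor(random_state=0)
est.fit(X_train, y_train)

y_pred = est.predict(X_test)      # one predicted value per test row
r2 = est.score(X_test, y_test)    # a single number (R^2), not plottable points
print(r2)

fig, ax = plt.subplots()
ax.scatter(X_train[:, 0], y_train, alpha=0.5, label="Training data")
ax.scatter(X_test[:, 0], y_test, marker="x", label="Testing data")
ax.scatter(X_test[:, 0], y_pred, marker="^", label="Predicted by regressor")
ax.set_xlabel("fre")
ax.set_ylabel("Critical Value")
ax.legend()
plt.show()
```

Swapping X_train[:, 0] and X_test[:, 0] for column 1 gives the corresponding loc vs critical value plot.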
