I am trying to make a scatter plot with matplotlib from a single-column dataframe, like so:

uva1pd.plot(kind='scatter', y='RESULT')

This is the dataframe:

      RESULT
0    2009.13
1    1999.19
2    2014.34
3    1987.51
4    1987.51
..       ...
475  1999.35
476  1987.51
477  1993.19
478  1993.19
479  1982.62

However I am getting the following error:

An error was encountered:
scatter requires an x and y column

Is there a way to just use the default row number or index of the dataframe as the x-axis in matplotlib?

thentangler

2 Answers

You can simply define a dummy x array with the same length as the y data:

import numpy as np
import matplotlib.pyplot as plt

y = np.random.randint(0, 20, size=(10,))  # sample data
x = np.arange(len(y))  # dummy x: 0, 1, ..., len(y) - 1
plt.scatter(x, y)

Akshay Sehgal
  • @Akshay Sehgal Thank you for this solution. I did not mention it in my question, but I am converting the data frame from an RDD to pandas and it is compute-intensive, so I am trying to avoid adding more columns or data. – thentangler Aug 24 '20 at 16:34
  • My approach doesn't add any new columns; it creates two independent NumPy arrays. In fact, it doesn't need you to convert your data to pandas at all: you can keep the data as an array in y and generate x as in the code to draw the graph, without the pandas conversion, so it will be much faster than a pandas approach. – Akshay Sehgal Aug 24 '20 at 16:36
  • That's interesting. Would I be able to use it while the data is in RDD form, instead of using `collect()`? – thentangler Aug 24 '20 at 16:45
  • Yes, just do `a = np.array(testRdd.collect())`. `.collect()` already returns a list, which can be converted to a NumPy array without the heavy dataframe transformation. – Akshay Sehgal Aug 24 '20 at 19:08
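
The suggestion in the last comment could be sketched roughly as below. This is only an illustration: the `collected` list is a hypothetical stand-in for the result of `testRdd.collect()` (with values borrowed from the question), so the sketch runs without Spark, and the `Agg` backend line is an assumption made so that it also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for testRdd.collect(), which returns a plain Python list.
collected = [2009.13, 1999.19, 2014.34, 1987.51, 1987.51]

y = np.array(collected)  # convert the collected list to a NumPy array
x = np.arange(len(y))    # positions 0..n-1 serve as the x axis

fig, ax = plt.subplots()
ax.scatter(x, y)         # scatter of value against position, no DataFrame involved
```

Call `plt.show()` (without the `Agg` line) to display the figure. Note that `collect()` still brings all the data to the driver; only the pandas conversion is skipped.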
Here's one way to do it without defining a new variable: pass the dataframe's index as x.

import matplotlib.pyplot as plt
import pandas as pd

d = pd.DataFrame({'data':[5,89,7,1,56,8]})

plt.scatter(d.index, d['data'])
plt.show()
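
Applied to the dataframe from the question, the same idea might look like the sketch below. The frame here is a small stand-in built from the first few values shown in the question, and the `Agg` backend line is an assumption so the sketch runs without a display. The second option, using `reset_index()`, is an alternative not mentioned above for staying inside the pandas plotting API.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the asker's single-column frame (values copied from the question).
uva1pd = pd.DataFrame({"RESULT": [2009.13, 1999.19, 2014.34, 1987.51, 1987.51]})

# Option 1: pass the index straight to matplotlib; no new column is created.
fig, ax = plt.subplots()
ax.scatter(uva1pd.index, uva1pd["RESULT"])
ax.set_xlabel("row index")
ax.set_ylabel("RESULT")

# Option 2: keep using DataFrame.plot by exposing the index as a column.
# reset_index() returns a new frame with an "index" column; uva1pd is unchanged.
with_index = uva1pd.reset_index()
ax2 = with_index.plot(kind="scatter", x="index", y="RESULT")
```

Option 1 avoids building a second frame, which may matter if the conversion to pandas is already expensive; option 2 copies the data once but satisfies the "x and y column" requirement of `kind='scatter'`.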

user32882