I have a dataframe which is of the following structure:

A          B
Location1  1
Location2  2
1          3
2          4

In the above example column A is the index. I am attempting to produce a scatter plot using the index and column B. This data frame is made by resampling and averaging another dataframe like so:

df = df.groupby("A").mean()

This sets the index to column A, and I try to plot it with the following, which is adapted from the question "Use index in pandas to plot data":

df.reset_index().plot(x = "A",y = "B",kind="scatter", figsize=(10,10))

When I run this, it raises the following:

ValueError: scatter requires x column to be numeric

The index column is meant to hold strings, and I want those strings on the x axis of the scatter plot. How can I fix this?

cd123
  • I don't quite understand. If you have strings like 'Location1' in col A then how do you expect them to be plotted? – 9dogs Mar 16 '18 at 10:31
  • Just as a standard scatter plot, with the values in A on the x axis and the values in B on the y axis. – cd123 Mar 16 '18 at 10:37

2 Answers

You may want to select only the integer rows:

import pandas as pd

d = {'A': ["Location1", "Location2", 1, 2], 'B': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
df_numeric = df[pd.to_numeric(df.A, errors='coerce').notnull()]

print(df_numeric)

   A  B
2  1  3
3  2  4

Grouped by A:

df_numeric_grouped_by_A = df_numeric.groupby("A").mean()

print(df_numeric_grouped_by_A)

     B
A     
1  3.0
2  4.0
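
If the string labels have to stay on the x axis instead of being filtered out, one possible sketch is to plot B against integer positions and use the labels as tick text. This assumes the sample data from the question; casting column A to `str`, the names `grouped`, `positions` and `labels`, and the output file name are illustrative choices, not part of the approach above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample data mirroring the question: column A mixes strings and integers.
df = pd.DataFrame({"A": ["Location1", "Location2", 1, 2], "B": [1, 2, 3, 4]})

# Assumption: cast the labels to str so the group keys share one sortable type.
grouped = df.assign(A=df["A"].astype(str)).groupby("A").mean()

# Plot B against integer positions, then label each tick with its string.
positions = list(range(len(grouped)))
labels = list(grouped.index)

fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(positions, grouped["B"])
ax.set_xticks(positions)
ax.set_xticklabels(labels)
ax.set_xlabel("A")
ax.set_ylabel("B")
fig.savefig("scatter_by_label.png")  # hypothetical output file name
```

Because `groupby` sorts its keys, the points appear in sorted label order rather than in the original row order.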
Ferit
  • Whilst this is valid on its own, it's not what I'm looking for. I need to have the Location1 and Location2 strings as x values which can be plotted. – cd123 Mar 16 '18 at 11:19

You may have to transpose the DataFrame so that the index (column A) becomes the column names, then calculate the mean of each column and plot the result.
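
One reading of this suggestion, as a minimal sketch: it assumes the sample data from the question, casts column A to `str` so the labels share one type, and uses the illustrative names `wide` and `means`. Plotting the means as markers with `Series.plot(style="o")` is one option among several.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Sample data mirroring the question.
df = pd.DataFrame({"A": ["Location1", "Location2", 1, 2], "B": [1, 2, 3, 4]})
df["A"] = df["A"].astype(str)  # assumption: treat every label as a string

# Transpose so that each value of A becomes a column name.
wide = df.set_index("A").T

# Mean of each column; the result is a Series indexed by the labels from A.
means = wide.mean()

# Draw the means as unconnected markers, with the labels along the x axis.
ax = means.plot(style="o", figsize=(10, 10))
ax.set_ylabel("B")
ax.figure.savefig("means_by_label.png")  # hypothetical output file name
```

With only one value per label, as in this sample, each column mean equals the original value in B.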

Lukas Humpe