I'm new to seaborn and can't figure out what I'm doing wrong. I have the following data in a DataFrame.

       value       predicted   periodDate
51056  6482000.0   14845572.0  2019
51057  5347000.0   15999591.0  2018
51058  4912000.0   12067500.0  2017
51059  8490000.0   16376355.0  2016
51060  6998000.0   13886005.0  2015
51061  7868000.0   23012226.0  2014
51062  8068000.0   14297749.0  2013
51063  8427000.0   18183418.0  2012
51064  10229000.0  18053788.0  2011
51065  10504000.0  19222080.0  2010

I want a plot with the two values (value and predicted) plotted beside each other for each year, sorted by year (the periodDate column).

I tried to use this command:

sn.factorplot(x="value", y="predicted", data=dataToPlot)

But I get this output: [image]

Whereas I'm hoping for something like this: [image]

Can anyone help or suggest a resource I can use to understand what I am doing incorrectly with my seaborn command?

Lostsoul
  • If I understand correctly, you want the years to be different colors? – Grayrigel Oct 09 '20 at 23:24
  • You need an extra column to group by so that you can assign hue with that column. Here you have 10 records, so you will get a graph with 10 points. How do you expect to have that many curves? – Mehdi Golzadeh Oct 09 '20 at 23:32
  • @Grayrigel No, just the x axis to be the year, so in my example above 1, 3, 5, 7 would be the years. – Lostsoul Oct 09 '20 at 23:34
  • @MhDG7 I saw an option for hue and tried it, but it didn't work. I want two bars/lines, one for each column, sorted by year. – Lostsoul Oct 09 '20 at 23:35
  • Okay. I was unsure what you really wanted. You could simply use `pandas` for plotting. – Grayrigel Oct 10 '20 at 01:25

2 Answers

Making a visualization is much easier if you define the chart's dimensions first. In your case they are:

  • x : periodDate
  • y : numeric values
  • z (or something) : value or predicted

The value and predicted columns are already in your dataframe, but as two separate columns (wide format) rather than as one column of numbers plus one column of labels (long format), which is what seaborn's hue needs. This brings us to the next point.

To arrange your dataframe along the three dimensions listed above (x, y, z), use the pandas melt (unpivot) function.

df_aranged = df.melt(id_vars=['periodDate'], var_name='z', value_name='z_value') # df is your dataframe; periodDate is the year column

Now your dataframe looks like: [image: arranged df]
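Since the screenshot of the melted dataframe is not available here, the following is a minimal sketch of what the melt step produces. It assumes a small hand-built `df` holding only the first three rows from the question; the column names `z` and `z_value` follow the answer's own choice.

```python
import pandas as pd

# First three rows of the question's data, rebuilt by hand for illustration.
df = pd.DataFrame(
    {
        "value": [6482000.0, 5347000.0, 4912000.0],
        "predicted": [14845572.0, 15999591.0, 12067500.0],
        "periodDate": [2019, 2018, 2017],
    }
)

# Unpivot: periodDate stays as the identifier; the value and predicted
# columns are stacked into a label column (z) and a number column (z_value).
df_aranged = df.melt(id_vars=["periodDate"], var_name="z", value_name="z_value")

print(df_aranged)
```

Each year now appears twice, once with z equal to value and once with z equal to predicted, which is the shape that a hue argument can split on.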

Now you can plot what you need. I used seaborn's lineplot, but you can use whichever chart suits you best. To make your plot a little fancier, check this answer. If you want to know more about seaborn plots, check this reference.

sns.lineplot(x="periodDate", y="z_value", hue="z", data=df_aranged)

[image: final visualization]
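If bars side by side are closer to what the question asks for, the same melted dataframe can probably be passed to seaborn's barplot instead. This is a sketch under the same assumptions as before (a hand-built three-row `df`); the explicit `order` argument is my addition to make sure the years run oldest to newest.

```python
import pandas as pd
import seaborn as sns

# First three rows of the question's data, rebuilt by hand for illustration.
df = pd.DataFrame(
    {
        "value": [6482000.0, 5347000.0, 4912000.0],
        "predicted": [14845572.0, 15999591.0, 12067500.0],
        "periodDate": [2019, 2018, 2017],
    }
)

# Long format: one row per (periodDate, series) pair.
df_aranged = df.melt(id_vars=["periodDate"], var_name="z", value_name="z_value")

# Fix the bar order explicitly so the years run oldest to newest.
years = sorted(df_aranged["periodDate"].unique())

# hue="z" draws one bar per series, side by side, within each year.
ax = sns.barplot(x="periodDate", y="z_value", hue="z", order=years, data=df_aranged)
```

In a script, follow this with matplotlib's plt.show() (after importing matplotlib.pyplot as plt) to display the figure.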

I hope this helps you.

Trenton McKinney
Hans

A little late, but if plotting with pandas is an option, the plot suggested by @Hans can be produced very quickly without any preprocessing:

import matplotlib.pyplot as plt

#lineplot
df.plot(x='periodDate')
plt.tight_layout()
plt.show()

You can also apply seaborn styling to the pandas plot:

import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('darkgrid')

#lineplot
df.plot(x='periodDate')
plt.tight_layout()
plt.show()

Or a barplot:

import matplotlib.pyplot as plt

#barplot
df.sort_values('periodDate').plot.bar(x='periodDate')
plt.tight_layout()
plt.show()

Grayrigel