I have a pandas dataframe like so:

Date     Allotment  NDII_Mean  NDVI_Mean  RGR_Mean  SATVI_Mean            
1984137   Arnston  -0.053650   0.414868  0.938309    0.332712   
1984185   Arnston   0.074928   0.558329  0.867951    0.334555   
1984249   Arnston  -0.124691   0.352225  1.041513    0.331821   
1985139   Arnston  -0.075537   0.468092  0.929414    0.383750   
1985171   Arnston   0.017400   0.493443  0.889835    0.314717   
1986206   Annex     0.151539   0.626690  0.775202    0.332507   
1986238   Annex     0.142235   0.604764  0.823083    0.303600   
1987241   Annex    -0.005423   0.506760  0.911124    0.338675   
1987257   Annex    -0.058166   0.449052  0.961348    0.336879

I want to plot by allotment, so I will need to use groupby. For each Allotment I want Date on the x-axis and all four columns with "Mean" in the name drawn as lines, with their values on the y-axis. I will then save the plots as PDFs, although that is not required if someone knows another way. I can plot ONE value using the code below (NDII_Mean in this example), but I want to plot all four columns, not just one. The code I am using is:

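import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

df = pd.read_csv('C:/Users/Stefano/Documents/Hurst_plots.csv')

with PdfPages('C:/Users/Stefano/Documents/Hurst_plots.pdf') as pdf:
    for i, group in df.groupby('Allotment'):
        # DataFrame.plot creates its own figure and returns the Axes,
        # so no separate plt.figure() call is needed
        ax = group.plot(x='Date', y='NDII_Mean', title=str(i))
        ax.set_ylabel('Values')  # label the axes that were actually drawn on

        Hurst_plots = ax.get_figure()
        pdf.savefig(Hurst_plots)
        plt.close(Hurst_plots)  # release the figure once its page is written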

This is what one of the plots looks like (different than data shown because I shortened my example table):

[Image: example plot for one allotment]

edit:

This works after editing this line:

Hurst_plots=group.plot(x='Date', y=['NDII_Mean', 'RGR_Mean', 'SATVI_Mean', 'SWIR32_Mean'],title=str(i)).get_figure()

[Image: plot with the four columns drawn as lines]

But does anyone know how to move the legend completely outside of the graph?

Stefano Potter

1 Answer

I have had mixed experiences using pandas to make graphics. Most of the time I end up pulling the columns out of the dataframe as numpy arrays and plotting them with matplotlib directly. Personally, I feel I have more control using matplotlib itself for things like styling, line colors (I often generate RGB triplets dynamically from some calculation), and controlling legends! I'd first recommend searching through the matplotlib documentation; try searching for "multi-line plot". It looks like you are trying to plot time series. Matplotlib has a nice (albeit a little confusing at first) interface for handling dates, so once you figure out how things work you can customize to your liking.
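Applied to the table in the question, that approach might look roughly like the sketch below. It assumes the Date column is a year followed by the day of the year (so 1984137 would be day 137 of 1984); that is my reading of the sample, not something the question states. The `SAMPLE` string and the `figures` dictionary are just names made up for this example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the question's table (whitespace separated).
SAMPLE = """\
Date Allotment NDII_Mean NDVI_Mean RGR_Mean SATVI_Mean
1984137 Arnston -0.053650 0.414868 0.938309 0.332712
1984185 Arnston 0.074928 0.558329 0.867951 0.334555
1984249 Arnston -0.124691 0.352225 1.041513 0.331821
1986206 Annex 0.151539 0.626690 0.775202 0.332507
1986238 Annex 0.142235 0.604764 0.823083 0.303600
1987241 Annex -0.005423 0.506760 0.911124 0.338675
"""

df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

# Assumption: Date is year + day of year, e.g. 1984137 -> day 137 of 1984.
df["Date"] = pd.to_datetime(df["Date"].astype(str), format="%Y%j")

# Every column whose name ends in "_Mean" becomes one line on the plot.
mean_cols = [c for c in df.columns if c.endswith("_Mean")]

figures = {}
for name, group in df.groupby("Allotment"):
    fig, ax = plt.subplots()
    for col in mean_cols:
        # Pull each column out as a NumPy array and plot it directly.
        ax.plot(group["Date"].to_numpy(), group[col].to_numpy(), label=col)
    ax.set_title(str(name))
    ax.set_xlabel("Date")
    ax.set_ylabel("Values")
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    figures[name] = fig
```

Each figure in `figures` can then be handed to `pdf.savefig(...)` inside the `PdfPages` block from the question, or saved on its own with `fig.savefig(...)`.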

Here is a snippet I used recently to generate a multi-line time series plot. Note the use of the recently added style feature, which makes the plots look very nice. It is taken from an IPython notebook.

import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates
# IPython/Jupyter magic; remove this line when running as a plain script
%matplotlib inline
from datetime import datetime as dt

plt.style.use('fivethirtyeight')

# Tick on the first day of every month and label ticks with the month abbreviation
months = mdates.MonthLocator(range(1, 13), bymonthday=1)
monthsFmt = mdates.DateFormatter('%b')

# dfs is my pre-existing dict of dataframes keyed by year string;
# colors is a dict of line colors indexed the same way
fig, ax = plt.subplots()
for year in range(2010, 2016):
    vals = dfs[str(year)]['my_awesome_points'].cumsum().values
    # Move every date into 2014 so all years overlay on one Jan-Dec axis
    # (a Feb 29 row would fail to parse, since 2014 is not a leap year)
    adjusted_d_obj = ['2014' + x[4:] for x in dfs[str(year)]['date']]
    date_objs = [dt.strptime(x, '%Y-%m-%d') for x in adjusted_d_obj]
    # Draw the latest year at full opacity and fade the earlier ones
    alpha = 1.0 if year == 2015 else 0.4
    ax.plot(date_objs, vals, '-', color=colors[str(year)], label=str(year), alpha=alpha)
fig.set_size_inches((14, 10))
fig.set_dpi(800)

ax.grid(True)
fig.autofmt_xdate()
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(monthsFmt)

plt.savefig('sick_pick.png')

It makes a graph that looks something like this:

[Image: decent looking matplotlib!]

In my case I have a pre-existing dictionary of dataframes, each accessed with the year as its key. Saving to PDF is possible, but I think saving to a PNG image file is easier: as shown above, plt.savefig('filename.png') should work. The PDF functionality is definitely cool, though. I've had nagging clients (with no clue what they are talking about) ask for reports, charts, etc. You can set up a loop and write a PDF with hundreds or thousands of pages, where each page is a nicely formatted chart with a title, legend, axis labels, and so on. Traditionally the complaint against matplotlib was its dry, generic-looking graphs. The new matplotlib styles are very easy on the eye!
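As a rough illustration of that kind of loop, here is a minimal sketch that writes one chart per page with `PdfPages`. The `series` dictionary, its values, and the `report.pdf` file name are placeholders invented for the example; in practice each page would come from one of your groups or dataframes.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical data: one short series per page, keyed by the page title.
series = {
    "Arnston": [0.41, 0.56, 0.35],
    "Annex": [0.63, 0.60, 0.51],
}

# Write into a temporary directory so the example leaves no stray files behind.
pdf_path = os.path.join(tempfile.mkdtemp(), "report.pdf")

with PdfPages(pdf_path) as pdf:
    for title, values in series.items():
        fig, ax = plt.subplots()
        ax.plot(range(len(values)), values, label="NDVI_Mean")
        ax.set_title(title)
        ax.set_xlabel("Observation")
        ax.set_ylabel("Values")
        ax.legend()
        pdf.savefig(fig)  # each call appends one page to the PDF
        plt.close(fig)    # free the figure; this matters in long loops
    page_count = pdf.get_pagecount()
```

Closing each figure after saving keeps memory use flat, which starts to matter once the loop runs to hundreds of pages.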

EDIT: Check this awesome answer to solve your legend issues. https://stackoverflow.com/a/4701285/2639868
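For reference, one common way to push the legend completely outside the axes is to anchor it past the right-hand edge with `bbox_to_anchor` and then save with `bbox_inches="tight"` so it is not cropped. The sketch below uses a few values from the question's table; the 1.02 offset is just a value that tends to look reasonable, not a required setting.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# Two short series taken from the first rows of the question's table.
lines = {
    "NDII_Mean": [-0.053650, 0.074928, -0.124691],
    "NDVI_Mean": [0.414868, 0.558329, 0.352225],
}

fig, ax = plt.subplots()
for label, values in lines.items():
    ax.plot(range(len(values)), values, label=label)

# Pin the legend's upper-left corner just past the right edge of the axes.
# bbox_to_anchor is in axes coordinates, where (1, 1) is the top-right corner.
legend = ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0), borderaxespad=0.0)

# bbox_inches="tight" expands the saved area to include the outside legend.
buf = io.BytesIO()
fig.savefig(buf, format="png", bbox_inches="tight")
```

With pandas plotting the same call should work on the axes that `group.plot(...)` returns, and `pdf.savefig(fig, bbox_inches="tight")` passes the option through when writing PDF pages.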

Community
chill_turner