4

I'm plotting various things with pandas, using the built-in plot command, such as my_dataframe.plot() followed by plt.show() in IPython.

Now this is a very easy and convenient way to visualize stuff, and given that I do postprocessing on the SVG file anyways, I don't care too much about details of the plot.

However, I need a title, a legend and labels for the x and y axis on the plot, both as a reminder to myself about what is what, and for quickly sending some stuff to other folks, without having to add a "Oh, and BTW, the x axis is hours this time, y is meters as always, but blue now is sample B…" line to the email.

I figured out how to do this in an easy way (see below) and I'm also aware of the various powerful things I could do with ax, but it took me a while to get to my "easy" solution, and I'm staying away from ax because there is way too much stuff going on that I neither need nor understand.

I do understand why one would want all the powerful options of ax, but I do not understand why such a simple option is not included in pandas' plotting function. And it seems like I'm not the only one. User Chrispy for example posted this highly rated comment:

is there a particular reason why x and y labels can't be added as arguments to pd.plot()? Given the additional concision of pd.plot() over plt.plot() it seems it would make sense to make it even more succinct instead of having to call ax.set_ylabel()

on the answer to this question, but got no further comments. Hence I'm blatantly stealing this question.

Why does my_df.plot() include a legend by default and also easily let me add a title (my_df.plot(title = 'check out my cool plot')), while the logical next step (my_df.plot(ylabel = 'size in meters')) results in TypeError: There is no Line2D property "ylabel"?

Am I missing something or is there a reason for this oversight?

Example code:

This works when I implement it in my real file and run it with run workflow.py in IPython, but I can't reproduce it when copy-pasting the code. Either my labels get ignored, or it fails outright:

EDIT:

Originally I had plt.xlabel = 'time in seconds' in my example here, which did not work, but I had the correct plt.xlabel('time in seconds') in my real code, which of course did work.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

times = np.arange(0,43200,60)
my_df = pd.DataFrame(np.random.randn(len(times)), index = times)
my_df.plot(title = 'just some random data')    #this works
#my_df.plot(title = 'just some random data', ylabel = 'size in meters', xlabel = 'time in seconds')    #this seems like the logical next step, but it errors
plt.ylabel('size in meters')    #call the function; do not assign to it
plt.xlabel('time in seconds')
plt.show()

This seems like the easiest/most minimal solution with axes, using @Johannes' solution, but I think this (see comments to the answer) is also a good illustration of why I would prefer not to have to bother with axes:

axes = my_df.plot(title = 'just some random data')
axes.set_ylabel('size in meters')
axes.set_xlabel('time in seconds')

Also, I can set the title another way, but there is only one option for the labels, which is what's confusing me:

axes = my_df.plot()
axes.set_title('just some random data')
axes.set_ylabel('size in meters')
axes.set_xlabel('time in seconds')
JC_CL

2 Answers

6

First of all, there is no particular reason for the pandas plotting command not to also include keyword arguments for the labels, just as for the title.
This option could well be implemented, but it isn't. Speculating about the reasons will not lead anywhere, but there is an issue about it on the pandas issue tracker.
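
For readers on a more recent pandas: as far as I am aware, the request in that issue was eventually implemented, and `DataFrame.plot()` accepts `xlabel` and `ylabel` from roughly version 1.1.0 onwards. The version number is an assumption on my part, so the sketch below tries the keywords first and falls back to `ax.set()` if the installed pandas rejects them. The `Agg` backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

times = np.arange(0, 43200, 60)
my_df = pd.DataFrame(np.random.randn(len(times)), index=times)

try:
    # Accepted by newer pandas (assumption: 1.1.0 or later).
    ax = my_df.plot(title='just some random data',
                    xlabel='time in seconds', ylabel='size in meters')
except (TypeError, AttributeError):
    # Older pandas forwards unknown keywords to matplotlib, which rejects them.
    plt.close('all')  # discard the half-built figure from the failed call
    ax = my_df.plot(title='just some random data')
    ax.set(xlabel='time in seconds', ylabel='size in meters')
```

Either branch leaves `ax` holding the labelled Axes, so the rest of a script does not need to know which pandas version it ran on.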

Concerning the actual problem, there are several ways to set labels on the axes. Four possible methods are listed below. Note that the question as well as the other answer contain some non-working methods.

In particular, ax.xlabel() does not exist. Nor does plt.ylabel = 'size in meters' make any sense, since it overwrites the ylabel function instead of calling it.
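
To illustrate why the assignment form silently does nothing useful, here is a minimal sketch. It rebinds `plt.ylabel` to a string, checks that the name is no longer callable, restores the original function, and then calls it properly. The variable names `original` and `was_callable` are made up for the illustration, and the `Agg` backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs headless
import matplotlib.pyplot as plt

original = plt.ylabel            # keep a reference to the real function

# Assignment rebinds the name on the pyplot module; no label is set.
plt.ylabel = 'size in meters'
was_callable = callable(plt.ylabel)   # the name now points at a plain string

plt.ylabel = original            # restore the function for the rest of the session

fig, ax = plt.subplots()
plt.ylabel('size in meters')     # calling the function labels the current Axes
print(was_callable, ax.get_ylabel())
```

After such an accidental assignment in an interactive session, every later `plt.ylabel(...)` call fails until the function is restored or the session is restarted, which may explain why copy-pasted examples behave differently from a fresh script.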

Working options:

ax.set_xlabel()

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

times = np.arange(0,43200,60)
my_df = pd.DataFrame(np.random.randn(len(times)), index = times)
ax = my_df.plot(title = 'just some random data')

ax.set_ylabel('size in meters')
ax.set_xlabel('time in seconds')

plt.show()

ax.set()

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

times = np.arange(0,43200,60)
my_df = pd.DataFrame(np.random.randn(len(times)), index = times)
ax = my_df.plot(title = 'just some random data')
ax.set(xlabel='time in seconds', ylabel='size in meters')

plt.show()

plt.xlabel()

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

times = np.arange(0,43200,60)
my_df = pd.DataFrame(np.random.randn(len(times)), index = times)
my_df.plot(title = 'just some random data')

plt.ylabel('size in meters')
plt.xlabel('time in seconds')

plt.show()

plt.setp()

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

times = np.arange(0,43200,60)
my_df = pd.DataFrame(np.random.randn(len(times)), index = times)
ax = my_df.plot(title = 'just some random data')

plt.setp(ax,xlabel='time in seconds', ylabel='size in meters')

plt.show()
ImportanceOfBeingErnest
  • I see. I figured there might be a reason that I am simply overlooking, but it seems like there is none. Maybe I'll suggest it on GitHub, if I can find the time and my login data for it. The `plt.ylabel = 'size in meters'` probably sneaked in while going back and forth between my real script and the MWE, due to the confusion that caused me to ask the question in the first place. – JC_CL Jul 07 '17 at 08:16
  • As you can see [here](https://github.com/pandas-dev/pandas/issues/9093), that is not a new issue. – ImportanceOfBeingErnest Jul 07 '17 at 08:45
  • If you are plotting a pandas Series, `plt.setp()` works – eemilk Dec 21 '21 at 08:32
0

df.plot() returns a list of Axes objects (one for each subplot). Those have .set_xlabel() and .set_ylabel() methods.

Do something like this:

times = np.arange(0,43200,60)
my_df = pd.DataFrame(np.random.randn(len(times)), index = times)
axes = my_df.plot(title = 'just some random data')
axes[0].ylabel('size in meters')
axes[0].xlabel('time in seconds')

Plots are not objects; instead, the plot function just creates Line objects. Since you can have multiple lines in a single Axes object but only one pair of labels, it makes sense that the labels are properties of the axes and not of the lines.
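
As the comments below point out, the snippet above fails as written: for a single plot, `my_df.plot()` returns one Axes object rather than a list, and the methods are named `set_ylabel()` and `set_xlabel()`. A hedged sketch of both cases follows. It assumes a two-column frame (the column names `A` and `B` are invented here) so that `subplots=True` yields one Axes per column, and it uses the `Agg` backend only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs headless
import numpy as np
import pandas as pd

times = np.arange(0, 43200, 60)
my_df = pd.DataFrame(np.random.randn(len(times), 2),
                     index=times, columns=['A', 'B'])

# Single plot: a single Axes comes back, so no indexing is needed.
ax = my_df.plot(title='just some random data')
ax.set_ylabel('size in meters')
ax.set_xlabel('time in seconds')

# One subplot per column: an array of Axes comes back.
axes = my_df.plot(subplots=True)
for sub_ax in axes:
    sub_ax.set_ylabel('size in meters')
axes[-1].set_xlabel('time in seconds')  # label only the bottom subplot
```

Indexing with `[0]` is therefore only appropriate in the `subplots=True` case.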

Johannes
  • But can't I also have multiple labels (e.g. secondary y-axis)? And I can't set multiple titles anyways, so why not also allow me to set the labels this way? And your example ends with `TypeError: 'AxesSubplot' object does not support indexing` for me. Shouldn't it be `axes.set_ylabel =...`? And I can also use `axes.set_title=...`, so it would be logical to assume that it also behaves the same for `pd.plot()` – JC_CL Jul 05 '17 at 10:04
  • You can have multiple y-axes, but not more than two, and it's implemented as a special case. And it looks like `plot` returns a single `Axes` object instead of a one-element list when there is only one subplot. Try to omit the `[0]` and just go for `axes.ylabel('size in meters')` etc. – Johannes Jul 05 '17 at 10:15
  • Omitting the `[0]` just does not plot anything. `axes.set_ylabel = 'size in meters'` does work however. – JC_CL Jul 05 '17 at 11:01