I can plot data from a CSV file with the following code:

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('test0.csv',delimiter='; ', engine='python')
df.plot(x='Column1', y='Column3')
plt.show()

But I don't understand one thing: how does plt.show() know about df? It would make more sense to me to see, somewhere, an expression like:

plt = something(df)

I have to mention I'm just learning Python.

KcFnMi
  • Basically `df` also imports `plt` and modifies it directly. `plt` uses a lot of global state which is accessible to users and code alike. – MisterMiyagi Oct 08 '16 at 13:13

3 Answers

Matplotlib has two "interfaces": a Matlab-style interface and an object-oriented interface.

Plotting with the Matlab-style interface looks like this:

import matplotlib.pyplot as plt
plt.plot(x, y)
plt.show()

The call to plt.plot implicitly creates a figure and an axes on which to draw. The call to plt.show displays all figures.
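The implicit bookkeeping can be inspected directly. The following is a minimal sketch, assuming the non-interactive Agg backend (chosen here only so it runs without a display) and made-up sample data; it shows that pyplot keeps track of a "current" figure and axes globally, which is what plt.show later picks up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

plt.close("all")                 # start from a clean global state
print(plt.get_fignums())         # numbers of the figures pyplot currently tracks

plt.plot([0, 1, 2], [0, 1, 4])   # implicitly creates a figure and an axes

fig = plt.gcf()                  # the "current figure" held in pyplot's global state
ax = plt.gca()                   # the "current axes" inside that figure
print(plt.get_fignums())         # the implicitly created figure is now registered
print(ax in fig.axes)            # the axes belongs to that figure
print(len(ax.lines))             # the line drawn by plt.plot lives on that axes
```

Nothing is returned to the caller that has to be passed on: plt.show simply displays whatever figures pyplot has registered.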

Pandas supports the Matlab-style interface by implicitly creating a figure and axes for you when df.plot(x='Column1', y='Column3') is called.

Pandas can also use the more flexible object-oriented interface, in which case your code would look like this:

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('test0.csv',delimiter='; ', engine='python')
fig, ax = plt.subplots()
df.plot(ax=ax, x='Column1', y='Column3')
plt.show()

Here the axes, ax, is explicitly created and passed to df.plot, which then calls ax.plot under the hood.

One case where the object-oriented interface is useful is when you wish to use df.plot more than once while still drawing on the same axes:

fig, ax = plt.subplots()
df.plot(ax=ax, x='Column1', y='Column3')
df2.plot(ax=ax, x='Column2', y='Column4')
plt.show()
unutbu
  • What's the meaning of `fig, ax = plt.subplots()`? I was expecting just one thing at the left side. – KcFnMi Oct 08 '16 at 12:55
  • The figure is responsible for the entire image displayed. The axes is responsible for a plotting area. A figure can contain many axes. See http://stackoverflow.com/a/14846126/190597 for more on how matplotlib's object hierarchy is organized. – unutbu Oct 08 '16 at 13:01
  • 1
    @KcFnMi `plt.subplots` returns two objects, which can either be assigned to one name to create a tuple, or assigned to two names separately. – MisterMiyagi Oct 08 '16 at 13:16
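The two points made in these comments can be checked in a few lines. This is only an illustrative sketch, again assuming the Agg backend so it runs without a display; the names result, fig2 and axes are introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

result = plt.subplots()           # one name on the left: it receives the whole tuple
fig, ax = result                  # two names on the left: the tuple is unpacked
print(type(result).__name__)      # the return value is an ordinary tuple
print(ax.figure is fig)           # the axes knows which figure it belongs to

fig2, axes = plt.subplots(1, 2)   # one figure containing two plotting areas
print(len(fig2.axes))             # number of axes held by that figure
```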

From the pandas docs on plotting:

The plot method on Series and DataFrame is just a simple wrapper around plt.plot()

So, as is, the df.plot method is a high-level call to plt.plot (through a wrapper), and calling plt.show afterwards will simply:

display all figures and block until the figures have been closed

as it would for any figure plotted with plt.plot.


Therefore, you don't see plt = something(df) as you would expect, because matplotlib.pyplot.plot is being called behind the scenes by df.plot.
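One way to see this connection is to look at what df.plot returns. The sketch below is an illustration under stated assumptions: it uses a small made-up DataFrame as a stand-in for test0.csv (which is not available here) and the Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

plt.close("all")  # start from a clean pyplot state

# Hypothetical stand-in data; the column names follow the question.
df = pd.DataFrame({"Column1": [1, 2, 3], "Column3": [2, 4, 8]})

ax = df.plot(x="Column1", y="Column3")   # returns the matplotlib Axes it drew on

print(ax is plt.gca())                         # same object pyplot calls the current axes
print(ax.figure.number in plt.get_fignums())   # its figure is registered with pyplot
```

Because the figure that pandas created is registered with pyplot, a later plt.show() has something to display even though df was never handed to plt explicitly.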

Moses Koledoye
  • 1
    I got it. But I would like to ask, is there someway to write the code so that magic becomes more explicit? – KcFnMi Oct 08 '16 at 12:39
  • @KcFnMi I don't think you need the explicit magic. That wrapper saves us the trouble of passing the required args and kwargs to `plt.plot` as required for sometimes complicated dataframes. – Moses Koledoye Oct 08 '16 at 12:43

According to http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.show, plt.show() itself doesn't know about the data; the data is passed as parameters to the plotting calls made before it.

What you are seeing is the plot produced by the pandas library's own plot method, as described at http://pandas.pydata.org/pandas-docs/stable/visualization.html#basic-plotting-plot.

Hope this solves your question.

Zixian Cai