6

First of all, this question is not the same as this one.

The problem I'm having is that when I try to plot a DataFrame which contains a numpy NaN in one cell, I get an error:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df.plot()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1636, in plot_frame
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1321, in _make_ts_plot
    _plot(data[col], i, ax, label, style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

The above code works if I replace the np.NaN with a number, such as "2.3".

Plotting as two separate Series does not work either (it fails when I add the Series containing the NaN to the plot):

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> data = [[1, 2], [4, 5], [9, np.nan], [16, 17], [25, 26]]
>>> df = pd.DataFrame(data, index=dates,
...                       columns=list('AB'))
>>>
>>> print(df.to_string())
                      A   B
2013-12-01 00:00:00   1   2
2013-12-01 01:00:00   4   5
2013-12-01 02:00:00   9 NaN
2013-12-01 03:00:00  16  17
2013-12-01 04:00:00  25  26
>>> df['A'].plot(label='This is A', style='k')
<matplotlib.axes.AxesSubplot object at 0x02ACFF90>
>>> df['B'].plot(label='This is B', style='g')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1730, in plot_series
    plot_obj.generate()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 856, in generate
    self._make_plot()
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1240, in _make_plot
    self._make_ts_plot(data, **self.kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1311, in _make_ts_plot
    _plot(data, 0, ax, label, self.style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tools\plotting.py", line 1295, in _plot
    style=style, **kwds)
  File "C:\Python33x86\lib\site-packages\pandas\tseries\plotting.py", line 77, in tsplot
    lines = plotf(ax, *args, **kwargs)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 4139, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 297, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axes.py", line 216, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Python33x86\lib\site-packages\matplotlib\axis.py", line 1337, in update_units
    converter = munits.registry.get_converter(data)
  File "C:\Python33x86\lib\site-packages\matplotlib\units.py", line 137, in get_converter
    xravel = x.ravel()
  File "C:\Python33x86\lib\site-packages\numpy\ma\core.py", line 3969, in ravel
    r._mask = ndarray.ravel(self._mask).reshape(r.shape)
  File "C:\Python33x86\lib\site-packages\pandas\core\series.py", line 981, in reshape
    return ndarray.reshape(self, newshape, order)
TypeError: an integer is required

However, if I do this directly with Matplotlib's Pyplot plot(), instead of using Pandas' plot() function, it works:

C:\>\Python33x86\python.exe
Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> dates = pd.date_range('20131201', periods=5, freq='H')
>>> plt.plot(dates, [1, 4, 9, 16, 25], 'k', dates, [2, 5, np.NAN, 17, 26], 'g')
[<matplotlib.lines.Line2D object at 0x03E98650>, <matplotlib.lines.Line2D object at 0x040929B0>]
>>> plt.show()
>>>

So it seems that I have a workaround, but as I plot large DataFrames, I would prefer to use Pandas' plot() method, which is more convenient. I've tried to follow the stack trace, but after a while it gets complicated (I'm not familiar with Pandas, Numpy and Matplotlib source code). Am I doing something wrong, or is this a possible bug in Pandas' plot()?

Thank you for your help!

I tried both on Windows x86 and on Linux AMD64 with the same results with these versions:

  • Python 3.3.2
  • Pandas 0.12.0
  • Matplotlib 1.3.1
  • Numpy 1.7.1
Community
  • 1
  • 1
Nicolas G
  • 123
  • 2
  • 8
  • 1
    I'm having the exact issue as well ! – erantdo Dec 13 '13 at 11:13
  • can you debug `newshape` and `order` values when error happens? – alko Dec 13 '13 at 11:14
  • newshape = tuple: (5,); order = str: C – Nicolas G Dec 13 '13 at 12:48
  • @alko, I saw your comment about this working on Python 2.something, so I just installed abd tried with Python 2.7.6 in Windows x86 32 bit with the same versions of the modules ==> same error. Also, with the newest Numpy (1.8.0) ==> same error. I see you have removed your comment, so I guess it's not working on your side after all...? – Nicolas G Dec 13 '13 at 13:36
  • @Iye-Nico I removed comment as it was incorrect in regard to versions. I forgot to check matplotlib. Your code works for me with py 2.7, np 1.8.0, matplotlib 1.2.0, pd 1.12.0; ah, and it seems works for np 1.7.0 – alko Dec 13 '13 at 14:35
  • I can't reproduce this, but has anyone filed an issue on github about this? – jseabold Jan 28 '14 at 19:00
  • @jseabold, I assumed the post form Jonathan March (see the accepted answer) was related to a reported bug. But after checking again I don't see any reference to a bug in his post, so maybe nobody has actually reported this. I'll have a look on pandas' github and create the the report if the issue is missing. – Nicolas G Feb 25 '14 at 15:58
  • @Iye-Nico I already filed one. The issue seemed to have already been fixed though. https://github.com/pydata/pandas/issues/6183 – jseabold Feb 26 '14 at 17:16

2 Answers2

2

It seems this is matplotlib 1.3.1 with pandas 0.12 integration bug:

The workaround is to downgrade to matplotlib 1.3.0. (Note, however, that this version of matplotlib contains a bug on systems which have fonts with non-ASCII font names, so you may need to pick your problem!). This downgrade will trigger a downgrade to numpy 1.7.1, so you should then (again) upgrade to numpy 1.8.0. This error should be fixed in the upcoming Pandas 0.13. However Pandas 0.13 may break some existing code (because pandas.Series is no longer a subclass of numpy.ndarray), so again, some hard choices may be required, at least in the short term.

Just checked, code works fine with matplotlib 1.3.0:

>>> import matplotlib
>>> matplotlib.__version__
'1.3.0'
>>> df.plot()
<matplotlib.axes.AxesSubplot object at 0x04E8B4F0>
>>> plt.show(_)

enter image description here

alko
  • 46,136
  • 12
  • 94
  • 102
  • I confirm this works with Python 2.7.6, Matplotlib 1.3.0, NumPy 1.8.0 and Pandas 0.12.0. It also works with Python 3.3.2, Matplotlib 1.3.0, NumPy 1.7.1 and Pandas 0.12.0. Thank you @alko for your snappy answers, pointing out Jonathan March's post and testing on your side. – Nicolas G Dec 13 '13 at 15:44
2

I workaround the problem with following:

fig, ax = plt.subplots()
ax.plot(df)
william
  • 31
  • 3
  • This solved it for me, too. Thanks @william! Much better than trying to upgrade/downgrade versions of pandas and numpy. – SPKoder Sep 29 '15 at 14:13
  • To preserve x-axis values...```fig, ax = plt.subplots(); ax.plot(df.index, df)``` – SPKoder Sep 29 '15 at 14:27