
I'm importing a CSV with pandas in IPython. When I display the DataFrame, it looks like this:


     2013    2012    2011    2010    2009    2008    2007    2006    2005
Jan  11,875  10,989  10,852  11,762  13,850  14,269  14,075  9,222   -
Feb  10,206  10,501  15,713  11,785  13,886  14,289  12,635  13,149  -
Mar  11,235  11,991  14,193  14,239  15,528  14,589  14,519  10,179  -
Apr  NaN     13,617  12,945  14,682  16,953  18,054  14,954  10,549  -
May  NaN     14,645  15,524  15,861  12,357  18,833  16,511  12,889  -
Jun  NaN     14,987  17,740  26,616  13,947  19,580  18,161  13,969  -
Jul  NaN     13,514  19,082  19,880  16,199  20,522  16,537  14,038  -
Aug  NaN     12,830  14,785  16,125  23,438  16,018  16,645  12,430  1,729
Sep  NaN     12,070  13,232  17,081  16,997  16,543  14,372  12,400  5,414
Oct  NaN     11,907  11,027  17,995  12,576  13,535  17,169  14,673  4,920
Nov  NaN     10,623  12,127  12,439  11,926  12,491  13,530  14,313  7,993
Dec  NaN     8,624   8,952   10,498  12,811  14,552  11,573  10,780  6,879
TOTAL    33,316  146,298     166,172     188,963     180,468     193,275     180,681     148,591     26,935

Now I want to plot the data in a graph, but no matter what I try I get "TypeError: Empty 'DataFrame': no numeric data to plot"

Obviously the DataFrame isn't empty, and it is full of numbers. What am I missing? I was under the impression that pandas identified numbers on its own.

Ryhnn
  • I don't think the data in current form (as posted on your answer) is "numeric". I see "," in the data inside the numbers. Refer to: http://stackoverflow.com/questions/11858472/pandas-combine-string-and-int-columns or http://stackoverflow.com/questions/16643695/pandas-convert-strings-to-float-for-multiple-columns-in-dataframe – Nipun Batra Sep 09 '13 at 12:33
  • This is probably because you didn't read the data properly. Can you show the code (the `read_csv` command) you used to read it and (an excerpt of) the raw csv file? In `read_csv` you can specify the decimal separator, as well as the values to consider as NaN. – joris Sep 09 '13 at 14:23
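To illustrate the point made in the comments above, here is a minimal sketch of how the error can arise and how the strings can be converted after loading. The tiny frame and the helper name `to_number` are made up for illustration, and treating the commas as thousands separators is an assumption the thread does not confirm.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question: every cell is text.
df = pd.DataFrame(
    {"2013": ["11,875", "10,206", None], "2005": ["-", "-", "1,729"]},
    index=["Jan", "Feb", "Aug"],
)

# Neither column has a numeric dtype, which is why plot() finds nothing to draw.
before = [pd.api.types.is_numeric_dtype(t) for t in df.dtypes]


def to_number(column):
    """Drop the commas, then turn anything unparseable (such as "-") into NaN."""
    cleaned = column.astype(str).str.replace(",", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce")


numeric = df.apply(to_number)
print(numeric.dtypes)
```

Once the columns are numeric, `numeric.plot()` should no longer raise the "no numeric data to plot" error.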

2 Answers


Thanks for all the suggestions! They pointed me in the right direction. I managed to fix the issue with

df = df.replace(',', '', regex=True)
df = df.replace('-', 'NaN', regex=True).astype('float')
df.plot()
Ryhnn
  • It's nice you could solve it! But as I said in the other comment, this should not be necessary. These are things you can handle easily (and better) in the import step, with something like ``pd.read_csv(file, decimal=',', na_values=['-'])`` and an appropriate ``sep=''`` value depending on your data. – joris Sep 09 '13 at 20:26
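As a rough sketch of the import-step approach suggested in the comment above, the parsing options can be passed straight to `read_csv`. The three-row sample below is an excerpt of the data from the question, not the real file, and `sep=r"\s+"` is assumed only because the excerpt is whitespace-separated. Whether the commas are decimal marks (`decimal=','`, as in the comment) or thousands separators (`thousands=','`) depends on the source file, so both readings are shown.

```python
from io import StringIO

import pandas as pd

# Excerpt of the table from the question (assumed whitespace-separated).
raw = """     2013    2012    2005
Jan  11,875  10,989  -
Feb  10,206  10,501  -
Aug  NaN     12,830  1,729
"""

# Reading 1: commas are thousands separators, "-" marks a missing value.
as_thousands = pd.read_csv(
    StringIO(raw), sep=r"\s+", thousands=",", na_values=["-"]
)

# Reading 2: commas are decimal marks, as in the comment's suggestion.
as_decimals = pd.read_csv(
    StringIO(raw), sep=r"\s+", decimal=",", na_values=["-"]
)

print(as_thousands.dtypes)
print(as_decimals.dtypes)
```

Either way the columns come back numeric, so no clean-up with `replace` is needed before calling `plot()`.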

Taking your data and replacing "," with "." and "-" with "NaN", it works:

>>> import pandas as pd
>>> from io import StringIO
>>> s="""     2013    2012    2011    2010    2009    2008    2007    2006    2005
Jan  11,875  10,989  10,852  11,762  13,850  14,269  14,075  9,222   -
Feb  10,206  10,501  15,713  11,785  13,886  14,289  12,635  13,149  -
Mar  11,235  11,991  14,193  14,239  15,528  14,589  14,519  10,179  -
Apr  NaN     13,617  12,945  14,682  16,953  18,054  14,954  10,549  -
May  NaN     14,645  15,524  15,861  12,357  18,833  16,511  12,889  -
Jun  NaN     14,987  17,740  26,616  13,947  19,580  18,161  13,969  -
Jul  NaN     13,514  19,082  19,880  16,199  20,522  16,537  14,038  -
Aug  NaN     12,830  14,785  16,125  23,438  16,018  16,645  12,430  1,729
Sep  NaN     12,070  13,232  17,081  16,997  16,543  14,372  12,400  5,414
Oct  NaN     11,907  11,027  17,995  12,576  13,535  17,169  14,673  4,920
Nov  NaN     10,623  12,127  12,439  11,926  12,491  13,530  14,313  7,993
Dec  NaN     8,624   8,952   10,498  12,811  14,552  11,573  10,780  6,879
TOTAL    33,316  146,298     166,172     188,963     180,468     193,275     180,681     148,591     26,935"""

>>> s=s.replace(',','.')
>>> s=s.replace('-','NaN')
>>> df=pd.read_csv(StringIO(s), sep=r'\s+')
>>> df.plot()
<matplotlib.axes.AxesSubplot at 0x88a4790>

Interestingly, the read_csv docstring lists an argument for specifying the decimal separator, but it doesn't seem to work on my version (0.11.0).

Nic