
I'm trying to plot a CSV file of weather data. It imports just fine in my code, but I can't get it to plot. Here's a sample of the CSV data:

12:00am,171,6,7,52,76,77.1,63.7,28.74,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:01am,192,4,6,52,76,77.1,63.7,28.74,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:02am,197,3,6,52,76,77.1,63.7,28.74,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:03am,175,3,6,52,76,77.1,63.7,28.73,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:04am,194,4,6,52,76,77.1,63.7,28.73,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:05am,148,5,6,52,76,77.1,63.7,28.73,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96

Anyway, I'd like the time to be on the X axis, but I can't get it to plot using matplotlib. I tried a method using xticks, and it plotted my y values, but that was it. It just gave me a thick solid line on my X axis.

import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
from matplotlib.dates import date2num
import datetime as DT
import re

data = np.genfromtxt('FILE.csv', delimiter=',', dtype=None, skip_header=3)
length = len(data)

x = data['f0']
y = data['f7']

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.set_title("Temperature")    
ax1.set_xlabel('Time')
ax1.set_ylabel('Degrees')


#plt.plot_date(x, y)
plt.show()
leg = ax1.legend()

plt.show()

I'm missing a few crucial parts because I honestly don't know where to go from here. I checked the data type of my numpy array, and it kept saying numpy.ndarray, and I can't find a way to convert it to a string or an int value to plot. It's a 24 hour CSV file, and I would like tick marks every 30 minutes or so. Any ideas?

Lev Levitsky
  • [This question](http://stackoverflow.com/questions/1574088/plotting-time-in-python-with-matplotlib) is possibly related. – Cody Piersall Jul 04 '13 at 20:26
  • Tried that, but I got a bunch of errors and it never plotted or output data. I tried this: http://stackoverflow.com/questions/6974847/plot-with-non-numerical-data-on-x-axis-for-ex-dates and I got just a solid black line on the x axis, probably because there are a good 600 tick marks. How would I change that? – user2551677 Jul 04 '13 at 20:49
  • I've had success with giving plt.plot() a list of datetime objects for the x coordinates and then a list of floats for the y values. I'm not sure what a convenient way to get that out of a numpy array would be, or how to really control the tick marks, but that might at least give you a chart. – seaotternerd Jul 04 '13 at 21:14
  • pandas is pretty good at importing and parsing CSV data, and has some basic plotting functionality. After examining the data, you can go back to pure matplotlib functionality. – Joop Jul 04 '13 at 21:16
  • I've posted an answer. I tested it, and it works on my computer (always gotta have that disclaimer). – Cody Piersall Jul 04 '13 at 21:42
  • If my answer or @bmu's answer worked for you, please select one as the accepted answer. Otherwise, let me know how my answer _didn't_ work and I'll be glad to help. If you went with your own solution, please post it here and accept it. Note also that I changed my answer to work with Python 2.7 and all times. – Cody Piersall Jul 08 '13 at 13:33

2 Answers


Well, this is not very elegant, but it works. The key is to change the times stored in x, which are just strings, to datetime objects so that matplotlib can plot them. I have made a function that does the conversion and called it get_datetime_from_string.

** Edited code to be compatible with Python 2.7 and work with times with single digit hours **

import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
from matplotlib.dates import date2num
import datetime as DT
import re

def get_datetime_from_string(time_string):
    ''' Returns a datetime.datetime object

        Args
        time_string: a string of the form 'xx:xxam' or 'x:xxam'
        '''

    # there's got to be a better way to do this.
    # genfromtxt can return bytes; decode to text so string slicing and
    # comparisons work as expected (works on Python 2.7 and Python 3).
    if isinstance(time_string, bytes):
        time_string = time_string.decode('utf-8')

    # period is either am or pm
    colon_position = time_string.find(':')
    period = time_string[-2:]
    # On a 12-hour clock, 12am is hour 0 and 12pm is hour 12.
    hour = int(time_string[:colon_position]) % 12
    if period.lower() == 'pm':
        hour += 12

    minute = int(time_string[colon_position + 1:colon_position + 3])

    return DT.datetime(1,1,1,hour, minute)

data = np.genfromtxt('test.csv', delimiter=',', dtype=None, skip_header=3)
length=len(data)

x=data['f0']
y=data['f7']

datetimes = [get_datetime_from_string(t) for t in x]

fig = plt.figure()

ax1 = fig.add_subplot(111)

ax1.set_title("Temperature")    
ax1.set_xlabel('Time')
ax1.set_ylabel('Degrees')

# Label the line so that legend() has an entry to show.
ax1.plot(datetimes, y, label='Temperature')
leg = ax1.legend()

plt.show()

I kept getting tripped up because I was trying to do string slicing on time_string before decoding it from UTF-8. Before that, it was giving me the ASCII values or something. I suspect this is because, under Python 3, genfromtxt was handing back bytes rather than text, and indexing a bytes object yields integers.
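As a possible answer to the "there's got to be a better way" comment in the function above, here is a minimal sketch that lets the standard library's `datetime.strptime` do the parsing. The name `parse_clock_time` and the sample strings are made up for illustration, and the `%p` directive is locale-dependent, so this assumes an English-style am/pm locale.

```python
import datetime as DT


def parse_clock_time(time_string):
    """Parse a 12-hour clock string such as '1:30am' or '12:05pm'.

    Returns a datetime.datetime on 1900-01-01, the default date that
    strptime fills in when the format contains no date fields.
    """
    # genfromtxt may hand back bytes; decode so strptime receives text.
    if isinstance(time_string, bytes):
        time_string = time_string.decode('utf-8')
    # %I = hour on a 12-hour clock, %M = minute, %p = am/pm marker.
    return DT.datetime.strptime(time_string.strip(), '%I:%M%p')


# Hypothetical sample values, including the single-digit-hour and
# 12 o'clock cases that manual slicing has to special-case.
samples = ['12:00am', '1:30am', '12:05pm', '11:59pm']
for sample in samples:
    print(sample, '->', parse_clock_time(sample).strftime('%H:%M'))
```

Because `%I` accepts one- or two-digit hours and `%p` takes care of the 12am/12pm wrap-around, no colon-position bookkeeping is needed.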

Cody Piersall
  • When I add that to my code, I get the error: File "metogram.py", line 22, in get_datetime_from_string hour = int(time_string[:2]) ValueError: invalid literal for int() with base 10: '1:' – user2551677 Jul 04 '13 at 21:44
  • Realized I made a small mistake in adapting that to mine, and now the new error is Traceback (most recent call last): File "metogram.py", line 36, in datetimes = [get_datetime_from_string(t) for t in x] File "metogram.py", line 20, in get_datetime_from_string time_string = str(time_string, 'utf-8') TypeError: str() takes at most 1 argument (2 given) – user2551677 Jul 04 '13 at 21:51
  • Try it without the conversion. In other words, try it without the line `time_string = str(time_string, 'utf-8')`. – Cody Piersall Jul 04 '13 at 22:34
  • By the way, the reason that breaks for you and not for me is that I'm using Python version 3.3, and you're not :-). – Cody Piersall Jul 04 '13 at 22:36
  • I got Cody Piersall's code to work by changing "time_string = str(time_string, 'utf-8')" to "time_string = unicode(time_string, 'utf-8')" and "plt.plot_date(datetimes, y)" to "plt.plot(datetimes, y)". I'm using Python 2.7. – rtrwalker Jul 04 '13 at 22:54
  • Your code works with the snippet of csv I provided, but not with the whole CSV since it breaks when it gets to times like 1:30am with one number in the hour column. – user2551677 Jul 05 '13 at 01:37
  • I changed the function to work with both types of times by checking for the position of the colon--note the addition of the `colon_position` variable in the `get_datetime_from_string` function. It should work now. – Cody Piersall Jul 05 '13 at 04:48
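The question also asked for tick marks roughly every 30 minutes. Once the x values are datetime objects, matplotlib's date locators can place them. The following is a minimal sketch, not the answerer's code: the one-minute readings are made up, and the `Agg` backend is chosen only so that it runs without a display.

```python
import datetime as DT

import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line for on-screen use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Made-up data: one reading per minute for three hours.
start = DT.datetime(2013, 1, 1, 0, 0)
times = [start + DT.timedelta(minutes=i) for i in range(180)]
temps = [63.7 + 0.01 * i for i in range(180)]

fig, ax = plt.subplots()
ax.plot(times, temps, label='Temperature')

# Put a major tick on every hour and half hour, labelled as HH:MM.
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 30]))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
fig.autofmt_xdate()  # slant the labels so they do not overlap

ax.set_title('Temperature')
ax.set_xlabel('Time')
ax.set_ylabel('Degrees')
ax.legend()

# Render the figure; in real use, call plt.show() or fig.savefig(...) instead.
fig.canvas.draw()
```

For a full 24-hour file this gives 48 or so labels, so a wider figure or `byminute=[0]` for hourly ticks may read better.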

pandas is a very useful library for time series analysis and has some plotting features based on matplotlib.

pandas uses dateutil internally to parse dates. The problem, however, is that the date isn't included in your file. In the code below I assume that you know the date before parsing the file (from the file name?).

In [124]: import dateutil.parser
In [125]: import pandas as pd
In [126]: pd.options.display.mpl_style = 'default'
In [127]: import matplotlib.pyplot as plt

In [128]: class DateParser():                                          
   .....:     def __init__(self, datestring):
   .....:         self.datestring = datestring
   .....:     def get_datetime(self, time):    
   .....:         return dateutil.parser.parse(' '.join([self.datestring, time]))
   .....:     

In [129]: dp = DateParser('2013-01-01')

In [130]: df = pd.read_csv('weather_data.csv', sep=',', index_col=0, header=None,
                  parse_dates={'datetime':[0]}, date_parser=dp.get_datetime)

In [131]: df.loc[:, :12] # show the first columns
Out[131]: 
                      1   2   3   4   5     6     7      8   9   10  11  12  
datetime                                                                      
2013-01-01 00:00:00  171   6   7  52  76  77.1  63.7  28.74   0   0   0   0   
2013-01-01 00:01:00  192   4   6  52  76  77.1  63.7  28.74   0   0   0   0   
2013-01-01 00:02:00  197   3   6  52  76  77.1  63.7  28.74   0   0   0   0   
2013-01-01 00:03:00  175   3   6  52  76  77.1  63.7  28.73   0   0   0   0   
2013-01-01 00:04:00  194   4   6  52  76  77.1  63.7  28.73   0   0   0   0   
2013-01-01 00:05:00  148   5   6  52  76  77.1  63.7  28.73   0   0   0   0   

In [132]: ax = df.loc[:,1:3].plot(secondary_y=1)

In [133]: ax.margins(0.04)

In [134]: plt.tight_layout()

In [135]: plt.savefig('weather_data.png')

weather_data.png
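In recent pandas releases the `date_parser` argument is deprecated, so the same idea (prepend a known date, then parse) can also be sketched with `pd.to_datetime` and an explicit format. This is an illustration under assumptions, not the code that produced the output above: the rows are the question's first three sample lines truncated to eight fields, and `date_string` is assumed to be known in advance.

```python
import io

import pandas as pd

# First three sample rows from the question, truncated to eight fields.
csv_text = """12:00am,171,6,7,52,76,77.1,63.7
12:01am,192,4,6,52,76,77.1,63.7
12:02am,197,3,6,52,76,77.1,63.7
"""

date_string = '2013-01-01'  # assumed to be known before parsing

df = pd.read_csv(io.StringIO(csv_text), header=None)

# Prepend the date to the time column (column 0) and parse the result
# with an explicit 12-hour-clock format.
timestamps = pd.to_datetime(date_string + ' ' + df[0],
                            format='%Y-%m-%d %I:%M%p')

# Replace the raw time column with a proper datetime index.
df = df.drop(columns=0)
df.index = pd.DatetimeIndex(timestamps, name='datetime')

print(df.loc[:, 1:3])  # label-based slice: columns 1 through 3
```

Giving the format explicitly avoids per-row guessing by the parser, which tends to matter on a file with one row per minute.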

bmu