I'm using matplotlib.dates to convert my string dates into date objects, thinking they would be easier to manipulate later.

import matplotlib.dates as md    
def ConvertDate(datestr):
    '''
    Convert string date into matplotlib date object
    '''
    datefloat = md.datestr2num(datestr)
    return md.num2date(datefloat)

What I was trying to do was filter my structured array to get the indices of the rows that belong to a certain month and/or year:

import numpy as np
np.where( data['date'] == 2008 )

I can probably use a lambda function to convert each object into a string value, like so:

lambda x: x.strftime('%Y')

and then compare each item, but I don't know where to put this lambda function in np.where, or whether that's even possible.

Any ideas? Or is there some better way to do this?

kentwait

2 Answers

1

Note: you might as well use datetime's datetime.strptime function:

import datetime
import numpy as np
dt1 = datetime.datetime.strptime('1/2/2012', '%d/%m/%Y')
dt2 = datetime.datetime.strptime('1/2/2011', '%d/%m/%Y')

In [5]: dt1
Out[5]: datetime.datetime(2012, 2, 1, 0, 0)

You can then use numpy.nonzero to filter your array down to the indices of those datetimes where, for example, the year is 2012:

a = np.array([dt1, dt2])
b = np.array([x.year for x in a])  # map() returns an iterator in Python 3, so build a list instead

In [8]: b
Out[8]: array([2012, 2011])

In [9]: np.nonzero(b==2012)
Out[9]: (array([0]),)
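The same approach carries over to a field of a structured array, and can be extended to match on the month as well. Here is a minimal sketch, assuming a hypothetical `data` array with an object-typed `date` field (the field names and values are made up for illustration):

```python
import datetime
import numpy as np

# Hypothetical structured array; field names and values are invented for this example.
data = np.array(
    [
        (datetime.datetime(2008, 1, 15), 1.0),
        (datetime.datetime(2008, 3, 2), 2.0),
        (datetime.datetime(2009, 1, 20), 3.0),
    ],
    dtype=[("date", object), ("value", float)],
)

# Pull the year and month out of each datetime object once.
years = np.array([d.year for d in data["date"]])
months = np.array([d.month for d in data["date"]])

# Indices of rows in 2008, and of rows in January 2008.
idx_2008 = np.nonzero(years == 2008)[0]
idx_jan_2008 = np.nonzero((years == 2008) & (months == 1))[0]
```

The integer arrays are built in a Python-level loop, but the comparisons and the index lookup afterwards are vectorized, and the resulting indices refer to rows of the original `data` array.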

Also, I would suggest looking into pandas, which has this functionality built in (on top of numpy), along with many more convenience functions (e.g. to_datetime) and efficient datetime storage.
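For comparison, a rough sketch of what this might look like in pandas, assuming the dates arrive as day/month/year strings as in the example above (the variable names are illustrative only):

```python
import pandas as pd

# Parse the strings into a datetime64 Series; the format matches the strptime example.
dates = pd.to_datetime(pd.Series(["1/2/2012", "1/2/2011"]), format="%d/%m/%Y")

# The .dt accessor exposes year, month, etc. as vectorized attributes.
mask = dates.dt.year == 2012

# Positional indices of the matching rows.
positions = mask.to_numpy().nonzero()[0]
```

Filtering on the month works the same way, through `dates.dt.month`.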

Andy Hayden
  • So I think I did the opposite: I created a new array that converted the object into text `newArray = np.array(map(lambda x:x.strftime('%Y'),data['date']))` and did the matching on that `matches = np.where( newArray == '2008')` Therefore the match indexes are also the indexes of the original array – kentwait Dec 17 '12 at 11:12
  • My code does exactly the same thing? But I think it should be more efficient to use `x.year == 2008` (since it avoids calling `strftime`). – Andy Hayden Dec 17 '12 at 11:25
  • Sorry I'm confused with `np.where(lambda x: x.year == 2012)` part. Shouldn't the `lambda` function map to `dt` when you create a new array? Anyway you are right that pandas may be better for my purpose. – kentwait Dec 17 '12 at 11:38
  • Sorry, didn't properly verify what I was doing! There must be a way without using map... – Andy Hayden Dec 17 '12 at 12:24
1

After a lot of error messages, I think I found an answer to my own question.

[ x for x in range(len(data)) if data['date'][x].year == 2008 ]

I used a list comprehension to return the indices of the rows in the structured array that match the query. I also followed @hayden's suggestion to use .year instead of strftime(). Maybe numpy.where() is still faster, but this suits my needs right now.
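If the dates can be stored as NumPy's own datetime64 type rather than as Python objects, the year and month can be extracted without any Python-level loop, and np.where can be applied directly. This is only a sketch under that assumption; the sample dates are invented:

```python
import numpy as np

# Assumes the dates are stored as datetime64 values (made-up sample data).
dates = np.array(["2008-01-15", "2008-03-02", "2009-01-20"], dtype="datetime64[D]")

# Truncate to year / month resolution and convert to integers.
# datetime64[Y] counts years since 1970; datetime64[M] counts months since 1970-01.
years = dates.astype("datetime64[Y]").astype(int) + 1970
months = dates.astype("datetime64[M]").astype(int) % 12 + 1

# Indices of dates in 2008, and of dates in January 2008.
idx_2008 = np.where(years == 2008)[0]
idx_jan_2008 = np.where((years == 2008) & (months == 1))[0]
```

Whether this is actually faster than the list comprehension depends on the size of the array, so it is worth timing on real data.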

kentwait