
I am looking to convert the string "september 20 2010" to a python datetime.date object using arrow.

I have written functions to replace portions of the text and ended up with 9/20/2016 but I want YYYY-MM-DD format and can't seem to get arrow to recognise my string and convert it to a python datetime.date object (without any time).

What has worked and what hasn't.

arrow.get('september 20 2010', '%B %d %Y')

This doesn't work for me; I get an error: ParserError: Failed to match '%B %(?P<d>[1-7]) %Y' when parsing the string "september 20 2010"

However when I manipulate the string and then use arrow.Arrow(y,m,d).date(), the result is a datetime.date(2016, 9, 20) object.

I just can't convert it to any other format using .format('dddd-DD-MMMM-YYYY'), which I would expect to return something like Monday-20-September-2010.

Gordon Gustafson
yoshiserry

1 Answer


With arrow, the format string has to match the exact syntax of your input, using arrow's own tokens rather than strptime-style % directives. The list of supported tokens is in the arrow documentation.

arrow.get('September 20 2010', 'MMMM D YYYY')

Note: in this case there is only one D, because D covers day numbers written with one or two digits (1, 2, 3 ... 30, 31), while DD covers only zero-padded two-digit days (01, 02, 03 ... 30, 31).
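For comparison, the '%B %d %Y' pattern from the question is strptime syntax, which the standard library understands directly. Below is a minimal sketch, assuming an English (or C) locale for the month names; parse_long_date is a hypothetical helper name, not something from arrow or from the question.

```python
from datetime import datetime


def parse_long_date(text):
    """Parse strings such as 'september 20 2010' into a datetime.date.

    Hypothetical helper: assumes English month names (C/English locale).
    """
    # %B = full month name, %d = day of month (one or two digits), %Y = 4-digit year
    return datetime.strptime(text, '%B %d %Y').date()


d = parse_long_date('september 20 2010')
print(d.isoformat())  # 2010-09-20
print(parse_long_date('september 9 2010').isoformat())  # 2010-09-09
```

In arrow's token syntax the letter d means day of week, which is why the error message in the question shows (?P<d>[1-7]) where the day of the month was expected.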

Once you have your Arrow object, you can display it however you like using format():

ar = arrow.get('September 20 2010', 'MMMM D YYYY')
print(ar.format('YYYY-MM-DD')) # 2010-09-20

EDIT

To answer your comment: ar is an Arrow object, and you can list every method it provides with dir(ar).

Arrow objects have a date() method, which returns a datetime.date object.
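As a rough sketch of how that could be wrapped in a function, here is one way to turn the kinds of strings mentioned in the comments into datetime.date objects. The name to_date is hypothetical, and the title() call is only a precaution to capitalise the month name before parsing.

```python
import datetime

import arrow


def to_date(text):
    """Parse 'september 20 2010'-style text into a datetime.date.

    Hypothetical helper: assumes a long English month name, a 1-2 digit day
    and a 4-digit year, separated by single spaces.
    """
    # title() turns 'september' into 'September'; digits are left unchanged
    parsed = arrow.get(text.title(), 'MMMM D YYYY')
    return parsed.date()


samples = ['september 20 2010', 'september 9 2010', 'august 4 2011']
dates = [to_date(s) for s in samples]
print([d.isoformat() for d in dates])  # ['2010-09-20', '2010-09-09', '2011-08-04']
```

The same 'MMMM D YYYY' pattern handles both the one-digit and two-digit days, so no branching on the day length is needed.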

Now, if you want to use pandas, that's easy:

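import arrow
import pandas as pd

# Parse the string, then hand the plain datetime.date to pandas
ar = arrow.get('September 20 2010', 'MMMM D YYYY')
df = pd.to_datetime(ar.date())  # a pandas Timestamp, despite the name df
print(df) # 2010-09-20 00:00:00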
Kruupös
  • Thanks @Max. The ar object is of type unicode; is there a way to pass this to pandas so that pandas interprets it as a date? I have this in a function and want to return a date object to a column in a pandas dataframe. – yoshiserry Oct 26 '16 at 11:45
  • Is there anything I can do to account for the difference in the date string between 9 and 19, which is two digits? (You need to explicitly tell arrow what format the string is in before you can convert it to any other format, as you mention.) At the moment 'MMMM D YYYY' breaks when I try to feed more than one URL through it. They are all september 20 2010 or september 9 2010 or august 4 2011 etc. Always long month name, (1-2) digit day, and 4 digit year. – yoshiserry Oct 26 '16 at 12:03
  • See the [associated tokens](http://crsmithdev.com/arrow/#tokens) in my answer; they cover several use cases. I don't think you have to change anything to cover 9 or 20, but please have a look at the docs. They are very easy to understand! – Kruupös Oct 26 '16 at 12:08
  • I updated my answer to explain the relevance of the tokens and to avoid confusing other visitors. – Kruupös Oct 26 '16 at 12:17