I have the following date string: '3 févr. 2015 14:26:00 CET'. I tried to parse it (without the timezone) like this:

datetime.datetime.strptime('03 févr. 2015 14:26:00', '%d %b %Y %H:%M:%S')

This fails with the error:

ValueError: time data '03 f\xc3\xa9vr. 2015 14:26:00' does not match format '%d %b %Y %H:%M:%S'

I tried to loop over all locales with locale.locale_alias:

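import datetime
import locale

# Python 2: try every known locale alias until one parses the string
for l in locale.locale_alias:
    try:
        locale.setlocale(locale.LC_TIME, l)
        print l, datetime.datetime.strptime('03 févr. 2015 14:26:00', '%d %b %Y %H:%M:%S')
        break
    except Exception as e:
        print e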

but I could not find a locale that works.

jfs
seb835

2 Answers

To parse a localized date/time string using an ICU date/time format:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from datetime import datetime
import icu  # PyICU
import pytz # $ pip install pytz

tz = icu.ICUtzinfo.getDefault() # any ICU timezone will do here
df = icu.DateFormat.createDateTimeInstance(icu.DateFormat.MEDIUM,
                                           icu.DateFormat.MEDIUM,
                                           icu.Locale.getFrench())
df.setTimeZone(tz.timezone)

ts = df.parse(u'3 févr. 2015 14:26:00 CET') #NOTE: CET is ignored
naive_dt = datetime.fromtimestamp(ts, tz).replace(tzinfo=None)
dt = pytz.timezone('Europe/Paris').localize(naive_dt, is_dst=None)
print(dt) # -> 2015-02-03 14:26:00+01:00

df.applyPattern() can be used to set a different date/time pattern (df.toPattern() shows the current one), or you can use icu.SimpleDateFormat to create df directly from a pattern and a locale.

It is necessary to use an explicit ICU timezone (so that df.parse() and .fromtimestamp() use the same UTC offset) because icu and datetime may use different timezone definitions.

pytz is used here to get the correct UTC offset for past/future dates (a timezone's UTC offset may differ in the past or future, including for reasons unrelated to DST transitions).

jfs
  • Martijn Pieters, J.F.Sebastian : Thanks a lot for your precious help. – seb835 Feb 11 '15 at 07:24
  • Hi! Two questions: 1) how do I install it on Windows? 2) does something newer or better exist now? – Vasyl Kolomiets May 22 '22 at 06:53
  • @VasylKolomiets: no idea on both, but my guess would be: 1- try `python -m pip install PyICU` and if it fails, google the corresponding error 2- it might, depending on your use case/constraints. – jfs May 22 '22 at 08:34
  • @jfs Here [link](https://ru.stackoverflow.com/questions/1412499/%d0%9e%d0%b1%d1%80%d0%b0%d0%b1%d0%be%d1%82%d0%ba%d0%b0-%d0%b4%d0%b0%d1%82-%d1%80%d0%b0%d0%b7%d0%bd%d1%8b%d1%85-%d0%bb%d0%be%d0%ba%d0%b0%d0%bb%d0%b8%d0%b7%d0%b0%d1%86%d0%b8%d0%b9-%d0%b2-pandas-pyicu-%d0%b8%d0%bb%d0%b8-%d1%80%d1%83%d1%87%d0%ba%d0%b0%d0%bc%d0%b8) I've asked it. – Vasyl Kolomiets May 22 '22 at 08:37

Your input string abbreviates the month with 4 letters followed by a dot:

'03 févr. 2015 14:26:00'
#      ^^

but if I set the locale to fr_FR and format the same date:

>>> import locale, datetime
>>> locale.setlocale(locale.LC_TIME, ('fr', 'UTF-8'))
'fr_FR.UTF-8'
>>> datetime.datetime(2015, 2, 3, 14, 26).strftime('%d %b %Y %H:%M:%S')
'03 f\xc3\xa9v 2015 14:26:00'
>>> print datetime.datetime(2015, 2, 3, 14, 26).strftime('%d %b %Y %H:%M:%S')
03 fév 2015 14:26:00

You'll notice that only 3 characters are used and no dot is included. strptime() accepts only the abbreviations that the active locale itself produces, so here parsing works only with the same 3-character form:

>>> datetime.datetime.strptime('03 fév 2015 14:26:00', '%d %b %Y %H:%M:%S')
datetime.datetime(2015, 2, 3, 14, 26)
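
Because setlocale() changes process-wide state, it can help to scope the change to a single parse. The following is a minimal Python 3 sketch, not part of the session above; the helper names lc_time and parse_localized are hypothetical, and it assumes that the locale name you pass is installed and that its month abbreviations match your input.

```python
import contextlib
import locale
from datetime import datetime


@contextlib.contextmanager
def lc_time(name):
    """Temporarily switch LC_TIME to `name`, restoring the old value on exit."""
    saved = locale.setlocale(locale.LC_TIME)  # no second argument: query only
    try:
        # Raises locale.Error if `name` is not available on this system.
        locale.setlocale(locale.LC_TIME, name)
        yield
    finally:
        locale.setlocale(locale.LC_TIME, saved)


def parse_localized(text, fmt, name):
    """Parse `text` with strptime() while LC_TIME is set to `name`."""
    with lc_time(name):
        return datetime.strptime(text, fmt)
```

With the fr_FR locale shown above, a call such as parse_localized('03 fév 2015 14:26:00', '%d %b %Y %H:%M:%S', 'fr_FR.UTF-8') would do the same parse without leaving the locale changed afterwards. Note that setlocale() is not thread-safe, so this is only suitable for single-threaded code.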

You could try the parsedatetime library instead; others have had success parsing French dates with that tool.
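
If neither the system locale nor an extra library is an option, the month names can also be mapped by hand. This is a minimal Python 3 sketch under stated assumptions: FRENCH_MONTHS and parse_french are hypothetical names, the list of accepted spellings is my own and may not cover every variant you encounter, and any trailing timezone abbreviation such as CET is ignored rather than interpreted.

```python
import re
import unicodedata
from datetime import datetime

# Assumed table of accepted spellings (lowercase, accents and dots removed).
FRENCH_MONTHS = {
    'jan': 1, 'janv': 1, 'janvier': 1,
    'fev': 2, 'fevr': 2, 'fevrier': 2,
    'mar': 3, 'mars': 3,
    'avr': 4, 'avril': 4,
    'mai': 5,
    'juin': 6,
    'juil': 7, 'juillet': 7,
    'aou': 8, 'aout': 8,
    'sep': 9, 'sept': 9, 'septembre': 9,
    'oct': 10, 'octobre': 10,
    'nov': 11, 'novembre': 11,
    'dec': 12, 'decembre': 12,
}

# day, month token, 4-digit year, HH:MM:SS; anything after the time is ignored
PATTERN = re.compile(r'(\d{1,2})\s+(\S+)\s+(\d{4})\s+(\d{2}):(\d{2}):(\d{2})')


def _normalize(token):
    """Lowercase, drop trailing dots, and strip accents ('Févr.' -> 'fevr')."""
    decomposed = unicodedata.normalize('NFKD', token.lower().rstrip('.'))
    return ''.join(c for c in decomposed if not unicodedata.combining(c))


def parse_french(text):
    """Return a naive datetime parsed from e.g. '3 févr. 2015 14:26:00 CET'."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError('unrecognized date string: %r' % text)
    day, month_token, year, hour, minute, second = match.groups()
    try:
        month = FRENCH_MONTHS[_normalize(month_token)]
    except KeyError:
        raise ValueError('unknown month: %r' % month_token) from None
    return datetime(int(year), month, int(day),
                    int(hour), int(minute), int(second))
```

The result is a naive datetime, so a timezone still has to be attached separately if you need one.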

Martijn Pieters
  • Java Friends are laughing, as it works fine for them with following code : `DateFormatSymbols dfs = new DateFormatSymbols(Locale.FRENCH); SimpleDateFormat sdf = new SimpleDateFormat("dd MMM yy", dfs); Date date = sdf.parse("09 févr. 08"); System.out.println("--> " + date);` – seb835 Feb 10 '15 at 11:07
  • Sure, but this is a different language, and `datetime` parsing is moving along but a little slowly. I'm sure the Python project would love to receive additional help fixing those issues. – Martijn Pieters Feb 10 '15 at 11:18
  • @seb835: you could use PyICU if you like (the same backend that is probably used by Java). `parsedatetime` can also use it – jfs Feb 10 '15 at 20:42
  • @J.F.Sebastian: good point; I missed that `parsedatetime` can make use of PyICU if installed. – Martijn Pieters Feb 10 '15 at 20:45