1

So... once again I'm having trouble with datetime formats. This because IDK why amazon's database continously changes the date format and each time it's a trouble. Today Im asking for your help to convert this string to datetime object. I just can't find the right format.

"ene" it's like Jan but in spanish. Thing is that even though locale.getlocale() is set to spanish, it doesnt convert it (but it does convert the english date strings). I've also tried replacing "ene" to "Jan" and doesn't work either. I'm still getting the "time data '2 ene. 2020 9:54:46' does not match format '%d %b. %Y %H:%M:%S'" error.

data={'fecha/hora': {0: '2 ene. 2020 9:54:46',
  1: '2 ene. 2020 10:18:51',
  2: '2 ene. 2020 10:19:18',
  3: '2 ene. 2020 11:58:04',
  4: '2 ene. 2020 15:56:51'},
 'Id. de liquidación': {0: 12493053261,
  1: 12493053261,
  2: 12493053261,
  3: 12493053261,
  4: 12493053261}}
df=pd.DataFrame(data)
df["fecha"]=prueba.apply(lambda x: datetime.datetime.strptime(x["fecha/hora"],"%d %b. %Y %H:%M:%S"),axis=1)

If you guys know any tutorial to help me out with this i'll be glad to see it. Honestly i'm following the documentation but i just can't figure out how to work with datetime objects efficiently. Also because each Database I reaceive has differents formats in different lenguages.

also, the original date format is with tz "gmt-8". I erased it but if theres a way to work with it would be even better.

data={'fecha/hora': {0: '2 ene. 2020 9:54:46 GMT-8',
  1: '2 ene. 2020 10:18:51 GMT-8',
  2: '2 ene. 2020 10:19:18 GMT-8',
  3: '2 ene. 2020 11:58:04 GMT-8',
  4: '2 ene. 2020 15:56:51 GMT-8'},
 'Id. de liquidación': {0: 12493053261,
  1: 12493053261,
  2: 12493053261,
  3: 12493053261,
  4: 12493053261}}
df=pd.DataFrame(data)

thanks in advance!

2 Answers2

0

I will suggest you to use a library like arrow. It allows use to specify a local (by default it uses "en_US") to use that can also be different from your local in your PC and change in the code if you need to parse multiple local datetime formats. You can finde more info about arrow on their documentation. It is simple to use and it gives you a lot of different features and a rich guide to learn how to use it.

Here an example on how to parse your specified spanish datetime:

import arrow  # installed via pip

df["fecha"]=df.apply(lambda x: arrow.get(x["fecha/hora"], "D MMM. YYYY H:mm:ss", locale="es_ES").datetime, axis=1)
0

It turns out the '.' is part of the abbreviated Spanish month. So you don't have to declare it again. You can see the values in:

datetime.date(2020, 1, 1).strftime('%b')

which gives you ene.. So if you change your date format to:

datetime.datetime.strptime('2 ene. 2020 10:18:51', "%d %b %Y %H:%M:%S")

rather than:

datetime.datetime.strptime('2 ene. 2020 10:18:51', "%d %b. %Y %H:%M:%S")

that works for me.

Also, pandas does have a to_datetime() function that you could use for datetime conversions of a series. It uses the same format strings as the datetime module. So this would achieve what you're after:

df["fecha"] = pd.to_datetime(df["fecha/hora"], format="%d %b %Y %H:%M:%S")
robbo
  • 525
  • 3
  • 11
  • omg, i literally copied and pasted your code and i still get the error... python still doesn't matches the datetime pattern... df["fecha"] = pd.to_datetime(df["fecha/hora"], format="%d %b %Y %H:%M:%S") – Gustavo Zárate Dec 24 '20 at 00:02
  • I'm a bit puzzled by that, because it does work for me. So the locale `locale.getlocale(locale.LC_TIME)` for me is `('es_ES', 'ISO8859-1')` – robbo Dec 24 '20 at 00:07
  • well, something's wrong with me cause if i run the same code, I get (None,None)... if i run locale.getlocale() i get ('Spanish_Mexico', '1252')... but it should work isn´t it? iti's the same language thatn spain...(es_ES) – Gustavo Zárate Dec 24 '20 at 00:09
  • Can you set the local for `LC_TIME` to `es_ES`? Use the following: `locale.setlocale(locale.LC_TIME, "es_ES")`. And then try again? – robbo Dec 24 '20 at 00:11
  • It worked... but now i'm afraid it's not going to work with the rest of the dataframes that were working ok... is there a way to efficiently work with several date formats??? I mean... is setting and changing the local time for each data base the correct way to approach working with datetimes? I mean... amazon changes datetime format (i swear i have no idea what for) each week.. each week i have to be posting my doubts here... – Gustavo Zárate Dec 24 '20 at 00:16
  • Yes, not sure about that one... given that the date conversion fails when you're using the wrong locale information, you could iterate over a list of possible locale values that you have encountered in the past and catch the error? Not sure if that is the best approach. – robbo Dec 24 '20 at 00:24
  • yeah... Thanks a lot! I think i'm gonna have to iterate over a list as you sugest! thanks!! – Gustavo Zárate Dec 24 '20 at 00:59