1

I have a string field witch is a date scraped from internet and reads like this:

 "Lun Ene 27, 2014 9:52 am", "Mar Feb 11, 2014 5:38 pm",...

Lun= day of week (lunes/monday) Ene = month (enero/january)

I need to enter them in a mysql table, in a datetime field.

'YYYY-MM-DD HH:MM:SS'

I imagine it is a very common issue and was wondering if someone already have a script to do it or could point out where I could look for...

Thanks in advance!

jfs
  • 399,953
  • 195
  • 994
  • 1,670
user2950162
  • 941
  • 4
  • 13
  • 29
  • Look at [`datetime`](https://docs.python.org/2.7/library/datetime.html) for working with date/time data in Python. – Ffisegydd Apr 21 '14 at 12:37
  • What date format is the first string in? The second one seems to have two months, and the first seems to include to non-english words. – fr1tz Apr 21 '14 at 12:37
  • In particular look at [`strptime` and `strftime`](https://docs.python.org/2.7/library/datetime.html#strftime-and-strptime-behavior) – Ffisegydd Apr 21 '14 at 12:38
  • @fr1tz The second does not have two months, it is also in spanish. `Lun` is "Lunes" = "Monday", and `Mar` is "Martes" = "Tuesday". The months seem to be too "Enero" = "January" and "Febrero" = "February". – Cory Kramer Apr 21 '14 at 12:40
  • @Cyber This is particularly confusing considering that Mar is a common english contraction of March which is the month after Feb! – fr1tz Apr 21 '14 at 12:43
  • Thanks for the comments, put some more clarification on the question – user2950162 Apr 21 '14 at 12:54
  • You also might want to look at http://stackoverflow.com/questions/2090840/python-datetime-localization and http://stackoverflow.com/questions/985505/locale-date-formatting-in-python for hints on using non-English month/day/... names with `strptime`. – lmjohns3 Apr 21 '14 at 13:21

2 Answers2

1
month_of_the_year = ['Ene', 'Feb', 'Mar', 'Abr', 'May', 'Jun', 'Jul', 'Ago', 'Sep', 'Oct', 'Nov', 'Dec']

def convert_to_mysql_format(string):
  explode = string.split()
  day_of_the_month = explode[2][:-1]
  if int(explode[2][:-1]) < 10:
    day_of_the_month = "%02d" % (int(explode[2][:-1]),)

  if explode[5] == 'am':
    time_split = explode[4].split(':')
    if time_split[0] == '12':
      time_split[0] = '00'
    elif int(time_split[0]) < 10:
      time_split[0] = "%02d" % int(time_split[0])

  else:
    time_split = explode[4].split(':')
    if int(time_split[0]) in range(1, 12):
      time_split[0] = str(int(time_split[0]) + 12)


  if month_of_the_year.index(explode[1]) < 12:
    explode[1] = "%02d" % (month_of_the_year.index(explode[1])+1)

  return explode[3]+'-'+explode[1]+'-'+day_of_the_month+' '+time_split[0]+':'+time_split[1]+':00'

print convert_to_mysql_format("Lun Ene 27, 2014 9:52 am")
print convert_to_mysql_format("Lun Ene 27, 2014 9:52 pm")

2014-01-27 09:52:00
2014-01-27 21:52:00

user2950162
  • 941
  • 4
  • 13
  • 29
mahmoh
  • 802
  • 1
  • 9
  • 15
  • Thanks mahmoh! I adjusted the las if to be < 12 and is working perfectly! Just for my info, what is the "%02d" % statement?? – user2950162 Apr 21 '14 at 17:16
  • @user2950162 Firstly thank you for correcting the last if statement, did not debug enough :-). Secondly "%02d" (d)=integer, (02)=if the number representation is 1 digit, add a leading zero. Take in consideration that a negative single digit number like -2 it will not be shown as -02, to have the -02 result you must say "%03". – mahmoh Apr 21 '14 at 17:58
0

By default, Python runs using C locale:

>>> from datetime import datetime
>>> datetime.strptime("Tue Feb 11, 2014 5:38 PM", "%a %b %d, %Y %I:%M %p")
datetime.datetime(2014, 2, 11, 17, 38)
>>> import locale
>>> locale.nl_langinfo(locale.T_FMT_AMPM)
'%I:%M:%S %p'

Changing locale partially helps on my system:

>>> locale.setlocale(locale.LC_TIME, 'es_ES.UTF-8')
'es_ES.UTF-8'
>>> datetime.strptime("Lun Ene 27, 2014 9:52 am"[:-2], "%a %b %d, %Y %I:%M %p")
datetime.datetime(2014, 1, 27, 9, 52)
>>> locale.nl_langinfo(locale.T_FMT_AMPM)
''

T_FMT_AMPM is not set for some reason on my system for es_ES.UTF-8. To fix it, you could manually add 12 hours if the time string ends with 'pm'.

The strftime() and time behaviour is the same.

Note: the locale name may be different on other systems.

jfs
  • 399,953
  • 195
  • 994
  • 1,670