0

I'm using pandas.read_excel() to turn excel tables into dataframes to work with in Python. This tables contain date columns in the following format: 01Jun2018. When I run the instruction, the tables are turned into dataframes just fine. The issue comes from the fact that I'm currently working in Mexico where month abreviations are spelled in spanish. Because of this the date columns show some cells with the correct datetime-type info, but the cells which originally contain months that do not correspond to month names in spanish (for example: april != abril, January != Enero) show the original strings. I need to do some operations with dates, so this columns must be datetime-type entirely.

I've tried switching the locale to en_US but nothing happened.

Erfan
  • 40,971
  • 8
  • 66
  • 78
  • Welcome to StackOverflow. Could you add some example data so we can try ourselves? 5 rows is enough, here are instruction how to make a good pandas question: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Erfan Apr 02 '19 at 18:53

1 Answers1

0

You need to set the locale using locale. If you already have a dataframe like this:

       dates
0  01Ene2018
1  20Feb2018
2  01Jun2018

Then you need to change the type of that column using pd.to_datetime after setting the locale:

import locale
import pandas as pd
locale.setlocale(locale.LC_ALL, locale.locale_alias["es_mx"])
df.dates = pd.to_datetime(df.dates, format="%d%b%Y")
print(df.dates)

Output:

0   2018-01-01
1   2018-02-20
2   2018-06-01
Name: dates, dtype: datetime64[ns]

This is supposing you have the es_MX locale installed in your system, otherwise you will need to install it.

arinarmo
  • 375
  • 1
  • 11