I'm working with a data frame of unemployment statistics in which the date is formatted like 2010 AUG or 2018 DEC. I would like to convert it to a date format that includes the last day of the month, e.g. 2010-31-08 or 08/31/2010. Am I right that I can't use pd.to_datetime when the month is spelled with letters instead of numbers?

I can only seem to find info on converting dates when the format uses numbers. Will I need to manually change JAN to 01, FEB to 02, etc.?

  • Does this answer your question? [How can I convert these dates to the correct format in a Pandas Dataframe?](https://stackoverflow.com/questions/75710856/how-can-i-convert-these-dates-to-the-correct-format-in-a-pandas-dataframe) – Marcelo Paco Apr 22 '23 at 23:34
  • `df['dates'] = pd.to_datetime(df['date_string'], format='mixed') + pd.offsets.MonthEnd(0)` – Nick Apr 23 '23 at 01:34

1 Answer

Convert String to Date

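You can still convert the date string to a datetime object using the datetime.strptime() function from Python's built-in datetime module. Its %b directive matches abbreviated month names such as AUG (case-insensitively, in an English locale), so there is no need to map JAN to 01, FEB to 02, and so on by hand.

Here's example code that converts your date strings to datetime objects set to the last day of the month: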

from datetime import datetime
import pandas as pd

# create a sample DataFrame
df = pd.DataFrame({'date_string': ['2010 AUG', '2018 DEC']})

# define a function to convert date strings to datetime objects with last day of month
def convert_date(date_string):
    # convert month name to number
    month_num = datetime.strptime(date_string, '%Y %b').month
    # get the last day of the month
    last_day = pd.Timestamp(datetime.strptime(date_string, '%Y %b').replace(day=1) \
                           + pd.DateOffset(months=1) - pd.DateOffset(days=1))
    # combine year, month, and day to create datetime object
    datetime_obj = datetime.strptime(f"{date_string.split(' ')[0]}-{month_num:02d}-{last_day.day:02d}", '%Y-%m-%d')
    return datetime_obj

# apply the function to the 'date_string' column to create a new 'date' column
df['date'] = df['date_string'].apply(convert_date)

# print the resulting DataFrame
print(df)

Output will be:

  date_string       date
0    2010 AUG 2010-08-31
1    2018 DEC 2018-12-31
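
The question also mentions a slashed form such as 08/31/2010. A datetime column has no display format of its own, so one option is to render it as strings with `.dt.strftime()`. A minimal sketch, assuming a 'date' column like the one above; the `date_us` column name is made up for illustration:

```python
import pandas as pd

# a frame with a datetime column, like the 'date' column produced above
df = pd.DataFrame({'date': pd.to_datetime(['2010-08-31', '2018-12-31'])})

# render the dates as MM/DD/YYYY strings ('date_us' is a hypothetical name)
df['date_us'] = df['date'].dt.strftime('%m/%d/%Y')

print(df)
```

Note that the result is a column of plain strings, so it is probably best kept alongside the datetime column rather than replacing it.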

Explanation

  1. Create a sample DataFrame with a column of date strings in the format of YYYY MON.
  2. Define a function convert_date() that takes a date string, parses it with datetime.strptime() to get the month number, finds the last day of the month by taking the first of the month, adding one month with pd.DateOffset and subtracting one day, and then combines the year, month, and day into a datetime object.
  3. Apply this function to the 'date_string' column with the apply() method to create a new 'date' column of datetime objects.
  4. Print the resulting DataFrame.
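
As a shorter alternative to the row-by-row function, the same result can probably be had in one vectorized step, along the lines of the comment under the question. This is a sketch rather than the approach used above: it assumes every value follows the YYYY MON pattern with English month abbreviations, and it passes an explicit `format='%Y %b'` instead of `format='mixed'`.

```python
import pandas as pd

# same sample data as in the answer
df = pd.DataFrame({'date_string': ['2010 AUG', '2018 DEC']})

# parse "YYYY MON" (gives the first day of the month), then roll forward
# to the month end; MonthEnd(0) leaves dates already at month end unchanged
df['date'] = (
    pd.to_datetime(df['date_string'], format='%Y %b')
    + pd.offsets.MonthEnd(0)
)

print(df)
```

Because the parsing happens on the whole column at once, this avoids calling a Python function per row and keeps the column as a proper datetime dtype.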
misbahm3