1

I have the below dataframe:

year  month  week_num    day
2019   8       31       Thurs
2019   8       31        Fri
2019   8       32        Tues

The day abbreviations are Mon, Tues, Weds, Thurs, Fri, Sat, Sun.

I want to generate another column which will provide me the date in yyyy-mm-dd format. How can I do that? Thanks in advance!

Rory Daulton
  • 21,934
  • 6
  • 42
  • 50
realkes
  • 833
  • 1
  • 12
  • 20

1 Answers1

2

The module datetime gives you this opportunity. This discussion explains how to get the date from the week number.

Then, you can define a function to get the date and apply it to you dataframe.

Here the code:

# Import modules
import datetime

# Your data
df = pd.DataFrame([
                   [2019, 8, 29, "Fri"],
                    [2019, 8, 31, "Sun"],
                   [2019, 8, 29, "Tues"]],
                  columns=["year", "month", "week_num", "day"])

# A value per day
val_day = {"Mon": 0, "Tues": 1, "Weds": 2, "Thurs": 3,
           "Fri": 4, "Sat": 5, "Sun": 6}

# Get the date from the year, number of week and the day


def getDate(row):
    # Create string format
    str_date = "{0}-W{1}-1".format(row.year,
                                     row.week_num - 1)
    print(str_date)
    # Get the date
    date = datetime.datetime.strptime(
        str_date, "%Y-W%W-%w") + datetime.timedelta(days=val_day[row.day])
    # Update date field
    row["date"] = date.strftime("%Y-%m-%d")
    return row


# apply the function to each row
df = df.apply(getDate, axis=1)
print(df)
#    year  month  week_num    day        date
# 0  2019      8         1  Thurs  2019-01-03
# 1  2019      8        29    Fri  2019-07-19
# 2  2019      8        29   Tues  2019-07-16
Alexandre B.
  • 5,387
  • 2
  • 17
  • 40
  • Thanks @Alexandre, the error that I get: `ValueError: ("time data '2019-W30-7' does not match format '%Y-W%W-%w'", 'occurred at index 108')` – realkes Jul 17 '19 at 22:48
  • Do you know which date is rising this error ? The first date ? You can add `print(str_date)` after creating `str_date` in `getDate()` – Alexandre B. Jul 17 '19 at 22:49
  • 1
    The index 108 has the year 2019, month 8, week_num 31 and day Sun. – realkes Jul 17 '19 at 22:51
  • `AttributeError: ("'Series' object has no attribute 'day'", 'occurred at index 0')` – realkes Jul 17 '19 at 23:05
  • Are you sure to have a column named `day` in your dataframe ? – Alexandre B. Jul 17 '19 at 23:06
  • 1
    Sorry, forgot to change that. Now it generates all the dates like `2019-W31-1` – realkes Jul 17 '19 at 23:10
  • Actually it always generates one day later than it supposed to. For instance year 2019, week_num 31 and Thurs supposed to generate 2019-08-01 but it is 2019-08-02. How can I substract one day from all? – realkes Jul 17 '19 at 23:18
  • Are you running the last answer ? (with values for `Monday:0`, `Tuesday:1` ...) – Alexandre B. Jul 17 '19 at 23:19