I'm stuck trying to figure out how to parse a column of dates stored as strings like 18.26.01 (year, day, month, separated by periods) in pandas. I'm using pd.to_datetime(youtube_US['trending_date']), but it raises a parsing error:

ParserError: month must be in 1..12: 17.14.11

How do I parse this column so that it returns proper datetime values? Do I need to use any kind of loop?

2 Answers

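The error means pandas is reading a value greater than 12 as the month. Your strings are in year.day.month order, which pandas does not guess on its own. You can handle the error by reformatting the column so that the month comes first.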

import pandas as pd

youtube_US = {'trending_date': ['18.26.01', '18.26.01']}
df = pd.DataFrame(data=youtube_US)

def datetime_split(value):
    # Input is 'yy.dd.mm'; reorder to 'mm.dd.yy' so the month comes first
    year, day, month = value.split('.')
    return month + "." + day + "." + year

# Reformat 'trending_date' column
df['trending_date'] = df['trending_date'].apply(datetime_split)

# Parse with an explicit format, then keep only the date part
df['trending_date'] = pd.to_datetime(df['trending_date'], format="%m.%d.%y").dt.date
print(df)

I hope this resolves your error.

Or simply pass format, as per Buran's comment. This needs neither the reformatting step nor a loop:

import pandas as pd
youtube_US = {'trending_date': ['18.26.01', '18.26.01']}
df = pd.DataFrame(data=youtube_US)
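# %y = two-digit year, %d = day, %m = month
df['trending_date'] = pd.to_datetime(df['trending_date'], format="%y.%d.%m")
# Keep only the date part; the column is already datetime, so no second parse is needed
df['trending_date'] = df['trending_date'].dt.date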
print(df)
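
As a quick check, here is a minimal sketch that runs the explicit format over a few sample strings, including the one from the error message. The Series `s` and its values are made up for illustration; only the format string comes from the answer above. The third value is ambiguous on purpose (both 05 and 11 could be a month), which is the case where guessing the order would go wrong silently.

```python
import pandas as pd

# Sample values in 'yy.dd.mm' order; '17.14.11' is the string from the error message
s = pd.Series(['17.14.11', '18.26.01', '17.05.11'])

# An explicit format removes the guesswork: year, then day, then month
parsed = pd.to_datetime(s, format="%y.%d.%m")
print(parsed)
```

With the format given, '17.05.11' should come out as 5 November 2017 rather than 11 May 2017.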

There is another line in the notebook I'm working with that I don't understand:

youtube_US['count_max_view']=youtube_US.groupby(['video_id'])['views'].transform(max)

I don't understand the purpose of .transform(max) or what it's doing, or in fact the whole line of code.

Here is the info on the dataset: (image not included)
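
For what it's worth, here is a minimal sketch of what that line appears to do, on made-up data (the video_id and views values below are hypothetical, not from the real dataset). groupby('video_id')['views'] collects the views for each video, and transform then returns one value per original row: the maximum views within that row's group. Because the result has the same length and index as the DataFrame, it can be assigned straight back as a new column. The sketch passes the string 'max' rather than the builtin max, which should give the same result here.

```python
import pandas as pd

# Hypothetical data: two videos, each appearing on several trending days
youtube_US = pd.DataFrame({
    'video_id': ['a', 'a', 'b', 'b', 'b'],
    'views': [100, 250, 40, 90, 70],
})

# For each row, the largest 'views' value among rows sharing its video_id
youtube_US['count_max_view'] = (
    youtube_US.groupby('video_id')['views'].transform('max')
)
print(youtube_US)
```

Every row for video 'a' gets 250 and every row for video 'b' gets 90. By contrast, .max() on the same groupby would return only one row per video.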