I'm stuck trying to figure out how to parse a column of dates formatted like 18.26.01 (fields separated by periods) in pandas. I'm using `pd.to_datetime(youtube_US['trending_date'])`, but it raises a parsing error: `ParserError: month must be in 1..12: 17.14.11`. How do I parse this column so that it returns proper datetime objects? Do I need some kind of loop?
Did you try passing a format string? `pandas.to_datetime()` takes a `format` argument. – buran Sep 19 '21 at 10:03
2 Answers
The error means the parser tried to read the middle field as the month (14 in `17.14.11`), which is out of range. You can handle this by reordering the fields in the column to day.month.year before parsing.
import pandas as pd
youtube_US = {'trending_date': ['18.26.01', '18.26.01']}
df = pd.DataFrame(data=youtube_US)
def datetime_split(value):
    # Reorder 'yy.dd.mm' to 'dd.mm.yy'
    parts = value.split('.')
    return parts[1] + "." + parts[2] + "." + parts[0]
# Reformat 'trending_date' column
df['trending_date'] = df['trending_date'].apply(datetime_split)
# Select only date from column
df['trending_date'] = pd.to_datetime(df['trending_date'], dayfirst=True).dt.date
print(df)
I hope this resolves your error.
Or simply pass `format`, as buran's comment suggests:
import pandas as pd
youtube_US = {'trending_date': ['18.26.01', '18.26.01']}
df = pd.DataFrame(data=youtube_US)
df['trending_date']= pd.to_datetime(df['trending_date'], format="%y.%d.%m")
# The column is already datetime here, so no second to_datetime call is needed
df['trending_date'] = df['trending_date'].dt.date
print(df)

Gurjot Singh Mahi
Also, you probably don't want to add `.dt.date` as this leaves you with Python datetime.date objects, which won't allow you to use pandas datetime functionality like the dt accessor ;-) – FObersteiner Sep 19 '21 at 10:54
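To illustrate the point in this comment, here is a minimal sketch on made-up sample values (only the column name and the `%y.%d.%m` format come from the thread). It shows that the parsed column supports the `.dt` accessor, while the result of `.dt.date` is a plain object column:

```python
import pandas as pd

# Illustrative values in the same yy.dd.mm layout as the question
df = pd.DataFrame({"trending_date": ["18.26.01", "17.14.11"]})

# Parse with an explicit format; the column gets a datetime64 dtype
df["trending_date"] = pd.to_datetime(df["trending_date"], format="%y.%d.%m")

# The .dt accessor works on the datetime64 column
df["month"] = df["trending_date"].dt.month

# After .dt.date the values are plain datetime.date objects (object dtype),
# so the .dt accessor is no longer available on that Series
as_dates = df["trending_date"].dt.date

print(df.dtypes)
print(as_dates.dtype)
```

If you only need the time-of-day stripped, keeping the datetime64 column is usually enough here, since these values carry no time component anyway.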
@RuthbaYasmin great if you found the solution. Please do upvote or accept the answer if the code works. It will help other developers. – Gurjot Singh Mahi Sep 19 '21 at 11:05
There is another field in the notebook that I'm working with which I don't understand.
youtube_US['count_max_view']=youtube_US.groupby(['video_id'])['views'].transform(max)
I don't understand what `.transform(max)` is for or what it does here, or, in fact, what the whole line of code is doing.
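For what it's worth, a small sketch of what that line appears to do, using made-up rows (only the column names `video_id`, `views` and `count_max_view` come from the notebook line above). It groups the rows by `video_id`, finds the largest `views` value in each group, and writes that value back onto every row of the group. I've passed the string `"max"` rather than the builtin `max`, which recent pandas versions prefer; that swap is my assumption, not part of the original notebook.

```python
import pandas as pd

# Made-up rows; only the column names are taken from the question
youtube_US = pd.DataFrame({
    "video_id": ["a", "a", "b", "b", "b"],
    "views": [100, 250, 40, 90, 70],
})

# For each row, look up the largest 'views' value within its video_id group.
# transform returns a Series aligned with the original index, so it can be
# assigned straight back as a new column without changing the row count.
youtube_US["count_max_view"] = (
    youtube_US.groupby("video_id")["views"].transform("max")
)

print(youtube_US)
```

The difference from `.groupby(...)["views"].max()` is the shape of the result: `max()` gives one row per `video_id`, whereas `transform` gives one value per original row.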
Here is the info on the dataset: