I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({'year': [1894, 1976, 1995, 2001, 1993]})
The current dataframe
   year
0  1894
1  1976
2  1995
3  2001
4  1993
How can I efficiently add one-hot encoded columns so that the dataframe looks like this?
The expected dataframe
   year  1800s  1900s  2000s
0  1894      1      0      0
1  1976      0      1      0
2  1995      0      1      0
3  2001      0      0      1
4  1993      0      1      0
I already tried the code below and it works, but I think there is a better solution. Can you recommend a function I could use? Thank you!
The code
df['year'] = df['year'].astype(str)
df['1800s'] = df['year'].apply(lambda x: 1 if x[:2] == '18' else 0)
df['1900s'] = df['year'].apply(lambda x: 1 if x[:2] == '19' else 0)
df['2000s'] = df['year'].apply(lambda x: 1 if x[:2] == '20' else 0)
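One possible alternative, offered only as a sketch: derive a century label with integer arithmetic and pass it to `pd.get_dummies`. The names `century`, `dummies`, and `result` are illustrative and not part of the code above. Unlike the attempt above, this keeps `year` as an integer column, and it only creates columns for centuries that actually occur in the data.

```python
import pandas as pd

df = pd.DataFrame({'year': [1894, 1976, 1995, 2001, 1993]})

# Label each year with its century, e.g. 1894 -> "1800s"
century = (df['year'] // 100 * 100).astype(str) + 's'

# One indicator column per distinct label, cast to 0/1 integers
dummies = pd.get_dummies(century).astype(int)

# Attach the indicator columns to the original frame by index
result = df.join(dummies)
print(result)
```

If a fixed set of columns is needed regardless of the data, the labels could be converted to a categorical with explicit categories before calling `pd.get_dummies`.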