I'm trying to plot a stacked bar chart of counts grouped into 5-year periods:
import pandas as pd
categorical = ["RL","CD(others)","DL","ML","ML","ML","DL","ML","DL","DL"]
year = [2014,2014,2015,2015,2016,2017,2019,2021,2022,2022]
df = pd.DataFrame({'year':year,
'keywords':categorical})
df
I tried the approaches from related posts (post1, post2, post3) to solve the problem:
# solution 1: pivot table
df.pivot_table(index='year',
columns='keywords',
# values='paper_count',
aggfunc='sum')
#df.plot(x='year', y='paper_count', kind='bar')
# solution 2: groupby
# reset_index() adds an 'index' column to count after grouping (here by year only)
ctdf = (df.reset_index()
.groupby(['year'], as_index=False)
.count()
# rename isn't strictly necessary here; it's just for readability
.rename(columns={'index':'paper_count'})
)
ctdf.plot(x='year', y='paper_count', kind='bar')
In the end, I couldn't figure out how to plot this by counting over 5-year periods: 2000-2005, 2005-2010, 2015-2020, 2020-2025.
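A minimal sketch of one way to do this kind of binning, not a definitive solution. It assumes the periods are half-open and aligned to multiples of 5 (so 2015 counts toward 2015-2020), and the names `start`, `period`, `counts`, and `plot_stacked` are illustrative only:

```python
import pandas as pd

categorical = ["RL", "CD(others)", "DL", "ML", "ML", "ML", "DL", "ML", "DL", "DL"]
year = [2014, 2014, 2015, 2015, 2016, 2017, 2019, 2021, 2022, 2022]
df = pd.DataFrame({"year": year, "keywords": categorical})

# Assumption: periods are half-open and aligned to multiples of 5,
# so 2015 falls into 2015-2020, not 2010-2015.
start = df["year"] // 5 * 5
df["period"] = start.astype(str) + "-" + (start + 5).astype(str)

# One row per period, one column per keyword; each cell is a row count.
counts = pd.crosstab(df["period"], df["keywords"])


def plot_stacked(table):
    """Draw the count table as a stacked bar chart (requires matplotlib)."""
    ax = table.plot(kind="bar", stacked=True, rot=0)
    ax.set_xlabel("period")
    ax.set_ylabel("paper_count")
    return ax
```

Calling `plot_stacked(counts)` would then draw one bar per period, stacked by keyword. Periods with no rows do not appear in `counts` with this approach.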
expected output: