
I have a question about how to fill in the second-level index after a two-level groupby in pandas.

I have a dataframe of patient information drawn from reports, and I'm trying to track when these reports were generated so I can chart them in pyplot. The three things that matter for what I'm trying to do are when the report was generated, the technology that generated the report, and the count of each technology per month. I have this line of code so far:

frame.groupby([pd.Grouper(key="reportDate", freq='M'), pd.Grouper(key="sourceFilePathTechnology")], observed=False).count()

which generates the following table.

[Image: Dataframe]

I'm close to what I'm trying to get, but I'm missing something, and I can't find what I'm looking for in the documentation or in another SO post. The final missing step is that I would like every technology to be represented in the sourceFilePathTechnology index for each month. For example, 2016-03-31 only has FSG, when I need it to also have NTP and MOL, even if the count is 0. I need this for every month in the reportDate index. Does anyone know how I can resolve this?

Thank you to anyone who can offer some input!

Phil
  • Kindly post a sample of the original dataframe, and your expected output as well. https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – sammywemmy Jan 23 '20 at 21:21

1 Answer


Found my answer. I needed to google "pandas group by and count 0" and came across this post: Pandas groupby for zero values

The answer was to unstack the technology level with a fill value of 0 and then stack it back:

frame.groupby([pd.Grouper(key="reportDate", freq='M'), pd.Grouper(key="sourceFilePathTechnology")], observed=False).count().unstack(fill_value=0).stack()
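
For anyone who wants to try this without the original data, here is a minimal sketch of the same unstack/stack idea. The sample rows and the patientId column are made up for illustration; only reportDate and sourceFilePathTechnology come from the question. It also passes pd.offsets.MonthEnd() instead of the 'M' string, which is an assumption on my part to sidestep the alias change in newer pandas versions; the month-end grouping should be the same.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question,
# the values and the "patientId" column are invented for this sketch.
frame = pd.DataFrame({
    "reportDate": pd.to_datetime(
        ["2016-03-05", "2016-03-20", "2016-04-02", "2016-04-15", "2016-04-28"]
    ),
    "sourceFilePathTechnology": ["FSG", "FSG", "NTP", "MOL", "MOL"],
    "patientId": [1, 2, 3, 4, 5],
})

# Count rows per (month end, technology). Only combinations that occur
# in the data appear here, e.g. 2016-03-31 has FSG but no NTP or MOL.
counts = frame.groupby(
    [pd.Grouper(key="reportDate", freq=pd.offsets.MonthEnd()),
     "sourceFilePathTechnology"]
).count()

# Pivot the technology level into columns, filling missing month/technology
# pairs with 0, then move it back into the row index.
filled = counts.unstack(fill_value=0).stack().sort_index()

print(filled)
```

Note that this only fills in technologies that show up at least once somewhere in the data, and only months that have at least one report. A technology or month that never appears at all would need a different approach, such as reindexing against an explicit list.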
Phil