I am attempting to replicate some GitHub charts across multiple repositories. I have a dataframe named stats that holds the data I want to graph. It looks like this:
Dev Commits Repository Date
Dev1 2 Repo1 2020-01-03
Dev2 1 Repo1 2020-01-03
Dev1 1 Repo1 2020-01-04
...
Dev3 4 Repo1 2020-10-31
Dev1 1 Repo1 2020-10-31
Dev1 1 Repo2 2020-01-02
Dev2 5 Repo2 2020-01-03
...
I am using seaborn to plot these numbers like so:
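import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(15, 7))
# Plot Commits against Date without confidence intervals
sns.lineplot(
    x="Date",
    y="Commits",
    ci=None,
    data=stats
).set_title('Commits')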
This produces a nice line graph, but after looking at it I realized it's not correct. My dataframe is sorted by repository, then date (an artifact of how I pulled the data from git log). It is only graphing Repo1, because it spans the entire year. Other repositories are not being added to the Commits column.
How can I graph total commits, across all repositories, throughout the year? Some repositories will have multiple developers committing code to the same repository on the same date, and those should be added together. Others will have commits on dates when other repositories don't.
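
To make the desired result concrete, here is a minimal sketch of the aggregation described above: sum Commits over every developer and repository for each date. The sample rows are only the ones visible in the excerpt above, and the function name total_commits_per_day is hypothetical. One thing worth noting is that lineplot averages rows that share an x value by default (its estimator defaults to the mean), so the original chart may be showing a per-date mean rather than Repo1 alone.

```python
import pandas as pd


def total_commits_per_day(stats: pd.DataFrame) -> pd.DataFrame:
    """Sum Commits across all developers and repositories for each Date.

    Hypothetical helper; assumes the columns are named as in the question.
    """
    out = stats.copy()
    # Parse dates so the result sorts and plots chronologically
    out["Date"] = pd.to_datetime(out["Date"])
    # groupby sorts by Date by default; as_index=False keeps Date as a column
    return out.groupby("Date", as_index=False)["Commits"].sum()


# Sample rows copied from the visible part of the dataframe (not the full data)
stats = pd.DataFrame({
    "Dev": ["Dev1", "Dev2", "Dev1", "Dev1", "Dev2"],
    "Commits": [2, 1, 1, 1, 5],
    "Repository": ["Repo1", "Repo1", "Repo1", "Repo2", "Repo2"],
    "Date": ["2020-01-03", "2020-01-03", "2020-01-04", "2020-01-02", "2020-01-03"],
})

daily = total_commits_per_day(stats)
print(daily)
```

With this sample, 2020-01-03 ends up with 8 commits (2 + 1 from Repo1 plus 5 from Repo2), and 2020-01-02 appears even though only Repo2 has a commit that day. The daily frame could then be passed to the same lineplot call with data=daily.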