I have a dataframe of time reports. The relevant columns are (with example data):
- Person: A, B, C, ... (character type)
- Hours: 2.5, 1, 6, ... (numeric type)
- yearMonth: 2016-02, 2017-11, 2014-09, ... (character type)
I use this to do a bar plot on all the data:
ggplot(data = time_reports, aes(x = time_reports$yearMonth, y = time_reports$Hours)) +
  geom_col()
The plot is plausible, based on a standard workweek and the number of employees on the team who are reporting for that month (the team grew during this time, and several employees did not start reporting time for a few months after joining the team):
The x-axis label is time_reports$yearMonth and goes from mid-2014 to the end of 2017. The y-axis label is time_reports$Hours and is measured in hours. Each bar is the number of hours reported per month.
Now let me add a facet. Here's the new code, with facet_wrap added:
ggplot(data = time_reports, aes(x = time_reports$yearMonth, y = time_reports$Hours)) +
  geom_col() +
  facet_wrap(~Person)
I get 8 facets, which is expected since the team has 8 members. However, all of the facets have bogus data. For example, here is one of the facets:
This employee didn't even join the team until mid-2015 and didn't start time tracking until early 2016. Additionally, this employee has been diligent about time tracking since starting the practice. What you should see are fairly level bars starting about halfway through this plot. (Sorry, this facet is in the middle of the grid, so the x and y scales aren't adjacent to it. The x scale is the same as in the previous plot on this page, and the y scale is the same as in the next plot on this page.)
I exported the dataframe to a CSV. I used Excel's filtering and PivotTable features to confirm that the underlying data is rational and that what ggplot2 is showing is not present in the data.
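For anyone who wants to repeat that check without leaving R, here is a minimal sketch of the same per-person, per-month summary the PivotTable gave me. The small data frame below is a made-up stand-in with the same three columns (its values are invented for illustration, not my real data); with the real data you would skip that step and use time_reports directly.

```r
# Hypothetical stand-in for the real time_reports data frame;
# the values are invented purely for illustration.
time_reports <- data.frame(
  Person    = c("A", "A", "B", "B", "B"),
  Hours     = c(2.5, 1, 6, 4, 3),
  yearMonth = c("2016-02", "2016-02", "2016-02", "2017-11", "2017-11"),
  stringsAsFactors = FALSE
)

# Total hours per person per month, the same summary a PivotTable produces.
monthly_totals <- aggregate(Hours ~ Person + yearMonth,
                            data = time_reports, FUN = sum)

# Largest monthly total each person ever reported.
max_per_person <- aggregate(Hours ~ Person, data = monthly_totals, FUN = max)

print(monthly_totals)
print(max_per_person)
```

If the largest value in max_per_person is far below what a facet shows, the discrepancy comes from the plotting step rather than from the data.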
This one is beyond insane. For this person to have reported 800 hours in a month, that would mean more than 24 hours worked per day! Again, Excel proves that the underlying data does not show any work month remotely similar to this. The most this guy ever reported in a month was 176.75 hours.
Why is ggplot2's facet_wrap function warping the data so badly?