I have a data.frame holding the time sheets of several staff over a period of several months spanning two years. The data look like this:
Name        Month     1     2     3     ...  31    Total  Job       ... [more columns]
John Smith  Aug 2017  1:20                         1:20   Typing
Mary Jones  Sep 2017                                      Proofing
John Smith  Oct 2017        0:15             1:10  1:25   Typing
...
Jim Miles   Feb 2018        1:30  2:10             3:40   Admin
Columns 1 to 31 each represent a day of the month given in the Month column. The same Name can appear in multiple rows.
So looking at the first entry, John Smith did 1 hour and 20 minutes of work on 1 August 2017.
What I want to do is to analyse these data in a granular way, e.g.
- How many hours did John Smith spend on Typing in Sept 2017?
- How much Proofing was done in Jan-Feb 2018?
I am a bit stuck on how to proceed in order to have the data to analyse. Suggestions appreciated.
Added for clarification:
Having read three very helpful replies and looked at tidyr, I have clarified my thoughts: I think I need to reshape the data so that there is one row for each entry. The example table would then become:
Name        Date         Duration  Job     ... [more columns]
John Smith  01 Aug 2017  1:20      Typing
John Smith  02 Oct 2017  0:15      Typing
John Smith  31 Oct 2017  1:10      Typing
...
Jim Miles   02 Feb 2018  1:30      Admin
Jim Miles   03 Feb 2018  2:10      Admin
The Date will need to be formatted correctly, but that is minor. The real problem is matching each day-of-month column to the Month and year of its row to produce the composite date. Any ideas welcome.
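To make the target concrete, here is a minimal base R sketch of the reshaping I have in mind. It rests on assumptions not confirmed above: that the day columns are literally named "1" to "31", that empty cells are empty strings, that Month always has the form "Aug 2017", and that the session locale understands English month abbreviations for `%b`. The object names `timesheet`, `long` and `john_oct` are made up for illustration, and the Total and extra columns are left out.

```r
# Toy version of the time sheet (assumed layout: day columns named "1".."31",
# empty string where no time was recorded).
days <- as.character(1:31)
timesheet <- data.frame(
  Name  = c("John Smith", "John Smith", "Jim Miles"),
  Month = c("Aug 2017", "Oct 2017", "Feb 2018"),
  Job   = c("Typing", "Typing", "Admin"),
  stringsAsFactors = FALSE
)
timesheet[days] <- ""
timesheet[1, "1"]  <- "1:20"
timesheet[2, "2"]  <- "0:15"
timesheet[2, "31"] <- "1:10"
timesheet[3, "2"]  <- "1:30"
timesheet[3, "3"]  <- "2:10"

# Wide to long: unlist() walks the day columns one after another, so the
# id columns are repeated once per day and Day is repeated once per row.
n <- nrow(timesheet)
long <- data.frame(
  Name     = rep(timesheet$Name,  times = length(days)),
  Month    = rep(timesheet$Month, times = length(days)),
  Day      = rep(as.integer(days), each = n),
  Duration = unlist(timesheet[days], use.names = FALSE),
  Job      = rep(timesheet$Job,   times = length(days)),
  stringsAsFactors = FALSE
)

# Keep only the cells that actually hold a duration.
long <- long[!is.na(long$Duration) & long$Duration != "", ]

# Composite date: paste the day onto "Mon YYYY" and parse it.
# %b depends on the locale (assumed English here).
long$Date <- as.Date(paste(long$Day, long$Month), format = "%d %b %Y")

# Convert "h:mm" into decimal hours so the durations can be summed.
hm <- strsplit(long$Duration, ":", fixed = TRUE)
long$Hours <- vapply(
  hm,
  function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60,
  numeric(1)
)

long <- long[order(long$Name, long$Date), c("Name", "Date", "Duration", "Hours", "Job")]
rownames(long) <- NULL

# Example query: hours John Smith spent on Typing in October 2017.
sel <- long$Name == "John Smith" &
  long$Job == "Typing" &
  format(long$Date, "%Y-%m") == "2017-10"
john_oct <- sum(long$Hours[sel])
```

With the data in this shape, the questions above become a filter followed by a sum. The same reshaping could presumably be done with tidyr instead of the manual `rep()`/`unlist()` step.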