I have a numerical variable, call it "Blah". Blah is measured at various time intervals throughout the day and is an always increasing count. I want to find the difference between the first and last observation of Blah for each day and produce a table of the Total Amount of Increase of Blah per day.
Slightly more complicated: once Blah gets high enough, it resets to a very low number. The reset always happens at the same (currently unknown) value, and at most once per day.
A few more details that might be important:
Blah is measured at different named locations as well. I would like a dataframe of day totals by location. :)
The time variable is in format "mm/dd/yyyy hh:mm:ss"
This is what I've come up with for a general outline. An issue I have is that I haven't worked with POSIXct objects much and don't know how to go about grabbing these values and making this happen.
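For the POSIXct part, a minimal sketch in base R of parsing the "mm/dd/yyyy hh:mm:ss" strings and pulling out a per-day grouping key might look like this. The sample strings and the `tz = "UTC"` setting are assumptions for illustration, not part of the real data.

```r
# Hypothetical sample strings in the "mm/dd/yyyy hh:mm:ss" format.
stamps <- c("05/23/2014 07:55:05", "05/23/2014 08:25:06", "05/24/2014 15:01:40")

# Parse to POSIXct; tz = "UTC" is an assumption, substitute the real time zone.
when <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Drop the time of day to get a key for "each day".
day <- as.Date(when, tz = "UTC")

# Number of readings per day.
table(day)
```

Once `day` exists it can be used like any other grouping variable, alongside the location name.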
A <- first value of Day
B <- last value of Day
C <- maximum value of Blah on a day where a reset happens (last value before reset)
For (each Day)
  For (each Location)
    If A < B
    Then
      DayTotal = B - A
    Else
      DayTotal = (C - A) + B
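The outline above could be sketched in R roughly as follows, for the readings of one location on one day. The function name `day_total` is made up for illustration. It assumes at least one reading, readings in time order, at most one reset, and a counter that restarts at 0 (which is what the `(C - A) + B` formula implies).

```r
# x: Blah readings for one location on one day, in time order.
day_total <- function(x) {
  a <- x[1]                 # A: first value of the day
  b <- x[length(x)]         # B: last value of the day
  if (a <= b) {
    return(b - a)           # no reset that day
  }
  # A reset happened: it shows up as the first negative step.
  drop_at <- which(diff(x) < 0)[1]
  c_max <- x[drop_at]       # C: last value before the reset
  (c_max - a) + b           # assumes the counter restarts at 0
}

no_reset  <- day_total(c(33166, 33167, 33170, 33173, 33175, 33178))
one_reset <- day_total(c(48073, 48119, 48167, 48237, 73, 350, 1430, 2554))
```

Here C is found from the data itself, so the reset threshold does not need to be known in advance.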
Edit:
I had some data here in the wrong format. The below is the correct format.
Thank you in advance for the help!
-Michael
Also, on a day where Blah resets, A will always be more than B.
EDIT NUMBER 2
OMG I am a terrible person. The data actually looks like this
   DESCRIPTION rawCount       localDateTime
1    Arch Exit    33166 2014-05-23 07:55:05
2    Arch Exit    33167 2014-05-23 08:00:06
3    Arch Exit    33170 2014-05-23 08:10:06
4    Arch Exit    33173 2014-05-23 08:15:05
5    Arch Exit    33175 2014-05-23 08:20:05
6    Arch Exit    33178 2014-05-23 08:25:06
7    Northside    48073 2014-05-24 15:01:40
8    Northside    48119 2014-05-24 15:05:49
9    Northside    48167 2014-05-24 15:10:59
10   Northside    48237 2014-05-24 15:20:49
11   Northside       73 2014-05-24 15:25:59
12   Northside      350 2014-05-24 15:35:49
13   Northside     1430 2014-05-24 15:44:06
14   Northside     2554 2014-05-24 16:00:49
(Supposing the above data were complete for each day) I would like the results to look like:
DESCRIPTION totalCount       Date
  Arch Exit         12 2014-05-23
  Northside       2718 2014-05-24
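As a rough sketch of how such a table might be produced in base R, the sample rows can be grouped by location and date with `aggregate()`. The helper name `total_increase` and the `tz = "UTC"` setting are assumptions for illustration, and the helper again assumes the counter restarts at 0 after a reset.

```r
# Sample rows copied from the data shown in this post.
Full <- data.frame(
  DESCRIPTION = rep(c("Arch Exit", "Northside"), times = c(6, 8)),
  rawCount = c(33166, 33167, 33170, 33173, 33175, 33178,
               48073, 48119, 48167, 48237, 73, 350, 1430, 2554),
  localDateTime = as.POSIXct(c(
    "2014-05-23 07:55:05", "2014-05-23 08:00:06", "2014-05-23 08:10:06",
    "2014-05-23 08:15:05", "2014-05-23 08:20:05", "2014-05-23 08:25:06",
    "2014-05-24 15:01:40", "2014-05-24 15:05:49", "2014-05-24 15:10:59",
    "2014-05-24 15:20:49", "2014-05-24 15:25:59", "2014-05-24 15:35:49",
    "2014-05-24 15:44:06", "2014-05-24 16:00:49"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Day key, and rows in time order within each location.
Full$Date <- as.Date(Full$localDateTime, tz = "UTC")
Full <- Full[order(Full$DESCRIPTION, Full$localDateTime), ]

# Last minus first, plus the pre-reset peak wherever the count drops.
total_increase <- function(x) {
  drops <- which(diff(x) < 0)
  x[length(x)] - x[1] + sum(x[drops])
}

result <- aggregate(rawCount ~ DESCRIPTION + Date, data = Full,
                    FUN = total_increase)
names(result)[names(result) == "rawCount"] <- "totalCount"
result
```

On these sample rows the `totalCount` column comes out as 12 for Arch Exit and 2718 for Northside.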
Another Edit
Ok so using the answer below I did the following which I think made it work.
rawDiff is an already existing variable (computed in Excel....yikes), parse_date_time is a function from the lubridate package, "Full" is my data, and "localDate" is the date variable I wanted.
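library(lubridate)  # provides parse_date_time()
# Sum the non-negative differences for each date and location.
blahblah <- with(Full, tapply(rawDiff, list(parse_date_time(localDate, "mdy"), DESCRIPTION),
                              function(x) sum(x[x >= 0])))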
There was some weirdness with NAs, which using a separate pre-made difference variable seemed to help with. Also, when the count reset, the differences were negative, so I just summed the non-negative differences.
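For what it's worth, the difference variable could probably be built in R rather than Excel. The sketch below uses the Northside readings from the sample as a stand-in for one location-day; the variable names other than rawDiff are made up. It also suggests that summing only the non-negative differences leaves out whatever was counted between the reset and the first reading after it, if the counter really restarts at 0 (an assumption).

```r
# Northside readings from the sample, standing in for one location-day.
rawCount <- c(48073, 48119, 48167, 48237, 73, 350, 1430, 2554)

# Step-to-step differences; the first reading has no predecessor, hence NA.
rawDiff <- c(NA, diff(rawCount))

# Sum of the non-negative differences, skipping the leading NA.
keep <- !is.na(rawDiff) & rawDiff >= 0
nonneg_total <- sum(rawDiff[keep])

# If the counter restarts at 0, the first post-reset reading is also new count.
after_reset <- rawCount[which(rawDiff < 0)]
with_reset <- nonneg_total + sum(after_reset)
```

Here `nonneg_total` is 2645 and `with_reset` is 2718, so the two approaches differ by the first post-reset reading (73). The base R function `ave()` could apply the same `c(NA, diff(x))` step per location and day on the full data.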