1

I am using R to extract data from a process historian using SQL. I have two dataframes, one of net weights (NetWt) with timestamps (100 rows) and another of weight setpoints (SetPt) with timestamps (6 rows). The setpoint is changed infrequently but a new bag weight is recorded every 30 seconds. I need to subtract the two such that I get a resultant dataframe of NetWt - SetPt for each timestamp in NetWt. In my last dataset the most recent SetPt timestamp is earlier than the first NetWt timestamp. I need a function that will go through each row in NetWt, take the timestamp, search for the closest timestamp before that time in the SetPt dataframe, return the most recent SetPt and output the difference (NetWt-SetPt).

I have researched merge, rbind, cbind, and I can't find a function to search backwards for the most recent SetPt value and merge that with the NetWt so that I can subtract them to plot the difference with time. Can anyone please help?

Data:

SetPtLines <- "Value,DateTime
51.35,2014-02-10 08:10:49
53.30,2014-02-10 07:52:37
53.10,2014-02-10 07:52:19
51.70,2014-02-10 07:50:26
51.35,2014-02-09 19:25:21
51.40,2014-02-09 19:13:11
51.50,2014-02-09 18:24:53
51.45,2014-02-09 16:10:38
51.40,2014-02-09 15:54:42"
SetPt <- read.csv(text=SetPtLines, header=TRUE)

NetWtLines <- "DateTime,Value
2014-02-11 12:51:50,50.90735
2014-02-11 12:52:24,50.22308
2014-02-11 12:52:55,50.88604
2014-02-11 12:53:27,50.69514
2014-02-11 12:53:58,51.38968
2014-02-11 12:54:29,50.96672"
NetWt <- read.csv(text=NetWtLines, header=TRUE)

There are 100 rows in NetWt.

Joshua Ulrich
  • 173,410
  • 32
  • 338
  • 418
TrevorRi
  • 11
  • 3

2 Answers2

2

data.table has a roll argument which would probably be very helpful here

library(data.table)
NetWt <- as.data.table(NetWt)
SetPt <- as.data.table(SetPt)

## Only needed if dates are strings: 
## Ensure that your DateTime columns are actually times and not strings
NetWt[, DateTime := as.POSIXct(DateTime)]
SetPt[, DateTime := as.POSIXct(DateTime)]

## Next, set keys to the dates
setkey(NetWt, DateTime)
setkey(SetPt, DateTime)

## Join the two, use roll
NetWt[SetPt, NewValue := Value - i.Value, roll="nearest"]

## It's not clear which you want to subtract from which
SetPt[NetWt, NewValue := Value - i.Value, roll="nearest"]
Ricardo Saporta
  • 54,400
  • 17
  • 144
  • 178
  • Thanks very much for taking the time to give me this code Ricardo. I installed the data.table library and ran the code and I think I understand what it is doing. However it only does the subtraction for the one NetWt value closest to a SetPt value and all other NetWts are NA. I need the code to loop through every NetWt value, then find the nearest SetPt value and calculate NetWt-SetPt for each NetWt so that I can plot the time series difference between the actual net weights and the prevailing setpoint for that period. – TrevorRi Feb 12 '14 at 14:56
  • Thanks Ricardo. I figured out how to replace the NAs with the last SetPt value: NWSP<-transform(NetWt, SP = na.locf(SP)) and I had to modify the roll parameter to roll=-Inf to only select earlier values and now it works great. – TrevorRi Feb 12 '14 at 19:40
0

Here's a solution using xts. Note that your example would be more helpful if SetPt and NetWt included some overlapping observations.

library(xts)
# convert your data to xts
xSetPt <- xts(SetPt$Value, as.POSIXct(SetPt$DateTime))
xNetWt <- xts(NetWt$Value, as.POSIXct(NetWt$DateTime))
# merge them
xm <- merge(xNetWt, xSetPt)
# fill all missing values in the SetPt column with their prior value
xm$xSetPt <- na.locf(xm$xSetPt)
# plot the difference
plot(na.omit(xm$xNetWt - xm$xSetPt))
Joshua Ulrich
  • 173,410
  • 32
  • 338
  • 418
  • Thanks for the solution Joshua. I managed to get the data.table version above working but you have taught me another way in R. Thanks for your time and expertise. – TrevorRi Feb 12 '14 at 19:42