How to sum the variables according to the times it falls into t1[i] < t[j] < t2[i]

Question

This is another version of "R - How to sum objects in a column between an interval defined by conditions on another column".

I have three time variables, t1, t2 and t3, and a corresponding column of numbers. I want to sum the values in "numbers" whose time in t3 falls between t1[i] and t2[i]. E.g.:

t1 <- c(1.12, 2.16, 3.18, 4.56, 8.90, 29.36, 30.30, 31.30, 36.90, 50.01)
t2 <- c(2.14, 2.77, 3.65, 4.78, 8.99, 30.01, 31.07, 31.89, 40.30, 55.08)
t3 <- c(1.16, 1.55, 1.35, 2.17, 2.18, 2.19, 2.34, 3.30, 4.59, 8.91, 29.99, 30.32, 30.98, 31.32, 37.00, 52.00, 54.00)
numbers <- c(7,1,2,5,5,6,9,12, 13, 22, 7, 1, 7, 11, 21, 29)

The output I am looking for is shown below. Here the first 3 values in t3 satisfy my criterion for the first interval, so their numbers are summed and stored in a new vector "output", and so on for the other intervals. Please note that the "output" here was written by hand as an example, not computed. I can compute the first sum, but my i then stays at the same value and I cannot go on. I hope you can help; thank you for your time.

output = (7+1+2,5+5+6+9,12,13,22,7,1,7,11,21+29) 
output = (10, 25, 12, 13, 22, 7, 1, 7, 11, 50)

So far, this is what I have:

t1 <- c(1.12, 2.16, 3.18, 4.56, 8.90, 29.36, 30.30, 31.30, 36.90, 50.01)
t2 <- c(2.14, 2.77, 3.65, 4.78, 8.99, 30.01, 31.07, 31.89, 40.30, 55.08)

t3 <- c(1.16, 1.55, 1.35, 2.17, 2.18, 2.19, 2.34, 3.30, 4.59, 8.91, 29.99, 30.32, 30.98, 31.32, 37.00, 52.00, 54.00)
numbers <- c(7,1,2,5,5,6,9,12, 13, 22, 7, 1, 7, 11, 21, 29)

i = 1
j = 1
k = 1
N = NULL
Sums = NULL

while (j < length(t1))
{
  while (i < length(t3))
    {
      if (t3[i] > t1[j] & t3[i] <= t2[j])
      {
        N[i] <- numbers[i]

      }
      i = i + 1
    } 
  Sums[k] = sum(N)   
  k = k + 1
  j = j + 1
}

Shouldn't `t3` and `numbers` have the same number of elements? They have 17 and 16 elements respectively. – Vincent Zoonekynd, Sep 23 '13 at 14:23
Yeah, you're right, they should be the same length. I was simply drawing out an example. The real vectors are very long (1000+ elements). – Pork Chop, Sep 23 '13 at 14:36

blmoore · Answer 1 · 2013-09-23T14:31:29.190

score 3

Unless I've misunderstood what you're shooting for, there's no need for all the whiles and ifs.

Firstly, organise your data, i.e.:

dat <- data.frame(time=t3[1:16], obs=numbers)

Then use cut to cut the data into intervals, and sum over those with tapply, something like:

all <- tapply(dat$obs, cut(dat$time, breaks=sort(c(t1, t2))), FUN=sum)
# omit the gaps between intervals
all[seq(1,length(all),by=2)]
(1.12,2.14] (2.16,2.77] (3.18,3.65] (4.56,4.78]  (8.9,8.99]   (29.4,30] (30.3,31.1] (31.3,31.9] (36.9,40.3]   (50,55.1] 
     10          25          12          13          22           7           8          11          21          29

edited Sep 23 '13 at 14:31

answered Sep 23 '13 at 14:24

blmoore


Thank you for your help blmoore, I really appreciate it. The sum should include all the numbers whose times fall between t1 and t2; unfortunately the intervals are not constant and vary all the time. However, t3 is always in between t1[i] and t2[i]. For clarity, t1 and t2 are really the same vector, i.e. t1[i] and t1[i+1]; I only specified them separately so I could store the last value from my previous manipulations. There could be an enormous amount of numbers in between t1 and t2 (up to 100), and I have to sum them all up :(. – Pork Chop Sep 23 '13 at 14:49

Ah, in that case you only need to use `breaks=t1` in the call to `cut` and `all` will hold all of your results (no need for the skipping index) – blmoore Sep 23 '13 at 15:00
My times are at nanosecond intervals and they are not unique; I was trying to manipulate them. I do understand how you would go about it, but it doesn't seem to work: "'breaks' are not unique". I would specifically need to set a range with the breaks – Pork Chop Sep 23 '13 at 15:12
They're not unique breaks? In most applications they probably should be unless you want to double count observations? – blmoore Sep 23 '13 at 15:15
Thank you ever so much for all the help and for all of your insightful input, it really helped. I'm more used to C++ structure, hence I try to program that way, and I structured the question in this way. You're very good, thank you once again!!! – Pork Chop Sep 23 '13 at 17:28
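The `breaks=t1` suggestion in the comments above can be sketched as follows for contiguous intervals, where interval i runs from one boundary to the next. The names `edges`, `time` and `obs` and their values are made up for illustration; they are not the asker's data. Duplicated boundaries are what trigger the "'breaks' are not unique" error from `cut`, so this sketch drops them with `unique()` first. That is only one possible way to handle them, and it assumes a zero-width interval can safely be ignored.

```r
# Hypothetical boundaries with one repeated value (10 appears twice)
edges <- c(0, 10, 10, 20, 30)
time  <- c(1, 5, 12, 25, 29)
obs   <- c(2, 3, 4, 5, 6)

# cut() stops with "'breaks' are not unique" on duplicated breaks,
# so remove the duplicates before binning
breaks <- unique(sort(edges))

# One bin per consecutive pair of boundaries: (0,10], (10,20], (20,30]
sums <- tapply(obs, cut(time, breaks = breaks), FUN = sum)
sums
```

Each observation lands in exactly one bin, so nothing is double counted.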

score 2 · Accepted Answer · answered Sep 23 '13 at 16:03

k and j are the same in your loops, and the inner loop can be replaced with a vectorized version:

t3 <- head(t3,-1) # editing the error the OP left in place
nint <- length(t1)
N <- vector('list',nint)
Sums <- vector('integer',nint)
for (i in 1:nint){
    N[[i]] <- numbers[which(findInterval(t3,c(t1[i],t2[i]))==1)]
    Sums[i] <- sum(N[[i]])
}

Comment 1. This gives the same result as @blmoore's, with the numbers stored in N and then summed in Sums. You need N to be a list to do what you were intending, I think, while this line

N[i] <- numbers[i]

was overwriting a single value, instead of adding it to the vector as @holgrich did with c(N,numbers[i]).

Comment 2. findInterval can do unexpected things when t3 equals either t1[i] or t2[i], so you could instead use which(t3 > t1[i] & t3 < t2[i]) to state the inequalities explicitly.
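To see the boundary behaviour Comment 2 refers to, here is a small sketch with made-up values (`bounds` and `probe` are hypothetical, not the asker's data). With its default arguments, `findInterval` treats each interval as closed on the left and open on the right, so a time equal to the lower bound is counted and a time equal to the upper bound is not.

```r
# Hypothetical interval from 1 to 3 and three probe times
bounds <- c(1, 3)
probe  <- c(1, 2, 3)

# findInterval() returns how many bounds are <= each probe value
idx <- findInterval(probe, bounds)

# Only idx == 1 counts as "inside"; the probe at the upper bound is excluded
inside_fi <- idx == 1

# Explicit inequalities make the intended endpoints visible
inside_open <- probe > bounds[1] & probe < bounds[2]
```

Here `inside_fi` keeps the probe at 1 while `inside_open` drops it, which is the difference to watch for when times can coincide with a boundary.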

Comment 3. Going without loops entirely, as in @blmoore's answer, is the more standard thing to do in R.
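Combining Comments 2 and 3, a loop-free version with explicit inequalities might look like the sketch below. It uses the data from the question with the extra last element of t3 dropped, as above. The use of `vapply` is one possible choice made for this sketch rather than code from any of the answers.

```r
t1 <- c(1.12, 2.16, 3.18, 4.56, 8.90, 29.36, 30.30, 31.30, 36.90, 50.01)
t2 <- c(2.14, 2.77, 3.65, 4.78, 8.99, 30.01, 31.07, 31.89, 40.30, 55.08)
# t3 with the extra 17th element already removed
t3 <- c(1.16, 1.55, 1.35, 2.17, 2.18, 2.19, 2.34, 3.30, 4.59, 8.91,
        29.99, 30.32, 30.98, 31.32, 37.00, 52.00)
numbers <- c(7, 1, 2, 5, 5, 6, 9, 12, 13, 22, 7, 1, 7, 11, 21, 29)

# For each interval i, sum the numbers whose time lies strictly
# between t1[i] and t2[i]; an empty interval sums to 0
sums <- vapply(seq_along(t1),
               function(i) sum(numbers[t3 > t1[i] & t3 < t2[i]]),
               numeric(1))
sums
```

On this data the result is 10 25 12 13 22 7 8 11 21 29, the same sums as the `cut`/`tapply` output in the first answer.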

Thank you Frank for your help, I really appreciate it! I was decoding every line trying to understand why you went this way! All the best! – Pork Chop, Sep 23 '13 at 17:29

score 1 · Answer 3 · answered Sep 23 '13 at 14:14

You have to reset i and N while running your loops.

t1 <- c(1.12, 2.16, 3.18, 4.56, 8.90, 29.36, 30.30, 31.30, 36.90, 50.01)
t2 <- c(2.14, 2.77, 3.65, 4.78, 8.99, 30.01, 31.07, 31.89, 40.30, 55.08)

t3 <- c(1.16, 1.55, 1.35, 2.17, 2.18, 2.19, 2.34, 3.30, 4.59, 8.91, 29.99, 30.32, 30.98, 31.32, 37.00, 52.00, 54.00)
numbers <- c(7,1,2,5,5,6,9,12, 13, 22, 7, 1, 7, 11, 21, 29)

i = 1
j = 1
k = 1
N = c()
Sums = NULL

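# Use <= so the last interval is included, and index by numbers so that
# the extra 17th element of t3 never produces an NA
while (j <= length(t1)){
  while (i <= length(numbers)){
      if (t3[i] > t1[j] & t3[i] <= t2[j]) N <- c( N, numbers[i] )
      i = i + 1
  }
  i = 1          # restart the scan of t3 for the next interval
  Sums[k] = sum(N)
  N = c()        # clear the values collected for this interval
  k = k + 1
  j = j + 1
}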

Forgot to thank you for your solution, it was very useful. I'm really grateful that people like you exist and are happy to help out! – Pork Chop, Sep 23 '13 at 17:26
