
I've asked several questions about this and all the answers were really helpful, but once again my data is weird and I need help. Basically, I want to find the average speed within fixed time intervals; for example, from 6 s to 40 s my average speed would be 5 m/s, and so on. It was suggested that I use this code:

library(IRanges)
idx <- seq(1, ncol(data), by=2)
# idx is now 1, 3, 5. It will be passed one value at a time to `i`.
# that is, `i` will take values 1 first, then 3 and then 5 and each time
# the code within is executed.
o <- lapply(idx, function(i) {  
    ir1 <- IRanges(start=seq(0, max(data[[i]]), by=401), width=401)
    ir2 <- IRanges(start=data[[i]], width=1)
    t <- findOverlaps(ir1, ir2)
    d <- data.frame(mean=tapply(data[[i+1]], queryHits(t), mean))
    cbind(as.data.frame(ir1), d)
})

which gives this output:

# > o
# [[1]]
#   start end width mean
# 1     0 400   401 1.05
# 
# [[2]]
#   start end width mean
# 1     0 400   401  1.1
# 
# [[3]]
#   start end width     mean
# 1     0 400   401 1.383333

So if I wanted the intervals to be every 100 s, I'd just change `by=401` in the `ir1 <- ...` line to `by=100` (and, presumably, the matching `width=401` to `width=100`).
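To see what that change does to the bins, here is a rough base-R sketch (not the IRanges code itself). `bin_table` is a hypothetical helper made up for illustration, and 126.1 is simply the largest time in the first sample column of this question:

```r
# Hypothetical helper: list the bin boundaries that seq() produces
# for a given maximum time and bin width (half-open bins [start, end)).
bin_table <- function(max_time, width) {
  starts <- seq(0, max_time, by = width)
  data.frame(start = starts, end = starts + width)
}

bins_401 <- bin_table(126.1, 401)  # one bin covering everything
bins_100 <- bin_table(126.1, 100)  # two bins, starting at 0 and 100

print(bins_401)
print(bins_100)
```

With a width of 401 all of the sample data falls into a single bin, which would explain why the output above has only one row per list element.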

But my data is weird for a few reasons:

  1. My data doesn't always start at 0 s. Sometimes it starts at 20 s, depending on the specimen and whether it moves.
  2. My data is not collected at regular intervals (every 1 s, 2 s or 3 s). Sometimes I get data for 1-20 s and then nothing for 20-40 s, simply because the specimen does not move.
  3. I think the findOverlaps portion of the code affects my output. How can I get rid of it without disturbing the output?
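Point 2 can be reproduced without IRanges. The following is only a sketch in base R: it uses `findInterval()` as a stand-in for `findOverlaps()` (my substitution for illustration, not the original method), takes the first Time/Speed pair from the sample data in this question, and assumes a 20 s bin width (`bin_width` is an illustrative name). It shows that `tapply()` returns a mean only for bins that actually contain data, so there can be fewer means than bins:

```r
# First Time/Speed pair from the sample data in the question.
time  <- c(6.3, 11.3, 13.8, 14.1, 47.4, 49.2, 50.5, 57.1, 72.9,
           86.6, 88.5, 89, 94.8, 117.4, 123.7, 124.5, 126.1)
speed <- c(1.6, 1.3, 1.3, 1, 2.9, 0.7, 0.9, 1.3, 2.9,
           2.4, 3.4, 3, 2.9, 0.5, 3.2, 1, 2.8)

bin_width <- 20                              # assumed width, for illustration
starts <- seq(0, max(time), by = bin_width)  # bin start points: 0, 20, ..., 120

# Index of the bin each observation falls into (stand-in for findOverlaps).
bin_id <- findInterval(time, starts)

# Means exist only for bins that contain at least one observation.
means <- tapply(speed, bin_id, mean)

# Bins with no observations at all.
empty_bins <- setdiff(seq_along(starts), as.integer(names(means)))

length(starts)  # number of bins
length(means)   # number of bins that actually have data
empty_bins      # index of the bin(s) with no data
```

Because the number of means no longer matches the number of bins, binding the two together row by row cannot line up when a bin is empty.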

Here is some data to illustrate my troubles. Note that all of my real data runs to 2000 s.

Time    Speed   Time    Speed   Time    Speed
6.3     1.6     3.1     1.7     0.3     2.4
11.3    1.3     5.1     2.2     1.3     1.3
13.8    1.3     6.3     3.4     3.1     1.5
14.1    1.0     7.0     2.3     4.5     2.7
47.4    2.9     11.3    1.2     5.1     0.5
49.2    0.7     26.5    3.3     5.9     1.7
50.5    0.9     27.3    3.4     9.7     2.4
57.1    1.3     36.6    2.5     11.8    1.3
72.9    2.9     40.3    1.1     13.1    1.0
86.6    2.4     44.3    3.2     13.8    0.6
88.5    3.4     50.9    2.6     14.0    2.4
89.0    3.0     62.6    1.5     14.8    2.2
94.8    2.9     66.8    0.5     15.5    2.6
117.4   0.5     67.3    1.1     16.4    3.2
123.7   3.2     67.7    0.6     26.5    0.9
124.5   1.0     68.2    3.2     44.7    3.0
126.1   2.8     72.1    2.2     45.1    0.8

As you can see from the data, a series doesn't necessarily end on a round value such as 60 s; sometimes it ends at 57 s, for example.

EDIT: added dput of the data

structure(list(Time = c(6.3, 11.3, 13.8, 14.1, 47.4, 49.2, 50.5, 
57.1, 72.9, 86.6, 88.5, 89, 94.8, 117.4, 123.7, 124.5, 126.1), 
    Speed = c(1.6, 1.3, 1.3, 1, 2.9, 0.7, 0.9, 1.3, 2.9, 2.4, 
    3.4, 3, 2.9, 0.5, 3.2, 1, 2.8), Time.1 = c(3.1, 5.1, 6.3, 
    7, 11.3, 26.5, 27.3, 36.6, 40.3, 44.3, 50.9, 62.6, 66.8, 
    67.3, 67.7, 68.2, 72.1), Speed.1 = c(1.7, 2.2, 3.4, 2.3, 
    1.2, 3.3, 3.4, 2.5, 1.1, 3.2, 2.6, 1.5, 0.5, 1.1, 0.6, 3.2, 
    2.2), Time.2 = c(0.3, 1.3, 3.1, 4.5, 5.1, 5.9, 9.7, 11.8, 
    13.1, 13.8, 14, 14.8, 15.5, 16.4, 26.5, 44.7, 45.1), Speed.2 = c(2.4, 
    1.3, 1.5, 2.7, 0.5, 1.7, 2.4, 1.3, 1, 0.6, 2.4, 2.2, 2.6, 
    3.2, 0.9, 3, 0.8)), .Names = c("Time", "Speed", "Time.1", 
"Speed.1", "Time.2", "Speed.2"), class = "data.frame", row.names = c(NA, 
-17L))
agstudy

1 Answer


Sorry if I don't understand your question entirely. Could you explain why this example doesn't do what you're trying to do?

# use a pre-loaded data set
mtcars

# choose which variable to cut
var <- 'mpg'

# define groups, whether that be time or something else
# and choose how to cut it.
x <- cut( mtcars[ , var ] , c( -Inf , seq( 15 , 25 , by = 2.5 ) , Inf ) )

# look at your cut points, for every record
x 

# you can merge them back on to the mtcars data frame if you like..
mtcars$cutpoints <- x
# ..but that's not necessary

# find the mean within those groups
tapply( 
    mtcars[ , var ] , 
    x ,
    mean
)


# find the mean within groups, using a different variable
tapply( 
    mtcars[ , 'wt' ] , 
    x ,
    mean
)
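
For what it's worth, here is a minimal sketch of the same `cut()` / `tapply()` idea applied to the data from the question. It assumes each Time/Speed pair should be binned separately, that bins start at 0, and that a 20 s width is wanted (`bin_width` and `binned` are illustrative names, not from the question). Because `cut()` returns a factor that keeps every bin as a level, empty bins come back as `NA` instead of being dropped:

```r
# Data copied from the dput() in the question.
data <- data.frame(
  Time    = c(6.3, 11.3, 13.8, 14.1, 47.4, 49.2, 50.5, 57.1, 72.9,
              86.6, 88.5, 89, 94.8, 117.4, 123.7, 124.5, 126.1),
  Speed   = c(1.6, 1.3, 1.3, 1, 2.9, 0.7, 0.9, 1.3, 2.9,
              2.4, 3.4, 3, 2.9, 0.5, 3.2, 1, 2.8),
  Time.1  = c(3.1, 5.1, 6.3, 7, 11.3, 26.5, 27.3, 36.6, 40.3,
              44.3, 50.9, 62.6, 66.8, 67.3, 67.7, 68.2, 72.1),
  Speed.1 = c(1.7, 2.2, 3.4, 2.3, 1.2, 3.3, 3.4, 2.5, 1.1,
              3.2, 2.6, 1.5, 0.5, 1.1, 0.6, 3.2, 2.2),
  Time.2  = c(0.3, 1.3, 3.1, 4.5, 5.1, 5.9, 9.7, 11.8, 13.1,
              13.8, 14, 14.8, 15.5, 16.4, 26.5, 44.7, 45.1),
  Speed.2 = c(2.4, 1.3, 1.5, 2.7, 0.5, 1.7, 2.4, 1.3, 1,
              0.6, 2.4, 2.2, 2.6, 3.2, 0.9, 3, 0.8)
)

bin_width <- 20  # assumed bin width in seconds

# Time columns sit at positions 1, 3, 5; the matching speed column follows each.
binned <- lapply(seq(1, ncol(data), by = 2), function(i) {
  time  <- data[[i]]
  speed <- data[[i + 1]]

  # Breaks run from 0 up to the first multiple of bin_width >= max(time).
  breaks <- seq(0, ceiling(max(time) / bin_width) * bin_width, by = bin_width)

  # Bins are [start, end); the last bin also includes its right edge.
  bins <- cut(time, breaks = breaks, right = FALSE, include.lowest = TRUE)

  data.frame(
    start = head(breaks, -1),
    end   = tail(breaks, -1),
    mean  = as.vector(tapply(speed, bins, mean))  # NA for empty bins
  )
})

binned
```

Changing `bin_width` to 100 or 401 gives coarser bins, and each series gets only as many bins as it needs to reach its own last time point.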
Anthony Damico
  • By "variable to cut", do you mean Time etc.? – Marco De Niro Feb 23 '13 at 15:26
  • @MarcoDeNiro I think that's what you want? Copy and paste my code into R and tell me why those results aren't what you need. :) – Anthony Damico Feb 23 '13 at 15:29
  • Thanks Anthony, but I do prefer the output from the other code. However, I think the findOverlaps function messes some of my data up. What does findOverlaps do exactly? I say this because the code I mentioned previously works well when I'm only handling one set of data, meaning I only have one column of time and one of speed. – Marco De Niro Feb 23 '13 at 15:32
  • @MarcoDeNiro then you need to provide a [reproducible example - click here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Anthony Damico Feb 23 '13 at 15:34
  • The OP gave a reproducible example! Could you explain why this answer does what the OP is trying to do? – agstudy Feb 23 '13 at 15:46
  • @agstudy I disagree that it's reproducible :) There's no `data` and no `findOverlaps` function definition. I just took a shot in the dark, hoping the OP wanted groupwise means. – Anthony Damico Feb 23 '13 at 15:50
  • @AnthonyDamico you're right, it is not directly reproducible. I've updated the OP's question to add the missing facts. – agstudy Feb 23 '13 at 15:57