I have a data frame

> df=data.frame(Col1=seq(0,24,by=4),x=rnorm(7),y=rnorm(7,50)); df
  Col1            x        y
1    0 -0.107046196 49.96748
2    4 -0.001515573 50.02819
3    8 -1.884417429 49.80308
4   12  1.692774467 50.45827
5   16 -0.907602775 51.14937
6   20  0.166186536 49.17502
7   24  0.420263825 49.56720

and a variable

t=2

and I want to find the interval of Col1 in which it falls (rows 1 and 2 in this example), and then use its relative position within that interval to calculate the corresponding values of x and y, i.e. first take the subset

  Col1            x        y
1    0 -0.107046196 49.96748
2    4 -0.001515573 50.02819

then compute the ratio (t-0)/(4-0) from the value of t, and use that ratio to calculate the corresponding position in x and y.
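To make the arithmetic concrete, here is a minimal sketch using the two rows shown above. It assumes that "position" means linear interpolation between the two rows; the names lo, hi, ratio, x_t and y_t are only for illustration.

```r
# Bracketing rows for t = 2, copied from the example output above
lo <- c(Col1 = 0, x = -0.107046196, y = 49.96748)
hi <- c(Col1 = 4, x = -0.001515573, y = 50.02819)
t  <- 2

# Fraction of the way from the lower to the upper Col1 value: (2 - 0) / (4 - 0)
ratio <- (t - lo[["Col1"]]) / (hi[["Col1"]] - lo[["Col1"]])

# Assumed meaning of "position": linear interpolation between the two rows
x_t <- lo[["x"]] + ratio * (hi[["x"]] - lo[["x"]])
y_t <- lo[["y"]] + ratio * (hi[["y"]] - lo[["y"]])
```

Here ratio is 0.5, so x_t and y_t are simply the midpoints of the two rows.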

I found a find function in MATLAB (see "Find which interval a point B is located in Matlab") and wonder whether there is a similar function in R.

Specifically, is there a way to determine which interval a variable falls under? And once I find that interval, a way to extract the subset of data?

Currently I can only think of the %in% operator, which only tests for exact matches:

> t %in% df$Col1
[1] FALSE

To clarify, I have tried

> z=NULL
> for(i in 1:(nrow(df)-1)){
+   z[[i]]=df$Col1[i]:df$Col1[i+1]
+ }
> w=NULL
> for(i in 1:length(z)){
+   w=c(w,t %in% z[[i]])
+ }
> v=which(w==1)
> df[v:(v+1),]
  Col1        x        y
1    0 1.076101 50.17514
2    4 1.971503 47.81647
> 

and I hope there is a more concise approach, as my real data has >1M rows.

frank
  • what do you mean by `subset of the data under which it falls`? Not clear how `t=2` is satisfying the rows 1 and 2 as output. – Aramis7d Aug 10 '17 at 08:41
  • are you trying to check if `t` is in the interval defined by two consecutive rows of `Col1`? if yes, then are we taking that the data is sorted on column 1 increasingly, and that there can never be duplicated values in column1? – Aramis7d Aug 10 '17 at 08:43
  • I am using the value of t based solely upon column1, and trying to determine which interval it will be. There are no duplicates – frank Aug 10 '17 at 08:47
  • `i <- cut(t, df$Col1) ; (t - df$Col1[i]) / diff(df$Col1)[i]` should get you started. See `?cut` – Aurèle Aug 10 '17 at 08:48
  • Possible duplicate of [findInterval() with right-closed intervals](https://stackoverflow.com/questions/13482872/findinterval-with-right-closed-intervals) – Aramis7d Aug 10 '17 at 09:57

1 Answer

Try the code below and see whether it gives you the expected results:

 dataframe=data.frame(Col1=seq(0,24,by=4),x=rnorm(7),y=rnorm(7,50))
 # findInterval() gives the index of the interval's lower endpoint; v+1 is the upper one
 funfun=function(x){v=findInterval(x,dataframe$Col1);c(v,v+1)}
 dataframe[funfun(2),]
   Col1        x        y
 1    0 0.831266 50.28246
 2    4 1.751892 48.78810
 dataframe[funfun(10),]
   Col1          x        y
 3    8  0.2624929 48.33945
 4   12 -0.2243066 51.11304
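Building on this, the ratio and the interpolated values asked for in the question could be computed along the following lines. This is only a sketch: interp_at is a hypothetical helper name, it assumes Col1 is sorted increasingly with no duplicates (as stated in the comments), and it assumes linear interpolation is what is wanted. The argument all.inside = TRUE keeps the index between 1 and nrow - 1, so that i + 1 never runs past the last row.

```r
set.seed(1)  # only so that the example is reproducible
dataframe <- data.frame(Col1 = seq(0, 24, by = 4), x = rnorm(7), y = rnorm(7, 50))

# Hypothetical helper: locate the interval containing t, then interpolate x and y
interp_at <- function(df, t) {
  # Index of the interval's lower endpoint; all.inside = TRUE caps it at nrow(df) - 1
  i <- findInterval(t, df$Col1, all.inside = TRUE)
  # Fraction of the way through the interval, e.g. (2 - 0) / (4 - 0)
  ratio <- (t - df$Col1[i]) / (df$Col1[i + 1] - df$Col1[i])
  data.frame(t = t,
             row = i,
             ratio = ratio,
             x = df$x[i] + ratio * (df$x[i + 1] - df$x[i]),
             y = df$y[i] + ratio * (df$y[i + 1] - df$y[i]))
}

res <- interp_at(dataframe, c(2, 10))

# Base R's approx() does the same linear interpolation and can serve as a cross-check
chk <- approx(dataframe$Col1, dataframe$x, xout = c(2, 10))$y
```

Since findInterval() is vectorised, t can be a whole vector and no explicit loop over the rows is needed, which should matter for data with >1M rows. Note that for values of t outside the range of Col1 this sketch extrapolates from the first or last interval, whereas approx() returns NA by default.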

If this helps, please let us know. Thank you.

Onyambu