
R - I have a data frame with a column containing 0s and 1s. I have found the row indices at which the value toggles, and now I want to split the data into separate data frames at those row IDs. This is the data:

row id   mode 
1          0
2          0
3          1
4          1
5          0
6          0
7          0
8          1
9          1
10         1

After splitting the data frame, there should be 4 new data frames:

y[1]
row id   mode 
1           0
2           0

y[2]
row id     mode 
3            1 
4            1

y[3]
row id      mode 
5            0
6            0
7            0

And so on.

jogo

1 Answer


We can create a grouping variable based on the difference between adjacent elements in 'mode' and split the dataset on that variable:

split(df1, cumsum(c(TRUE, diff(df1$mode)!=0)))
#$`1`
#  row id mode
#1      1    0
#2      2    0

#$`2`
#  row id mode
#3      3    1
#4      4    1

#$`3`
#  row id mode
#5      5    0
#6      6    0
#7      7    0

#$`4`
#   row id mode
#8       8    1
#9       9    1
#10     10    1
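To see why this works, the one-liner can be unpacked into its intermediate steps. This is only an illustrative sketch: the names `changed`, `grp` and `y` are introduced here and are not part of the answer, and `df1` is rebuilt inline so the snippet runs on its own.

```r
# Same example data as in the question
df1 <- data.frame(`row id` = 1:10,
                  mode = c(0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L),
                  check.names = FALSE)

# TRUE wherever 'mode' differs from the previous row; the first row
# always starts a new run, hence the leading TRUE
changed <- c(TRUE, diff(df1$mode) != 0)

# Running count of run starts = run number for every row
grp <- cumsum(changed)
grp
# [1] 1 1 2 2 3 3 3 4 4 4

# One data frame per run, returned as a named list
y <- split(df1, grp)
length(y)   # 4
y[["3"]]    # rows 5 to 7
```

Because `split` returns a list, the individual pieces are reached with `y[[1]]`, `y[[2]]` and so on.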

Or another option is to use rleid from data.table

library(data.table)
split(df1, rleid(df1$mode))

Or using rle from base R

split(df1, with(rle(df1$mode), rep(seq_along(values), lengths)))
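As a quick check, the `rle` version can also be broken into steps and compared with the `diff`/`cumsum` grouping. The sketch below uses a bare vector with the same values as `df1$mode`; the names `r`, `grp_rle` and `grp_diff` are introduced only for this illustration.

```r
mode <- c(0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L)

# Run-length encoding: one value and one length per run
r <- rle(mode)
r$values    # 0 1 0 1
r$lengths   # 2 2 3 3

# Repeat each run's index as many times as the run is long
grp_rle <- rep(seq_along(r$values), r$lengths)

# Grouping from the diff/cumsum approach, for comparison
grp_diff <- cumsum(c(TRUE, diff(mode) != 0))

all(grp_rle == grp_diff)   # TRUE
```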

data

df1 <- structure(list(`row id` = 1:10, mode = c(0L, 0L, 1L, 1L, 0L, 
0L, 0L, 1L, 1L, 1L)), .Names = c("row id", "mode"),
 class = "data.frame", row.names = c(NA, -10L))
akrun
  • Hey, thanks a lot @akrun, it's working perfectly. But what I actually want to do, after making the individual data frames, is run a rollapply function on each of them to get the min, max and mean. There will be another column called "expression". – VINEETH KUDUVALLI Apr 30 '17 at 10:02
  • @VINEETHKUDUVALLI For that you don't need to split it up. You can do a group by operation. i.e. `library(RcppRoll);library(data.table);setDT(df1)[, .(Rmax = roll_max(mode), Rsum = roll_sum(mode), Rmean = roll_mean(mode)), .(grp = rleid(mode))]` – akrun Apr 30 '17 at 10:05
  • So basically, I need the mean, max and min of expression, so instead of roll_max(mode) it will be roll_max(expression), and the rleid function will still take mode. Great, thanks a lot for the help. Cheers. – VINEETH KUDUVALLI Apr 30 '17 at 10:10
  • @VINEETHKUDUVALLI It is not clear to me. Can you post it as a new question with expected output – akrun Apr 30 '17 at 10:10
  • Here is the link to the new question I have asked: http://stackoverflow.com/questions/43705356/r-programming-subsetting-data-finding-max-min-mean-and-plotting-it – VINEETH KUDUVALLI Apr 30 '17 at 10:21
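For the follow-up in the comments, a per-run summary can be sketched in base R without any extra packages. Note the assumptions: the `expression` values below are made up purely for illustration (the thread never shows them), the name `stats` is hypothetical, and this computes one min/max/mean per run rather than a rolling window as `rollapply` or RcppRoll would.

```r
df1 <- data.frame(
  mode       = c(0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L),
  expression = c(2, 4, 1, 3, 5, 7, 9, 6, 6, 12)   # made-up values
)

# Run number for every row, as in the answer
df1$grp <- cumsum(c(TRUE, diff(df1$mode) != 0))

# One summary row per run of identical 'mode' values
stats <- do.call(rbind, lapply(split(df1, df1$grp), function(d) {
  data.frame(grp  = d$grp[1],
             mode = d$mode[1],
             min  = min(d$expression),
             max  = max(d$expression),
             mean = mean(d$expression))
}))
stats
```

With these made-up values the four runs have means 3, 2, 7 and 8.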