I have 18 data frames called regular55, regular56, regular57, collar55, collar56, etc. In each data frame, I want to delete the first row of each nest.

Each data frame looks like this:

   nest interval
1    17    -8005
2    17      183
3    17      186
4    17      221
5    17      141
6    17       30
7    17      158
8    17       23
9    17      199
10   17       51
11   17      169
12   17      176
13   31      905
14   31      478
15   31       40
16   31      488
17   31       16
18   31      203
19   31       54
20   31      341
21   31       54
22   50   -14164
23   50       98
24   50     1438
25   71      240
26   71      725
27   71      819
28   85   -13935
29   85       45
30   85      589
31   85       47
32   85      161
33   85       67

The solution I came up with to avoid writing out the function for each one of the 18 data frames includes many nested loops:

for (i in 5:7){
  for (j in 5:7) {
    for (k in c("regular","collar")){
      for (l in c(unique(paste0(k,i,j,"$nest")))){
        paste0(k,i,j)=paste0(k,i,j)[(-c(which((paste0(k,i,j,"$nest")) == l )
[1])),]
}}}}

I'm basically selecting the first value at "which" there is a "unique" value of nest. However, I get:

Error in paste0(k, i, j)[(-c(which((paste0(k, i, j, "$nest")) == l)[1])),  : 
incorrect number of dimensions

It might be because "paste0(k,i,j)" is only considered as a character and not recognized as the name for a data frame.

Any ideas on how to fix this? Or any other ways to delete the first rows for each nest in every data frame?

  • 3
    The following will return all rows except the first row for each nest (assuming the data frame is called `df`): `library(dplyr); df %>% group_by(nest) %>% slice(2:n())`. You can package this into a function and run it on each data frame. Or, better yet, read your data frames into a list and use `lapply` to run the above code on each data frame in the list (e.g., `df.list.updated = lapply(df.list, function(d) d %>% group_by(nest) %>% slice(2:n()))`). – eipi10 Aug 14 '17 at 23:27
  • 1
    (1) *Strongly* agree with @eipi10, suggest you try [lists of data frames](http://stackoverflow.com/a/24376207/3358272) when dealing with multiple similarly-structured frames. (2) *"considered as a character and not recognized as the name"* ... see [`?get`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/get.html). (3) Since you are doing the same action to all, why not just `for (nm in c("regular55", ...)) { group/subset}`? The vector of names can be manually-derived or generated using `ls()` and some filters. – r2evans Aug 14 '17 at 23:35
  • (I edited my previous comment to account for grouping.) – r2evans Aug 14 '17 at 23:39
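
A minimal sketch of point (2) in the comment above, assuming two toy data frames named `regular55` and `collar55` (invented here for illustration). `paste0()` only builds a character string, so `get()` is needed to fetch the object with that name and `assign()` to write the result back. The `match(unique(...))` step is one possible way to find the first row of each nest, not necessarily what the asker intended.

```r
# Toy data, assumed for illustration only
regular55 <- data.frame(nest = c(17, 17, 17, 31, 31),
                        interval = c(-8005, 183, 186, 905, 478))
collar55  <- data.frame(nest = c(50, 50, 71),
                        interval = c(-14164, 98, 240))

for (nm in c("regular55", "collar55")) {
  d <- get(nm)                                # look up the data frame by its name
  first.rows <- match(unique(d$nest), d$nest) # index of the first row of each nest
  d <- d[-first.rows, , drop = FALSE]         # drop those rows
  assign(nm, d)                               # store the result under the same name
}
```

This keeps the separate objects, but the vector of names still has to be maintained by hand, which is part of why the comments recommend a list instead.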

1 Answer

Thanks to help from the comments, my problem was solved.

Originally, I divided my data frame using a for loop and then grouped it into one list:

for (i in 5:7) {
  for (j in 5:7) {
    for (k in c("regular","collar")){
    assign(paste0(k,i,j),
           df[df$x == i & df$y == j & df$z == k,])
}}}

df.list=mget(ls(pattern="^(regular|collar)[5-7][5-7]$"))

I later found a way to split my data frame directly into a list based on multiple columns (R subsetting a data frame into multiple data frames based on multiple column values):

df.list = split(df, interaction(df$x, df$y, df$z), drop = TRUE)
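
As a rough sketch of what this split produces, assuming a toy `df` whose columns `x`, `y`, and `z` hold the values that were used to name the original data frames (the data below are invented):

```r
# Toy data, assumed for illustration only
df <- data.frame(
  x = c(5, 5, 5, 6),
  y = c(5, 5, 7, 6),
  z = c("regular", "regular", "collar", "regular"),
  nest = c(17, 17, 31, 50),
  interval = c(-8005, 183, 905, 98)
)

# One list element per combination of x, y and z that occurs in the data;
# drop = TRUE discards combinations with no rows
df.list <- split(df, interaction(df$x, df$y, df$z), drop = TRUE)

names(df.list)  # element names take the form "x.y.z", e.g. "5.5.regular"
```

The elements are then reached by name, for example `df.list[["5.5.regular"]]`, rather than as separate objects in the workspace.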

Finally, I was able to apply the function to remove the first rows of each nest:

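library(dplyr)
# slice(-1) drops the first row of each nest; unlike slice(2:n()), it also handles nests with a single row
df.list.updated = lapply(df.list, function(d) d %>% group_by(nest) %>% 
slice(-1))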
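
For readers without dplyr, the same step can be sketched in base R, assuming each data frame has a `nest` column as in the question. The helper name `drop_first_per_nest` and the toy list are hypothetical; `ave()` numbers the rows within each nest, and only rows numbered above 1 are kept.

```r
# Toy list of data frames, assumed for illustration only
df.list <- list(
  a = data.frame(nest = c(17, 17, 31, 31, 31),
                 interval = c(-8005, 183, 905, 478, 40)),
  b = data.frame(nest = c(50, 71, 71),
                 interval = c(-14164, 240, 725))
)

# Hypothetical helper: remove the first row of every nest in one data frame
drop_first_per_nest <- function(d) {
  pos <- ave(seq_len(nrow(d)), d$nest, FUN = seq_along)  # row position within each nest
  d[pos > 1, , drop = FALSE]
}

df.list.updated <- lapply(df.list, drop_first_per_nest)
```

A nest with only one row (nest 50 in `b`) disappears entirely, which may or may not be what is wanted for the real data.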

It is definitely easier to work from a list of data frames.