I'm trying to find a way to subset the first 30 groups in my data frame (171 in total, of unequal length). Here's a smaller dummy data frame I've been practicing with (in this case I only try to subsample the first 3 groups):

groups <- c(rep("A", times = 5), rep("B", times = 2), rep("C", times = 3), rep("D", times = 2), rep("E", times = 8))
value <- c(1, 2, 4, 3, 5, 7, 6, 8, 7, 5, 2, 3, 5, 7, 1, 1, 2, 3, 5, 4)
dummy <- data.frame(groups, value)

So far, I've tried variations of:

subset <- c("A", "B", "C")
dummy2 <- dummy[dummy$groups == subset, ]

but I get the following warning: longer object length is not a multiple of shorter object length

Would anyone know how to fix this or have other options?

Cam

1 Answer

We can use filter from dplyr. Get the first n unique elements of groups with head, then use %in% inside filter to build a logical vector that selects the matching rows:

library(dplyr)
n <- 4
dummy %>% 
     filter(groups %in% head(unique(groups), n))

Or use subset in base R:

subset(dummy, groups %in% head(unique(groups), n))
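As a quick check against the question's dummy data, here is a minimal base R sketch with n set to 3 (the question's small case rather than the 4 used above). The names keep and dummy2 are only illustrative, and bracket indexing is used in place of subset so the snippet needs no packages.

```r
# Rebuild the dummy data from the question
groups <- c(rep("A", times = 5), rep("B", times = 2), rep("C", times = 3),
            rep("D", times = 2), rep("E", times = 8))
value <- c(1, 2, 4, 3, 5, 7, 6, 8, 7, 5, 2, 3, 5, 7, 1, 1, 2, 3, 5, 4)
dummy <- data.frame(groups, value)

n <- 3                                  # assumed here: the question's 3-group case
keep <- head(unique(dummy$groups), n)   # first n groups in order of appearance
dummy2 <- dummy[dummy$groups %in% keep, ]

unique(as.character(dummy2$groups))     # groups that survived the filter
nrow(dummy2)                            # number of rows kept
```

For the real data, setting n to 30 should keep every row belonging to the first 30 groups, whatever their individual lengths.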

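== compares elementwise, so it behaves as expected only when both vectors have the same length or when the second vector has length 1. Otherwise the shorter vector is recycled, which is what triggers the warning in the question (20 rows is not a multiple of 3 values). To test membership in a set of several values, use %in%.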
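The difference can be seen directly on the question's grouping vector. This is only an illustrative sketch: wanted, eq_match and in_match are made-up names, and suppressWarnings is there just to keep the recycling warning out of the output.

```r
groups <- c(rep("A", 5), rep("B", 2), rep("C", 3), rep("D", 2), rep("E", 8))
wanted <- c("A", "B", "C")

# == recycles `wanted` to length 20 (A, B, C, A, B, C, ...), so element i of
# `groups` is compared only with wanted[(i - 1) %% 3 + 1], not with all three.
eq_match <- suppressWarnings(groups == wanted)

# %in% asks, for each element of `groups`, whether it occurs anywhere in `wanted`.
in_match <- groups %in% wanted

sum(eq_match)   # rows that happen to line up with the recycled pattern
sum(in_match)   # rows whose group is A, B or C
```

With ==, a row is kept only when its group happens to coincide with the recycled value at that position, so most rows from A, B and C are silently dropped.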

akrun