Suppose I have a matrix (or dataframe):

1  5  8
3  4  9
3  9  6
6  9  3
3  1  2
4  7  2
3  8  6
3  2  7

I would like to select only the first three rows that have "3" as their first entry, as follows:

3  4  9
3  9  6
3  1  2

It is clear to me how to pull out all rows that begin with "3" and it is clear how to pull out just the first row that begins with "3."

But in general, how can I extract the first n rows that begin with "3"?

Furthermore, how can I select just the 3rd and 4th appearances, as follows:

3  1  2
3  8  6
Jaap
el_dewey

3 Answers

5

Without the need for an extra package:

mydf[mydf$V1==3,][1:3,]

results in:

  V1 V2 V3
2  3  4  9
3  3  9  6
5  3  1  2

When you need the third and fourth matching rows:

mydf[mydf$V1==3,][3:4,]
# or:
mydf[mydf$V1==3,][c(3,4),]
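One caveat with the indexing above: if fewer rows match than the positions requested, `[1:3,]` pads the result with all-NA rows. A minimal sketch of a more general helper follows, assuming positive occurrence numbers; the name `pick_matches` and its arguments are hypothetical and not part of base R.

```r
# Sample data from the question.
mydf <- data.frame(V1 = c(1L, 3L, 3L, 6L, 3L, 4L, 3L, 3L),
                   V2 = c(5L, 4L, 9L, 9L, 1L, 7L, 8L, 2L),
                   V3 = c(8L, 9L, 6L, 3L, 2L, 2L, 6L, 7L))

# Hypothetical helper: return the k-th matching rows of df, where
# `match` is a logical vector and `k` holds the wanted occurrence numbers.
pick_matches <- function(df, match, k) {
  idx <- which(match)         # row positions of all matches, in order (NA counts as no match)
  k <- k[k <= length(idx)]    # drop occurrence numbers that do not exist
  df[idx[k], , drop = FALSE]
}

first_three  <- pick_matches(mydf, mydf$V1 == 3, 1:3)  # original rows 2, 3, 5
third_fourth <- pick_matches(mydf, mydf$V1 == 3, 3:4)  # original rows 5, 7
```

Because the helper works on row positions from `which()`, the original row names are preserved, and asking for more occurrences than exist simply returns the rows that are available.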

Used data:

mydf <- structure(list(V1 = c(1L, 3L, 3L, 6L, 3L, 4L, 3L, 3L), 
                       V2 = c(5L, 4L, 9L, 9L, 1L, 7L, 8L, 2L), 
                       V3 = c(8L, 9L, 6L, 3L, 2L, 2L, 6L, 7L)), 
                  .Names = c("V1", "V2", "V3"), class = "data.frame", row.names = c(NA, -8L))

Bonus material: besides dplyr, you can also do this very efficiently with data.table (see this answer for speed comparisons on large datasets for the different data.table methods):

library(data.table)
# setDT() converts mydf to a data.table by reference
setDT(mydf)[V1==3, head(.SD,3)]
# or:
setDT(mydf)[V1==3, .SD[1:3]]
Jaap
2

You can do something like this with dplyr to extract the first three rows for each unique value of that column:

library(dplyr)
df %>% arrange(columnName) %>% group_by(columnName) %>% slice(1:3)

If you want to extract only the first three rows where the value of that column is 3, you can try:

df %>% filter(columnName == 3) %>% slice(1:3)

If you want specific rows, you can pass their positions to slice as a vector, for example slice(c(3, 4)).
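Applied to the sample data from the question, the filter-then-slice approach might look like the sketch below. It assumes the column is named `V1` (as in the data used in the other answers) and that dplyr is installed.

```r
library(dplyr)

# Sample data from the question; column names V1..V3 are assumed.
mydf <- data.frame(V1 = c(1L, 3L, 3L, 6L, 3L, 4L, 3L, 3L),
                   V2 = c(5L, 4L, 9L, 9L, 1L, 7L, 8L, 2L),
                   V3 = c(8L, 9L, 6L, 3L, 2L, 2L, 6L, 7L))

# First three rows whose first column equals 3.
first_three <- mydf %>% filter(V1 == 3) %>% slice(1:3)

# Third and fourth rows whose first column equals 3.
third_fourth <- mydf %>% filter(V1 == 3) %>% slice(c(3, 4))
```

Note that `slice()` indexes into the filtered result, so position 3 means the third match rather than the third row of the original data.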

Gopala
1

We could also use subset together with head:

head(subset(mydf, V1==3),3)

Update

If we also need to extract the row directly below each row where V1==3:

i1 <- with(mydf, V1==3)
mydf[sort(unique(c(which(i1),pmin(which(i1)+1L, nrow(mydf))))),]
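The one-liner above packs several steps together. Broken into intermediate variables on the question's sample data, it can be read as in this sketch (the names `hits`, `below` and `keep` are introduced here only for illustration):

```r
# Sample data from the question.
mydf <- data.frame(V1 = c(1L, 3L, 3L, 6L, 3L, 4L, 3L, 3L),
                   V2 = c(5L, 4L, 9L, 9L, 1L, 7L, 8L, 2L),
                   V3 = c(8L, 9L, 6L, 3L, 2L, 2L, 6L, 7L))

i1    <- mydf$V1 == 3
hits  <- which(i1)                     # positions of matching rows: 2 3 5 7 8
below <- pmin(hits + 1L, nrow(mydf))   # row under each match, clamped to the last row: 3 4 6 8 8
keep  <- sort(unique(c(hits, below)))  # merge, de-duplicate, restore order: 2 3 4 5 6 7 8
res   <- mydf[keep, ]
```

The `pmin()` call is what prevents an out-of-range index when the last row of the data is itself a match, and `unique()` keeps a row from appearing twice when it is both a match and the row below another match.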
akrun

Thank you for your input. This works perfectly! Now suppose I'd like to extract each row where (ColumnName == 3) AND the one row underneath each row that fits the condition, regardless of its contents. – el_dewey Jan 15 '16 at 16:01