
I have a question that is probably really basic and, for most people, easy to answer.

Imagine a simple, ordinary data frame with 20 rows (the columns don't matter in this example). Is there a way to get all the rows that follow a specific repeating selection pattern? E.g.: I want the first 3 rows, skip the next 5, then get the following 3 rows after the skipped ones, then skip the next 5 rows again, and so on until the end of the data frame is reached. In the end I want those rows and their specific column.

Basically: RowsOfInterest, SkipThisAmountOfRows, RowsOfInterest, SkipThisAmountOfRows, for example: 1:3, 5, the next 1:3 (after the 5 skipped rows), 5, 1:3, and so on.

Help would be appreciated - thanks in advance!

MaxMana
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jan 16 '20 at 16:34

2 Answers


You can create a logical vector containing the pattern (e.g. 3 TRUEs followed by 5 FALSEs). Because it is a logical vector, it will automatically be recycled (repeated) to cover the number of rows in your df when you subset with it.

df <- data.frame(rownum = 1:20, anothercol = letters[1:20])

df[rep(c(TRUE, FALSE), c(3, 5)),]
#    rownum anothercol
# 1       1          a
# 2       2          b
# 3       3          c
# 9       9          i
# 10     10          j
# 11     11          k
# 17     17          q
# 18     18          r
# 19     19          s
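
The same idea can be wrapped in a small helper so the group sizes are not hard-coded. This is only a minimal sketch: `select_pattern`, `keep`, and `skip` are hypothetical names that do not appear in the answer, and the mask is built explicitly with `rep_len()` rather than relying on implicit recycling.

```r
# Hypothetical helper: keep `keep` rows, skip `skip` rows, repeat to the end.
select_pattern <- function(df, keep, skip) {
  # One cycle of the pattern, e.g. TRUE TRUE TRUE followed by five FALSEs
  cycle <- rep(c(TRUE, FALSE), c(keep, skip))
  # Repeat (and truncate) the cycle to exactly nrow(df) elements
  mask <- rep_len(cycle, nrow(df))
  # drop = FALSE keeps the result a data frame even with a single column
  df[mask, , drop = FALSE]
}

df <- data.frame(rownum = 1:20, anothercol = letters[1:20])
result <- select_pattern(df, keep = 3, skip = 5)
result$rownum
```

Making the mask length explicit should give the same rows as the recycled version above, while keeping the intent visible to someone reading the code later.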
IceCreamToucan

It may be easier to think of this in terms of modular arithmetic.

You have a pattern that repeats every 8 rows, so consider the row number modulo 8:

df[seq_len(nrow(df)) %% 8L %in% 1:3, ]

seq_len(nrow(df)) creates a vector 1, 2, 3, ..., nrow(df).
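
To see why this works, it can help to look at the intermediate values. The sketch below assumes 20 rows, as in the question; the variable names (`n`, `idx`, `remainder`, `keep`, `keep_zero_based`) are made up for illustration.

```r
n <- 20L                      # assumed row count, matching the question
idx <- seq_len(n)             # 1, 2, ..., 20
remainder <- idx %% 8L        # 1 2 3 4 5 6 7 0, repeating every 8 rows
keep <- remainder %in% 1:3    # TRUE only for the first three rows of each block
which(keep)

# An equivalent zero-based variant: position within the block is 0..7,
# so "first three rows" becomes "position less than 3".
keep_zero_based <- (idx - 1L) %% 8L < 3L
identical(keep, keep_zero_based)
```

The zero-based form may be easier to generalize, since the number of rows to keep appears as a single value instead of a range.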

In data.table (i.e. if df is a data.table), this could be slightly cleaner:

df[1:.N %% 8L %in% 1:3]

This also makes it clearer that there's a bit of an order-of-operations question: which comes first, %% or %in%? Both are %any% operators with equal precedence, so %% is applied first. This is in ?Syntax:

Within an expression operators of equal precedence are evaluated from left to right...
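
A quick way to convince yourself of the grouping is to compare the expression with an explicitly parenthesized version. This is just a small check on a plain integer vector `x` (an assumed stand-in for the row numbers), not part of the original answer.

```r
x <- 1:20                          # stand-in for seq_len(nrow(df))
implicit <- x %% 8L %in% 1:3       # no parentheses, evaluated left to right
explicit <- (x %% 8L) %in% 1:3     # the grouping written out
identical(implicit, explicit)
```

If the precedence ever feels unclear to a reader, writing the parentheses out costs nothing and does not change the result.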

MichaelChirico