Create a function that replaces values where n < 5 with a random integer between 1 and 4

Question

I am quite new to R and have run into a problem I apparently can't solve by myself. It should be fairly easy, though.

I aim to write a generic function that manipulates column n in data frame df. It should perform a simple task: for each row where n < 5, replace that value with a random integer between 1 and 4.

df <- data.frame(n= 1:10, y = letters[1:10],
                 stringsAsFactors = FALSE)

What is the most elegant solution?

Did you read about `?replace` ? Here are some [“replace” function examples](https://stackoverflow.com/questions/11811027/replace-function-examples) — markus, May 15 '19 at 20:42

akrun · Accepted Answer · 2019-05-15T20:45:36.917

One way to do this is to create a logical index based on the column, subset the column by that index, and assign the sampled values:

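f1 <- function(dat, col) {
  # logical index: TRUE where the column value is below 5
  i1 <- dat[[col]] < 5
  # draw one random integer from 1:4 for each TRUE position
  dat[[col]][i1] <- sample(1:4, sum(i1), replace = TRUE)
  dat
}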

f1(df, "n")
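Because the replacements are random, one quick sanity check is to fix the seed and confirm that only the values below 5 change. This is a minimal sketch that repeats `df` and `f1` from above so it runs on its own; the seed value `42` and the name `out` are arbitrary choices for illustration, not part of the original answer.

```r
df <- data.frame(n = 1:10, y = letters[1:10],
                 stringsAsFactors = FALSE)

f1 <- function(dat, col) {
  i1 <- dat[[col]] < 5
  dat[[col]][i1] <- sample(1:4, sum(i1), replace = TRUE)
  dat
}

set.seed(42)  # arbitrary seed, only to make the run reproducible
out <- f1(df, "n")

# values that were already >= 5 are left untouched
identical(out$n[5:10], df$n[5:10])  # TRUE

# every replaced value is an integer in 1..4
all(out$n[1:4] %in% 1:4)            # TRUE
```

Note that `f1` returns a modified copy; `df` itself is unchanged unless the result is assigned back, e.g. `df <- f1(df, "n")`.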

edited May 15 '19 at 20:45

answered May 15 '19 at 20:43

Thanks, very cool. I can follow everything apart from sum(i1). What does the sum function do in this context? – Henri May 15 '19 at 20:58

@Henrik The second parameter of `sample()` is `size`, the length of the random vector you want to generate. `i1` is a logical vector of `TRUE FALSE TRUE TRUE ...` values. Calling `sum` on a logical vector counts the *number of `TRUE` values* in it (`FALSE` is `0`, `TRUE` is `1`). So summing `i1` gives the number of values in `dat[[col]]` that are `< 5`, which tells `sample()` how many random values from `[1, 4]` to generate. – utubun May 15 '19 at 21:16
@Henrik As utubun suggested, the `sum` is just for counting the number of elements that are less than 5. – akrun May 16 '19 at 01:42
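
The counting behaviour described in these comments can be checked in isolation. A small sketch, using a standalone vector `n` in place of `dat[[col]]` (the variable names here are only for illustration):

```r
n  <- 1:10
i1 <- n < 5   # logical vector: TRUE for 1..4, FALSE for 5..10

sum(i1)       # 4, because each TRUE counts as 1 and each FALSE as 0

# sample() is asked for exactly that many values
length(sample(1:4, sum(i1), replace = TRUE))  # 4
```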
