
I am trying to run a very simple piece of code inside a function, but despite the return statement I get the following error:

Error in `$<-.data.frame`(`*tmp*`, column_name, value = character(0)) : replacement has 0 rows, data has 56

I am working with a 20x3 df, in which I want to replace certain values with new values, using the function below:

# Create a reproducible data frame called df_old in which I want to perform changes on the variable type
set.seed(42)
n <- 6
df_old <- data.frame(id=1:n, 
                  date=seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), "day"),
                  group=rep(LETTERS[1:2], n/2),
                  age=sample(18:30, n, replace=TRUE),
                  type=factor(paste("type", 1:n)),
                  x=rnorm(n))

df_old
#   id       date group age   type         x
# 1  1 2020-12-26     A  27 type 1 0.0356312
# 2  2 2020-12-27     B  19 type 2 1.3149588
# 3  3 2020-12-28     A  20 type 3 0.9781675
# 4  4 2020-12-29     B  26 type 4 0.8817912
# 5  5 2020-12-30     A  26 type 5 0.4822047
# 6  6 2020-12-31     B  28 type 6 0.9657529


# Create a function
f.rename.values <- function(df, column_name, old_value, new_value){
  df$column_name[df$column_name == old_value] <- new_value
  return(df)
}

# Call function to change values in the column called type in df_old
type = df_old$type
df_new <- f.rename.values(df_old, type, 3, other_type)

Here are my questions:

  1. Why does R throw an error?
  2. Why does R return df_new that is the same as df_old?
Konrad Rudolph
msgh

1 Answer


Your code is attempting to access the non-existent column column_name in df. column_name isn’t a variable name here — it is the column name. That’s just how $ works.
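To make the difference concrete, here is a minimal sketch using a small made-up data frame (the names df, via_dollar and via_bracket are for illustration only and are not from the question). It compares what $ and [[ return when the column name is stored in a variable:

```r
# Toy data frame, invented for this illustration
df <- data.frame(type = c("a", "b", "c"), stringsAsFactors = FALSE)
column_name <- "type"

# `$` takes the name literally: it looks for a column called "column_name"
via_dollar <- df$column_name

# `[[` evaluates its argument first, so it looks up the column "type"
via_bracket <- df[[column_name]]

is.null(via_dollar)  # TRUE: there is no column with that literal name
via_bracket          # "a" "b" "c"
```

Because df$column_name is NULL, assigning into it produces a zero-length replacement, which is what the "replacement has 0 rows" error is complaining about.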

You can use [[ if you want to access a variable column name, and you can use [ to access cells inside a table based on a column and row selection.

f_rename_values <- function(df, column_name, old_value, new_value) {
  df[df[[column_name]] == old_value, column_name] <- new_value
  df
}

And call it like this:

df_new <- f_rename_values(df_old, 'type', 'type 3', 'new_value')

Note that I’ve changed some of your variable names: what you called old_name and new_name weren’t names, they were values. This made the code (and the question) quite confusing.

Furthermore, note that the column in your example is a factor, so the replacement value must be a valid value for that factor, otherwise (as in the example above) you’ll get an NA.
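As a rough illustration of the factor issue, the sketch below uses a cut-down, three-row version of the data (an assumption made for brevity) and shows two possible ways around the NA: adding the new level before replacing, or renaming the level itself. Neither is the only way to handle it.

```r
f_rename_values <- function(df, column_name, old_value, new_value) {
  df[df[[column_name]] == old_value, column_name] <- new_value
  df
}

# Reduced example data (assumption: three rows are enough to show the effect)
df_old <- data.frame(id = 1:3, type = factor(paste("type", 1:3)))

# Replacing with a value that is not an existing level yields NA (plus a warning)
df_na <- suppressWarnings(f_rename_values(df_old, "type", "type 3", "new_value"))
is.na(df_na$type[3])  # TRUE

# Option 1: register the new level first, then replace
df_a <- df_old
levels(df_a$type) <- c(levels(df_a$type), "new_value")
df_a <- f_rename_values(df_a, "type", "type 3", "new_value")

# Option 2: rename the level directly, without touching the rows
df_b <- df_old
levels(df_b$type)[levels(df_b$type) == "type 3"] <- "new_value"
```

Option 1 keeps "type 3" as an unused level, while option 2 replaces it outright. Converting the column with as.character() before replacing is another possibility if the factor structure is not needed.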

Konrad Rudolph
  • Is there a difference as to whether I use a string as an argument or a numeric value when I call a function, i.e. should I always pass my arguments to the function in ' ' ? – msgh Jun 03 '21 at 10:00
  • @msgh I mean, that purely depends on the type of your data. The two aren’t generally interchangeable. – Konrad Rudolph Jun 03 '21 at 10:46