Beginner Question: What is a simple way to rename a variable observation in a dataframe column?

I have a data frame "Stuff" with a column of categorical data called "Age", where one of the data variables is called "Age80+". I've learned that R does not like "+" in a name, e.g. `Age80+ <-` brings up an error.

Column "Age" contains 7 other categories, e.g. "Age18_30", so I cannot efficiently change the observation names by hand.

I have looked but I haven't found a simple way to rename all "Age80+" to "Age80plus" without bringing in complicated packages like "stringr" or "dplyr". The data frame has hundreds of "Age80+" observations.

Thank you

I have tried

Stuff$Age<- gsub("Age80+", "Age80plus", Stuff$Age)

But that changes "Age80+" to "Age80plus+", not "Age80plus": the "+" is left behind.

deschen
ceallac
  • Does `gsub("Age80\\+", "Age80plus", Stuff$Age)` work? – jay.sf Jan 02 '22 at 16:17
  • If `is.character(Age)` is `TRUE`, then you could just do `Age[Age == "Age80+"] <- "Age80plus"`. Otherwise, if `is.factor(Age)` is `TRUE`, then you'll want to do something like `levels(Age)[levels(Age) == "Age80+"] <- "Age80plus"`. – Mikael Jagan Jan 02 '22 at 16:51
  • @jay.sf yes it does! thank you! should you post it as a solution? – ceallac Jan 02 '22 at 16:52
  • Also a few clarifications: it seems you want to recode values of a column, NOT rename the column, in which case having the age coded as "Age80+" is no problem at all. Also, I wouldn't call dplyr a complicated package. On the contrary, for beginners it's easier to learn than all the base R syntax (although this won't be true for every person and people are free to disagree with this statement). – deschen Jan 02 '22 at 16:55
  • thank you @deschen. I'm starting out and if I cannot see it - then it's a little too advanced for me right now. But I will try it – ceallac Jan 02 '22 at 17:13

1 Answer

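`+` is a special character in regular expressions: a quantifier meaning "one or more of the preceding element". The pattern "Age80+" therefore matches only "Age80" and leaves the literal "+" behind. Escape it as `\\+` if you want the actual character.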

dat <- transform(dat, age=gsub('Age80\\+', 'Age80plus', age))
dat
#   id       age          x
# 1  1 Age80plus -0.9701187
# 2  2 Age80plus -0.5522213
# 3  3 Age80plus -1.6060125
# 4  4     Age60 -1.5417523
# 5  5     Age40 -1.9090871

Data:

dat <- structure(list(id = 1:5, age = c("Age80+", "Age80+", "Age80+", 
"Age60", "Age40"), x = c(-0.970118672988532, -0.552221336521097, 
-1.60601248510621, -1.54175233366043, -1.909087068272)), class = "data.frame", row.names = c(NA, 
-5L))
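
Two alternatives mentioned or implied in this thread can be sketched the same way. This is a minimal sketch using a made-up vector `age` (not the asker's data): `fixed = TRUE` tells `gsub` to treat the pattern as a plain string, so no escaping is needed, and if the column is a factor the level can be renamed directly, as suggested in the comments.

```r
# Small illustrative vector (made-up values, not the asker's data)
age <- c("Age80+", "Age18_30", "Age80+")

# Unescaped pattern: "0+" means "one or more 0", so the literal "+" survives
unescaped <- gsub("Age80+", "Age80plus", age)

# fixed = TRUE matches the pattern as a literal string, no escaping required
age_chr <- gsub("Age80+", "Age80plus", age, fixed = TRUE)

# If the column is a factor, rename the level rather than the values
age_fct <- factor(age)
levels(age_fct)[levels(age_fct) == "Age80+"] <- "Age80plus"
```

For a character column either `\\+` or `fixed = TRUE` should give the same result; `fixed = TRUE` is arguably easier to read when the pattern contains no intended regex syntax.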
jay.sf