In a data frame, the word "Tomorrow" is spelled in several different ways. How do I change them all to the same spelling?
Now
TOMORROW
2moro
Tomorrow
tomorrow
tomrow
The result I want
Tomorrow
Tomorrow
Tomorrow
Tomorrow
Tomorrow
@Reju: there are many ways to overwrite or replace strings, or parts of strings, in R. For your case, a classic conditional approach works: if a value matches a known misspelling, replace it with the correct spelling.
One way to do this in R with the tidyverse (dplyr) is the case_when() function. I point to it because your real-world case may be more involved and need additional conditions; case_when() also saves you from writing nested ifelse() calls.
I turned your data into a simple data frame/tibble, my_df, with one variable, WHEN. Note: please also read up on reproducible examples for the future.
With dplyr's mutate(), I create a new column, WHEN_C. You can of course overwrite the existing column instead.
The TRUE condition at the end of case_when() leaves all other values intact; without it, values that match no condition become NA. You will need it if the column contains other entries that are already correct. The %in% operator lets you supply a vector of options, which is easier than writing out a long value1 OR value2 OR value3 ... condition.
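library(dplyr)
my_df <- tibble(WHEN = c("TOMORROW", "2moro", "Tomorrow", "tomorrow", "tomrow"))
my_df <- my_df %>%
  mutate(WHEN_C = case_when(
    # any known variant is mapped to the canonical spelling
    WHEN %in% c("TOMORROW", "2moro", "Tomorrow", "tomorrow", "tomrow") ~ "Tomorrow",
    # everything else is kept unchanged
    TRUE ~ WHEN
  ))
This yields a WHEN_C column in which every entry reads "Tomorrow".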
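As a variation on the call above, here is a minimal sketch that overwrites WHEN in place and shows the TRUE fallback at work. The extra entry "today" and the helper vector variants are hypothetical additions for illustration; they are not part of the original data.

```r
library(dplyr)

# Hypothetical data: the five variants plus one unrelated entry ("today")
my_df <- tibble(WHEN = c("TOMORROW", "2moro", "Tomorrow", "tomorrow", "tomrow", "today"))

# Hypothetical helper: the spellings that should be normalised
variants <- c("TOMORROW", "2moro", "Tomorrow", "tomorrow", "tomrow")

# Overwrite WHEN itself instead of adding a new column
my_df <- my_df %>%
  mutate(WHEN = case_when(
    WHEN %in% variants ~ "Tomorrow",
    TRUE ~ WHEN  # fallback: keep every other value as it is
  ))
```

Here the five variants are rewritten and "today" passes through unchanged, because it only matches the final TRUE branch.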
There are of course other ways to do this with string manipulation. These rely on so-called regular expressions, if you want to read up on them.
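For a rough idea of what the regular-expression route could look like, here is a base R sketch. The pattern is an assumption tuned only to the five spellings in the question, and the extra value "today" is hypothetical; real data would probably need a different or broader pattern.

```r
# Hypothetical sample data: the five variants plus one unrelated value
when <- c("TOMORROW", "2moro", "Tomorrow", "tomorrow", "tomrow", "today")

# Illustrative pattern (an assumption, not a general solution):
# starts with "to" or "2", then "m", an optional "o", one or more "r",
# an "o", and an optional final "w"
pattern <- "^(to|2)mo?r+ow?$"

# Flag the entries that match, ignoring upper/lower case
is_variant <- grepl(pattern, when, ignore.case = TRUE)

# Replace only the flagged entries; everything else stays untouched
cleaned <- when
cleaned[is_variant] <- "Tomorrow"
```

Compared with an explicit list of misspellings, a pattern can catch variants you have not seen yet, but it can also match strings you did not intend, so it is worth inspecting is_variant before overwriting anything.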
Try this
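# all spellings that should be normalised
typos <- c("TOMORROW", "2moro", "Tomorrow", "tomorrow", "tomrow")
df <- data.frame(date = typos, stringsAsFactors = FALSE)
# replace only the date column; df[rows, ] would overwrite every column in those rows
df$date[df$date %in% typos] <- "Tomorrow"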