I have a simple data frame:

> var_body_part <- c("eye and nose", "eye", "eye and ear", "eye and mouth", "foot", "foot", "ear", "ear", "foot", "mouth")

> var2 <- c("bla", "bla", "bla", "bla", "bla", "bla", "bla", "bla", "bla", "bla")

> temp_df <- data.frame(var_body_part, var2)

So my data is:

> temp_df
   var_body_part var2
1   eye and nose  bla
2            eye  bla
3    eye and ear  bla
4  eye and mouth  bla
5           foot  bla
6           foot  bla
7            ear  bla
8            ear  bla
9           foot  bla
10         mouth  bla

Whenever var_body_part contains "eye", I want to replace the value with "HEAD", i.e. (see the first 4 rows):

   var_body_part var2
1           HEAD  bla
2           HEAD  bla
3           HEAD  bla
4           HEAD  bla
5           foot  bla
6           foot  bla
7            ear  bla
8            ear  bla
9           foot  bla
10         mouth  bla

It should be easy... I select the rows that are affected by the transformation with

temp_df$var_body_part[grep("eye", temp_df$var_body_part) ] 

However, I cannot find the correct statement to replace them with the value "HEAD".

So far, my attempts produce a lot of warnings like

invalid factor level, NA generated

Can anybody help?

Sven Hohenstein
FFF

3 Answers


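The underlying issue is that the columns were converted to factors when temp_df was created (the default for data.frame() before R 4.0.0). A factor only accepts values that are already among its levels, which is why assigning "HEAD" yields NA. Create the data frame with stringsAsFactors = FALSE and the assignment works: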

temp_df <- data.frame(var_body_part, var2, stringsAsFactors = FALSE)
temp_df$var_body_part[grep("eye", temp_df$var_body_part)] <- "HEAD"
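If the data frame already exists and you cannot rebuild it, a rough sketch of the same idea is to convert the factor column to character first. This assumes you do not need the column to stay a factor; grepl() is used here instead of grep() only as a matter of taste, since a logical index behaves the same way for this assignment.

```r
var_body_part <- c("eye and nose", "eye", "eye and ear", "eye and mouth",
                   "foot", "foot", "ear", "ear", "foot", "mouth")
var2 <- rep("bla", 10)

# Simulate the situation from the question: the column is a factor
temp_df <- data.frame(var_body_part, var2, stringsAsFactors = TRUE)

# Convert to character so that any string can be assigned
temp_df$var_body_part <- as.character(temp_df$var_body_part)

# Replace every value containing "eye"
temp_df$var_body_part[grepl("eye", temp_df$var_body_part)] <- "HEAD"
```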

If you want to use factors, you can add "HEAD" to the levels of var_body_part:

temp_df <- data.frame(var_body_part, var2, stringsAsFactors = TRUE)
levels(temp_df$var_body_part) <- c(levels(temp_df$var_body_part), "HEAD")
temp_df$var_body_part[grep("eye", temp_df$var_body_part)] <- "HEAD"
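Another option when keeping the factor, sketched here as an alternative rather than as part of the approach above, is to rename the matching levels directly. Assigning the same name to several levels merges them, so the old "eye ..." levels disappear instead of lingering as unused levels. The helper variable lv is just for illustration.

```r
var_body_part <- c("eye and nose", "eye", "eye and ear", "eye and mouth",
                   "foot", "foot", "ear", "ear", "foot", "mouth")
temp_df <- data.frame(var_body_part, var2 = "bla", stringsAsFactors = TRUE)

# Rename every level that contains "eye"; identical names are merged into one level
lv <- levels(temp_df$var_body_part)
levels(temp_df$var_body_part)[grepl("eye", lv)] <- "HEAD"

# Inspect the remaining levels
levels(temp_df$var_body_part)
```

Because only the levels are touched, this avoids the "invalid factor level" warning entirely.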
Jozef

You can use transform together with sub:

transform(temp_df, var_body_part = sub(".*eye.*", "HEAD", var_body_part))

The result:

   var_body_part var2
1           HEAD  bla
2           HEAD  bla
3           HEAD  bla
4           HEAD  bla
5           foot  bla
6           foot  bla
7            ear  bla
8            ear  bla
9           foot  bla
10         mouth  bla
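One caveat: the pattern ".*eye.*" matches "eye" anywhere in the string. If your real data could contain longer words that include "eye", a possible refinement is to add word boundaries. This is only a sketch; the value "eyelid" below is made up for illustration and does not appear in the question's data.

```r
# Hypothetical values: "eyelid" should NOT be replaced
x <- c("eye and nose", "eyelid", "foot")

# \\b marks a word boundary; perl = TRUE selects the PCRE engine
res <- sub(".*\\beye\\b.*", "HEAD", x, perl = TRUE)
res
```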
Sven Hohenstein

This is pretty straightforward using gsub() together with mutate_at() from dplyr:

library(dplyr)
mutate_at(temp_df, 'var_body_part', ~ gsub('.*eye.*', 'HEAD', .))
dmca