I want to replace values in my dataframe using apply with a custom function.

If age is less than or equal to 3 I want to replace var1 and var2 with the string legit. Otherwise the row should be left alone.

I'm aware I could do this very easily with a for loop, but I'm trying to get better at using apply().

The function kind of works, but the dataframe returned is transposed. Here's my code:

df = data.frame(id = c(111,222,333,444,555), age = c(6,3,5,6,1), var1 = c(1,NA,2,4,NA), var2 = c(7,NA,5,3,NA))

>df

   id age var1 var2
1 111   6    1    7
2 222   3   NA   NA
3 333   5    2    5
4 444   6    4    3
5 555   1   NA   NA

too_young = function(x)
{
  if(x[[2]] <= 3)
  {

    temp = rep("legit",2)

    temp1 = x[1:2]

    final = (c( temp1,temp))

    return(  final )
  }
  else
  {
    return(x)
  }
}

df1 = apply(df,1,FUN = too_young)

> df1
     [,1]  [,2]    [,3]  [,4]  [,5]   
[1,] "111" "222"   "333" "444" "555"  
[2,] "6"   "3"     "5"   "6"   "1"    
[3,] "1"   "legit" "2"   "4"   "legit"
[4,] "7"   "legit" "5"   "3"   "legit"

As you can see, df1 contains the right data, but the rows and columns are swapped.

I can fix it using t(), but it appears that I am fundamentally misunderstanding how to use apply() on data frames. Also, I've managed to lose the column names.

Thanks

RNs_Ghost
  • with `apply` using `MARGIN` as 1, you need to transpose – akrun Sep 25 '18 at 15:02
  • more broadly, you really shouldn't be looking to use `apply` with data frames. `apply` is better fitted to matrices, or at the very least data frames with only one data type. – joran Sep 25 '18 at 15:15
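
A minimal sketch of what the two comments above describe, using the data from the question. `too_young2` is a hypothetical rewrite of the question's function, not code from the thread. It relies on two behaviours of `apply()`: the data frame is coerced to a matrix before the function is called, and each returned vector is stored as a column of the result. As far as I understand, the names of the returned vectors are kept only when every call returns the same names, which would explain why the original `c(temp1, temp)` lost them.

```r
# Same data as in the question.
df <- data.frame(id   = c(111, 222, 333, 444, 555),
                 age  = c(6, 3, 5, 6, 1),
                 var1 = c(1, NA, 2, 4, NA),
                 var2 = c(7, NA, 5, 3, NA))

# Hypothetical variant of too_young(): it modifies x in place, so every
# returned vector carries the same names (id, age, var1, var2).
too_young2 <- function(x) {
  if (x[["age"]] <= 3) {
    x[c("var1", "var2")] <- "legit"   # coerces this row to character
  }
  x
}

# With MARGIN = 1 the function is called once per row, and each result
# becomes one COLUMN of the output matrix.
res <- apply(df, 1, too_young2)
dim(res)        # 4 5

# Transpose and convert back to a data frame; the column names survive
# because they were kept as row names of res.
df1 <- as.data.frame(t(res), stringsAsFactors = FALSE)

# Everything is character after the round trip, so restore the numeric
# columns that were not meant to change.
df1$id  <- as.numeric(df1$id)
df1$age <- as.numeric(df1$age)
```

The extra conversion at the end is the cost of going through a matrix: a matrix holds a single type, so one `"legit"` turns every value into a string.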

1 Answer

You could just do `df[df$age < 4, c(3,4)] <- "too legit"` in base R; there is no need for a function from the apply family.

df
   id age      var1      var2
1 111   6         1         7
2 222   3 too legit too legit
3 333   5         2         5
4 444   6         4         3
5 555   1 too legit too legit
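
The same idea can be sketched with column names instead of positions, and with the exact condition and string from the question (`age <= 3`, `"legit"`). The variable name `young` is only illustrative, and wrapping the condition in `which()` is my own addition so that a missing age would not break the assignment.

```r
# Same data as in the question.
df <- data.frame(id   = c(111, 222, 333, 444, 555),
                 age  = c(6, 3, 5, 6, 1),
                 var1 = c(1, NA, 2, 4, NA),
                 var2 = c(7, NA, 5, 3, NA))

# Row numbers where age is at most 3; which() silently drops NA ages.
young <- which(df$age <= 3)

# Overwrite var1 and var2 in those rows only. Both columns become
# character because they now hold a string.
df[young, c("var1", "var2")] <- "legit"

str(df)
```

Selecting columns by name keeps the code working if the column order changes later.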
Lennyy