library(tidyverse)
df <- tibble(col1 = c(5, 2), col2 = c(6, 4), col3 = c(9, 9))
# # A tibble: 2 x 3
#    col1  col2  col3
#   <dbl> <dbl> <dbl>
# 1     5     6     9
# 2     2     4     9

df %>% mutate(col4 = apply(.[, c(1, 3)], 1, sum))
df %>% mutate(col4 = rowSums(.[c(1, 3)], na.rm = TRUE))

Lately R's apply() function has been giving me trouble. For the time being I'm going to minimize its use and rely on alternatives. @akrun showed me that I could use rowSums() instead of apply(), as in the example above.

But is there a way to apply, say, standard deviation across columns, like I do below? Obviously my imaginary::rowSd function is not going to work; it's made up.

df %>% mutate(col4 = apply(.[, c(1, 3)], 1, sd))
df %>% mutate(col4 = imaginary::rowSd(.[c(1, 3)], na.rm = TRUE))

What is an approach that would work, without using apply()? I'm thinking purrr although I've little knowledge on this package and the map() functions. Maybe there's an even easier/elegant solution.


[EDIT] I should've mentioned I can't use column names because they often change in the database I pull my data from. I can only use column numbers, because the relative column positions don't change.

Display name
  • You can use `rowSds` from `matrixStats` after converting to a matrix: `df %>% mutate(col4 = rowSds(as.matrix(.[c(1, 3)])))` – akrun Apr 25 '19 at 17:07
  • Is that package `imaginary` from `github`? I couldn't install it from `CRAN` – akrun Apr 25 '19 at 17:09
  • There is really no good reason that you can't use `apply` here. – G. Grothendieck Apr 25 '19 at 17:17
  • @akrun I think the package `imaginary` is just that: imaginary. As in, the OP is using it as an example of where they would get the function from that they're imagining could exist – camille Apr 25 '19 at 17:20
  • just use mutate_if: `df %>% mutate_if(is.numeric, list(std.dev = sd), na.rm = TRUE)`. You can also supply your own functions, see https://dplyr.tidyverse.org/reference/mutate_all.html – infominer Apr 25 '19 at 17:33
  • @infominer The OP asked for rowwise sd and not column wise – akrun Apr 25 '19 at 17:34
  • @akrun, my bad. Use rowwise `df %>% rowwise() %>% mutate(std.dev = sd(c(col1,col2,col3),na.rm = TRUE))` – infominer Apr 25 '19 at 17:42
  • @infominer no problem. But the OP's ask is that he doesn't know the column names (based on previous questions posted by him) – akrun Apr 25 '19 at 17:43
  • @akrun, You can't expect me or anyone to go down the rabbit hole of seeing his previous questions and then answering! I'm only responding to what's posted in this question. apply works fine and so would rowwise, unless there are other non-numerical columns in OP's data.frame. Yes, your solution works, and I guess OP's satisfied, so I will leave it at that – infominer Apr 25 '19 at 17:47
  • @infominer I don't expect you or anybody to do that. I was just saying based on his previous 2-3 questions. Having said that, the current question asks for an `even easier/elegant solution`. So, if there are 10 columns, it would be difficult to do `c(..)` – akrun Apr 25 '19 at 17:48
  • Yeah. I should've mentioned in the post I can't use actual column names, only column numbers. I'll edit it now but it's already been marked duplicate so it may be too late. It looks like I have to use `purrr::map()` and just read up on it. – Display name Apr 25 '19 at 17:52
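
For readers following the purrr idea raised in the question and comments, here is a minimal sketch of a row-wise standard deviation selected by column position, without apply(). It assumes purrr's `pmap_dbl()` fits the use case; the variable names `cols` and `result` are illustrative and do not come from the thread.

```r
library(dplyr)
library(purrr)
library(tibble)

df <- tibble(col1 = c(5, 2), col2 = c(6, 4), col3 = c(9, 9))

# Column positions (not names), since the names may change upstream.
cols <- c(1, 3)

# pmap_dbl() walks the selected columns in parallel, one row at a time.
# Each row's values arrive through `...`, are combined with c(), and
# passed to sd(), which returns a single double per row.
result <- df %>%
  mutate(col4 = pmap_dbl(.[cols], function(...) sd(c(...), na.rm = TRUE)))

result
```

This calls sd() once per row, so on large data it will likely be slower than a vectorized matrix routine, but it works with any summary function that returns a single number.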

1 Answer

An easier option is rowSds from matrixStats. It works only on a matrix, so convert the subset of the dataset to a matrix first and then apply rowSds.

library(matrixStats)
library(dplyr)
df %>%
    mutate(col4 = rowSds(as.matrix(.[c(1, 3)]))) 
# A tibble: 2 x 4
#   col1  col2  col3  col4
#  <dbl> <dbl> <dbl> <dbl>
#1     5     6     9  2.83
#2     2     4     9  4.95
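
If installing matrixStats is not an option, the same quantity can be sketched in base R from rowMeans() and rowSums(). The helper `row_sd()` below is a hypothetical name introduced for illustration, not a function from any package, and it assumes the selected columns are all numeric.

```r
# Hypothetical helper: sample standard deviation of each row, vectorized.
row_sd <- function(x, na.rm = FALSE) {
  x <- as.matrix(x)
  # Number of values contributing to each row.
  n <- if (na.rm) rowSums(!is.na(x)) else ncol(x)
  # Subtracting a length-nrow vector from a matrix recycles down the
  # columns, so every element is centered on its own row mean.
  centered <- x - rowMeans(x, na.rm = na.rm)
  sqrt(rowSums(centered^2, na.rm = na.rm) / (n - 1))
}

df <- data.frame(col1 = c(5, 2), col2 = c(6, 4), col3 = c(9, 9))

# Select by column position, as in the question.
df$col4 <- row_sd(df[c(1, 3)])
df
```

One difference from sd() to keep in mind: a row with a single non-missing value yields NaN here rather than NA.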
akrun