-3

I have a dataframe:

df <- data.frame(ca = c("a","b","a","c","b", "b"),
                 f = c(3,4,0,NA,3, 4),
                 f2 = c(NA,5,6,1,9, 7),
                 f3 = c(3,0,6,3,0, 8))

I want to sum my columns "f" and "f2" into a single new column named "f_new".

Example:

df <- data.frame(ca = c("a","b","a","c","b", "b"),
                 f_new = c(3,9,6,1,12, 11),
                 f3 = c(3,0,6,3,0, 8))

Do you have an idea of how to do this with summarise, spread, group_by?

M--
thomas leon
  • What do you mean by join? When you sum and one of the terms is NA, the sum is NA. – meh Jan 15 '19 at 23:15
  • Hi, it's possible that I didn't express myself well. I want to sum these two columns and create another column that holds the result, as in the example. Thank you – thomas leon Jan 15 '19 at 23:19
  • Possible duplicate of [sum two columns in R](https://stackoverflow.com/questions/26046776/sum-two-columns-in-r) – markus Jan 15 '19 at 23:22
  • Try `rowSums(df[, c("f", "f2")], na.rm = TRUE)` which will be really fast and doesn't require any extra packages. – markus Jan 15 '19 at 23:23
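
The `rowSums` suggestion in the last comment can be sketched in base R as follows. This is only an illustration on the question's data; the `result` name and the final column selection are assumptions added here to match the desired output, not part of the comment.

```r
df <- data.frame(ca = c("a","b","a","c","b", "b"),
                 f  = c(3,4,0,NA,3, 4),
                 f2 = c(NA,5,6,1,9, 7),
                 f3 = c(3,0,6,3,0, 8))

# Row-wise sum of f and f2; na.rm = TRUE drops NA values before summing
df$f_new <- rowSums(df[, c("f", "f2")], na.rm = TRUE)

# Keep only the columns shown in the desired output (hypothetical `result` name)
result <- df[, c("ca", "f_new", "f3")]
result
```

Note that with `na.rm = TRUE`, a row in which both `f` and `f2` are NA would get 0 rather than NA, which may or may not be what you want.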

3 Answers

2

Here is an answer using tidyverse methods from dplyr and tidyr:

library(tidyverse)

df <- data.frame(ca = c("a","b","a","c","b", "b"),
                 f = c(3,4,0,NA,3, 4),
                 f2 = c(NA,5,6,1,9, 7),
                 f3 = c(3,0,6,3,0, 8))

df %>% 
  replace_na(list(f = 0, f2 = 0)) %>% 
  mutate(f_new = f + f2)
#>   ca f f2 f3 f_new
#> 1  a 3  0  3     3
#> 2  b 4  5  0     9
#> 3  a 0  6  6     6
#> 4  c 0  1  3     1
#> 5  b 3  9  0    12
#> 6  b 4  7  8    11
dylanjm
2

dplyr can do this quite nicely with the following code. `rowwise()` lets you treat each row separately, and `mutate()` sums whichever columns you want. The `na.rm = TRUE` argument tells `sum()` to ignore NA values. As a comment mentioned, without it the result is NA whenever any of the summed values is NA.

library(dplyr)
df %>% 
  rowwise() %>% 
  mutate(f_new = sum(f,f2, na.rm = TRUE))
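
To see what `na.rm = TRUE` changes, here is a small base R sketch that applies `sum()` pair by pair to the question's two columns. It uses `mapply` instead of `rowwise()` only to avoid needing any package; the variable names `without_na_rm` and `with_na_rm` are made up for this illustration.

```r
f  <- c(3, 4, 0, NA, 3, 4)
f2 <- c(NA, 5, 6, 1, 9, 7)

# Without na.rm, any NA in a pair makes that pair's sum NA
without_na_rm <- mapply(function(a, b) sum(a, b), f, f2)

# With na.rm = TRUE, NA values are dropped before summing
with_na_rm <- mapply(function(a, b) sum(a, b, na.rm = TRUE), f, f2)

without_na_rm
with_na_rm
```

The first vector has NA in positions 1 and 4, where one of the inputs is missing; the second matches the `f_new` column asked for in the question.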
Sahir Moosvi
2

Using dplyr you can do this:

library(dplyr)

df %>% 
  rowwise() %>% 
  mutate(f_new = sum(f, f2, na.rm = TRUE))

# A tibble: 6 x 5
#   ca     f    f2    f3   f_new
#  <fct> <dbl> <dbl> <dbl> <dbl>
#1   a     3    NA     3     3
#2   b     4     5     0     9
#3   a     0     6     6     6
#4   c    NA     1     3     1
#5   b     3     9     0    12
#6   b     4     7     8    11

This method retains the NA values in the original columns.

morgan121