1

I have a dataframe with a lot of columns.

LABEL    COL1  COL2  COL3
Meat     10    20    30
Veggies  20    30    40

How do I make a column named SUMCOL that adds up COL1, COL2, COL3, and any other numeric columns I add?

Example of SUMCOL with just the above columns:

SUMCOL
60
90
Username
  • 5
    `dat$SUMCOL <- rowSums(dat[-1]) ` where dat is the name of your data.frame. Or a bit more flexible `dat$SUMCOL <- rowSums(dat[sapply(dat, is.numeric)]) `. – lmo Aug 18 '17 at 20:05
  • 3
    `df$SUMCOL <- rowSums(df[sapply(df, is.numeric)], na.rm = TRUE)` – Sagar Aug 18 '17 at 20:07
  • I ended up using `dat$SUMCOL <- rowSums(dat[sapply(dat, is.numeric)])` from @lmo's code – Username Aug 18 '17 at 22:15
  • @Username, perhaps you'd like to post it as a solution and accept the answer (noting it's from `lmo`). Or if `lmo` can post as an answer even better. – CPak Sep 09 '17 at 19:19
  • I was waiting for @lmo to post, but I can. – Username Sep 10 '17 at 00:10

3 Answers

1

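You can use this function, which uses dplyr's select_if with the base R predicate is.numeric to keep only the numeric columns before summing:

myfun <- function(df) {
  # dplyr provides select_if()
  require(dplyr)
  # keep only the numeric columns
  y <- select_if(df, is.numeric)
  # row-wise sum, ignoring missing values
  rowSums(y, na.rm = TRUE)
}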

Solution

df$SUMCOL <- myfun(df)

Output

    LABEL COL1 COL2 COL3 SUMCOL
1    Meat   10   20   30     60
2 Veggies   20   30   40     90
CPak
  • `select_if` is not a base R function... Nor is `is_numeric`. – lmo Aug 18 '17 at 20:11
  • If you are building a function that requires a package, it would be more portable to include the library call inside of the function, probably with `require`. – lmo Aug 18 '17 at 20:14
  • Is there a difference between `require` and `library`? – CPak Aug 18 '17 at 20:15
  • @ChiPak Yes, https://stackoverflow.com/questions/5595512/what-is-the-difference-between-require-and-library skim the accepted answer. Look at the other ones as well. – M-- Aug 18 '17 at 20:21
  • `require` will fail gracefully, printing a warning if the package is not found, whereas `library` will throw an error. Try `library(blah)` and then `require(blah)` for an example. From `?require` *require is designed for use inside other functions* – lmo Aug 18 '17 at 20:21
  • Thanks to you both. – CPak Aug 18 '17 at 20:22
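
The difference between `require` and `library` described in the comments above can be checked with a small sketch. The package name below is made up for illustration and is assumed not to be installed.

```r
# "notARealPackage123" is a hypothetical name, assumed not to be installed
pkg <- "notARealPackage123"

# require() returns FALSE (and emits a warning) when the package is missing
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))

# library() throws an error instead, which tryCatch() turns into a string here
res <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

print(ok)
print(res)
```

Because `require` only returns `FALSE`, a function that depends on the package should check that return value itself; otherwise the failure shows up later as a less obvious "could not find function" error.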
1

I ended up using this code:

df$SUMCOL <- rowSums(df[sapply(df, is.numeric)], na.rm = TRUE)
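
To show what each piece of that line does, here is a minimal sketch. The data frame is the one from the question, except that one value has been replaced with `NA` as an assumption, to make the effect of `na.rm = TRUE` visible. The name `is_num` is only for illustration.

```r
# Example data from the question, with one NA added for illustration
df <- data.frame(
  LABEL = c("Meat", "Veggies"),
  COL1 = c(10, 20),
  COL2 = c(20, NA),
  COL3 = c(30, 40),
  stringsAsFactors = FALSE
)

# Named logical vector: TRUE for each numeric column, FALSE for LABEL
is_num <- sapply(df, is.numeric)

# Sum across the numeric columns of each row, skipping NA values
df$SUMCOL <- rowSums(df[is_num], na.rm = TRUE)

print(df)
```

One thing to watch: SUMCOL is itself numeric, so if the one-liner is run again after SUMCOL already exists, the old totals are included in the new sum. Dropping SUMCOL first, or excluding it by name, avoids that.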

Username
0

I know this is an old post, but there's a tidy way to do this with just dplyr:

library(dplyr)

#Create dataset
data <- tibble(LABEL = c("Meat", "Veggies"),
               COL1 = c(10, 20),
               COL2 = c(20, 30),
               COL3 = c(30, 40))

data %>%
  mutate(SUMCOL = select(., starts_with("COL")) %>%
         rowSums(na.rm = TRUE))

In case anyone is unfamiliar with this syntax, it says: "make (mutate) a new column called SUMCOL". To do so, take the current data frame (that's the period), select only the columns whose names start with "COL", and compute their row sums, ignoring any missing values. As an aside, you could also list the columns explicitly with c("COL1", "COL2", "COL3") instead of using starts_with("COL").
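
Since the question asks for every numeric column rather than only those prefixed with "COL", a possible variant is sketched below. It assumes dplyr 1.0.0 or later, where `across()` and `where()` are available; the name `result` is only for illustration.

```r
library(dplyr)

# Same example data as above
data <- tibble(LABEL = c("Meat", "Veggies"),
               COL1 = c(10, 20),
               COL2 = c(20, 30),
               COL3 = c(30, 40))

# across(where(is.numeric)) picks every numeric column, whatever its name
result <- data %>%
  mutate(SUMCOL = rowSums(across(where(is.numeric)), na.rm = TRUE))

print(result)
```

As with the base R version, any numeric column added later is picked up automatically, but so is an existing SUMCOL column if the pipeline is run on data that already has one.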

J.Sabree