I have a dataset like:

name state  num1 num2 num3
abc    rt    10   40   8
def    ka    20   50   15
ert    pn    30   60   16

I want the sum of each row. When I use rowSums(data), it throws an error like "x should be numeric". The new column should be the total of num1, num2 and num3.

umakant
  • Try `data$newCol <- rowSums(data[grep("num\\d+", names(data))])` or with tidyverse ``data %>% select(matches("num\\d+")) %>% reduce(`+`) %>% mutate(data, newCol = .)`` – akrun Feb 13 '18 at 16:39
  • Thanks for the reply, but this is a dummy dataset. num1 is just a dummy name for my column – umakant Feb 13 '18 at 16:43
  • `rowSums(data[,sapply(data, is.numeric)])` – IceCreamToucan Feb 13 '18 at 16:43
  • The code is also kind of a dummy. You can change the pattern in `grep` according to your column names. In the second case, you can use `select_if` instead of `select`, i.e. `data %>% select_if(is.numeric)`, and then use `reduce` – akrun Feb 13 '18 at 16:44
  • `rowSums(data[,sapply(data, is.numeric)])`: This one worked. Thanks Renu and Akrun – umakant Feb 13 '18 at 16:54
  • Possible duplicate of [R: colSums when not all columns are numeric](https://stackoverflow.com/questions/30488161/r-colsums-when-not-all-columns-are-numeric) – Eric Fail Feb 13 '18 at 17:28

1 Answer

Here are some of the suggested solutions. However, first, as always, create some data,

dta <- structure(list(name = structure(1:3, .Label = c("abc", "def", 
"ert"), class = "factor"), state = structure(c(3L, 1L, 2L), .Label = c("ka", 
"pn", "rt"), class = "factor"), num1 = c(10L, 20L, 30L), num2 = c(40L, 
50L, 60L), num3 = c(8L, 15L, 16L)), .Names = c("name", "state", 
"num1", "num2", "num3"), class = "data.frame", row.names = c(NA, 
-3L))

Second, almost always, show the data,

dta
#>   name state num1 num2 num3
#> 1  abc    rt   10   40    8
#> 2  def    ka   20   50   15
#> 3  ert    pn   30   60   16

maybe also use str(), as it's relevant for understanding the specific problem here,

str(dta)
#> 'data.frame':    3 obs. of  5 variables:
#>  $ name : Factor w/ 3 levels "abc","def","ert": 1 2 3
#>  $ state: Factor w/ 3 levels "ka","pn","rt": 3 1 2
#>  $ num1 : int  10 20 30
#>  $ num2 : int  40 50 60
#>  $ num3 : int  8 15 16

The problem originates in the data being a mix of factors and integers; obviously we cannot sum factors.
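To make the cause concrete, here is a minimal sketch, assuming the same example data rebuilt with `data.frame()` (the helper names `numeric_cols` and `err` are only for illustration). It checks which columns are numeric and captures the error from calling `rowSums()` on the whole data frame instead of letting it stop the script.

```r
# Rebuild the example data; stringsAsFactors = TRUE mirrors the factor columns shown by str()
dta <- data.frame(
  name  = c("abc", "def", "ert"),
  state = c("rt", "ka", "pn"),
  num1  = c(10L, 20L, 30L),
  num2  = c(40L, 50L, 60L),
  num3  = c(8L, 15L, 16L),
  stringsAsFactors = TRUE
)

# Which columns are numeric? Integer columns count, factor columns do not.
numeric_cols <- sapply(dta, is.numeric)
numeric_cols

# rowSums() on the full data frame fails because of the factor columns;
# tryCatch() stores the error message rather than aborting
err <- tryCatch(rowSums(dta), error = function(e) conditionMessage(e))
err
```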

Now to some solutions.

First, akrun's first solution,

rowSums(dta[grep("num\\d+", names(dta))])
#> [1]  58  85 106

Second, Renu's solution,

rowSums(dta[,sapply(dta, is.numeric)])
#> [1]  58  85 106
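Since the question asks for the total as a new column, a minimal sketch of one way to attach it could look like this. It assumes the same example data rebuilt with `data.frame()`, and the column name `total` is an arbitrary choice.

```r
# Rebuild the example data with factor columns, as in the question
dta <- data.frame(
  name  = c("abc", "def", "ert"),
  state = c("rt", "ka", "pn"),
  num1  = c(10L, 20L, 30L),
  num2  = c(40L, 50L, 60L),
  num3  = c(8L, 15L, 16L),
  stringsAsFactors = TRUE
)

# Sum only the numeric columns and store the result as a new column
dta$total <- rowSums(dta[sapply(dta, is.numeric)])
dta
```

Note that running the assignment a second time would include `total` itself in the sum, because it is then a numeric column too.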

Third, a slightly reworded version of akrun's second solution,

# install.packages(c("tidyverse"), dependencies = TRUE)
library(tidyverse)
dta %>% select(matches("num\\d+")) %>% mutate(rowsum = rowSums(.))
#>   num1 num2 num3 rowsum
#> 1   10   40    8     58
#> 2   20   50   15     85
#> 3   30   60   16    106
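The pipeline above drops name and state. If they should be kept, one possible variant, sketched here under the assumption that dplyr 1.0.0 or later is installed (for `across()` and `where()`), sums the numeric columns inside `mutate()` directly. The name `result` is only for illustration.

```r
library(dplyr)

# Rebuild the example data with factor columns, as in the question
dta <- data.frame(
  name  = c("abc", "def", "ert"),
  state = c("rt", "ka", "pn"),
  num1  = c(10L, 20L, 30L),
  num2  = c(40L, 50L, 60L),
  num3  = c(8L, 15L, 16L),
  stringsAsFactors = TRUE
)

# across(where(is.numeric)) selects the numeric columns inside mutate(),
# so the non-numeric columns stay in the result
result <- dta %>%
  mutate(rowsum = rowSums(across(where(is.numeric))))
result
```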

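Finally, a related option: plyr's numcolwise() applies sum to every numeric column. Note that this gives column totals, not the row sums asked for,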

# install.packages(c("plyr"), dependencies = TRUE)
plyr::numcolwise(sum)(dta)
#>   num1 num2 num3
#> 1   60  150   39

Finally, here is an almost identical question. Now they are at least linked.

Eric Fail