I am trying to do something in R that I thought was going to be pretty simple: add a new column with totals.

I created a new dataset and followed this suggestion, but it turns out that the sum it calculates is wrong. I then tried another method that gives the correct sums, but I cannot attach the result as a column.

IUS_12_1_toy <- c(2,4,4)
IUS_12_2_toy <- c(4,5,4)
IUS_12_3_toy <- c(3,4,4)
IUS_12_4_toy <- c(4,5,3)
IUS_12_5_toy <- c(4,4,4)
IUS_12_6_toy <- c(2,5,3)
IUS_12_7_toy <- c(4,5,4)
IUS_12_8_toy <- c(4,4,4)
IUS_12_9_toy <- c(3,4,4)
IUS_12_10_toy <- c(2,3,4)
IUS_12_11_toy <- c(3,4,2)
IUS_12_12_toy <- c(1,4,2)

IUS_12_toy <- data.frame(IUS_12_1_toy, IUS_12_2_toy, IUS_12_3_toy, 
                     IUS_12_4_toy, IUS_12_5_toy, IUS_12_6_toy,
                     IUS_12_7_toy,IUS_12_8_toy,IUS_12_9_toy,IUS_12_10_toy,
                     IUS_12_11_toy,IUS_12_12_toy)
class(IUS_12_toy)
#> [1] "data.frame"
class(IUS_12_1_toy)
#> [1] "numeric"
library(janitor)
#> 
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#> 
#>     chisq.test, fisher.test
IUS_12_toy %>%
  adorn_totals("col")
#>  IUS_12_1_toy IUS_12_2_toy IUS_12_3_toy IUS_12_4_toy IUS_12_5_toy
#>             2            4            3            4            4
#>             4            5            4            5            4
#>             4            4            4            3            4
#>  IUS_12_6_toy IUS_12_7_toy IUS_12_8_toy IUS_12_9_toy IUS_12_10_toy
#>             2            4            4            3             2
#>             5            5            4            4             3
#>             3            4            4            4             4
#>  IUS_12_11_toy IUS_12_12_toy Total
#>              3             1    34
#>              4             4    47
#>              2             2    38

# The problem is that the sum is wrong, as specified by:

rowSums(IUS_12_toy)
#> [1] 36 51 42

# OK, now I would like to add the results as a new column:

IUS_12_toy[,13] = c("Total", rowSums(IUS_12_toy))

# But I get an error:

#> Error in `[<-.data.frame`(`*tmp*`, , 13, value = c("Total", "36", "51", : replacement has 4 rows, data has 3

Created on 2019-09-28 by the reprex package (v0.3.0)

Emy

2 Answers

The issue is concatenating 'Total' with the row sums: the resulting vector is one element longer than the number of rows (and is coerced to character). Assign the row sums on their own:

IUS_12_toy[,13] <- rowSums(IUS_12_toy)
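To make the length mismatch concrete, here is a minimal base R sketch. The data frame `toy` and its column names are hypothetical stand-ins for `IUS_12_toy`, cut down to three columns.

```r
# Hypothetical three-row data frame standing in for IUS_12_toy
toy <- data.frame(a = c(2, 4, 4), b = c(4, 5, 4), c = c(3, 4, 4))

totals     <- rowSums(toy)        # numeric, one value per row
with_label <- c("Total", totals)  # "Total" is prepended as an extra element

length(totals)      # equals nrow(toy)
length(with_label)  # nrow(toy) + 1, hence the "replacement has 4 rows" error
class(with_label)   # "character": the sums were coerced to strings

toy$Total <- totals               # assigning the bare sums works
```

The label belongs in the column name, not in the column's values.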

With dplyr, we can also use mutate:

library(dplyr)
IUS_12_toy %>%
     mutate(Total = rowSums(.))

Or with purrr (dplyr still needs to be loaded for mutate):

library(purrr)
IUS_12_toy %>%
     mutate(Total = reduce(., `+`))

Also, if we use an index to create a column, then by default the data.frame will do a sanity check with make.names/make.unique and prepend a character as a prefix to the new name, i.e. here it would be "V" (giving "V13").

We can instead use the column name directly as a string:

IUS_12_toy["Total"] <- rowSums(IUS_12_toy)
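A small sketch of the naming difference, again on a hypothetical two-column data frame `toy` rather than the original data:

```r
# Hypothetical two-column data frame
toy <- data.frame(a = c(2, 4, 4), b = c(4, 5, 4))

# Adding a column by position: R generates the name itself
by_index <- toy
by_index[, 3] <- rowSums(toy)
names(by_index)

# Adding a column by name: the name is whatever we pass
by_name <- toy
by_name["Total"] <- rowSums(toy)
names(by_name)
```

Both versions hold the same sums; only the column name differs.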
akrun
  • Thank you for your help. I have a problem though: the columns of my original dataset are factors. If I change them to numeric, I can calculate the row sum but I cannot add the extra column. I tried to create a toy example, but in the toy example everything works! What is the best way to post a question that refers to an original dataset? – Emy Sep 28 '19 at 17:31
  • @Nottolina If the columns are `factor`, you have to convert them to numeric before the `rowSums` step: `as.numeric(as.character(df1$column))` for a single column, `df1[] <- lapply(df1, function(x) as.numeric(as.character(x)))` for all columns, or, with `dplyr`, `df1 %>% mutate_if(is.factor, ~ as.numeric(as.character(.)))` – akrun Sep 28 '19 at 17:32
  • @Nottolina You can use `dput` to show the structure, i.e. `dput(droplevels(head(yourdata)))`, and edit your post to include a copy of the output from `dput` – akrun Sep 28 '19 at 17:36
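A minimal sketch of why the conversion in the comments goes through `as.character` first. The factor `f` and the data frame `df1` below are made-up examples, not the asker's data.

```r
# A factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

codes  <- as.numeric(f)                # internal level codes, not the labels
values <- as.numeric(as.character(f))  # the numbers the labels spell out

# Hypothetical data frame with factor columns, converted column by column
df1 <- data.frame(x = factor(c("2", "4", "4")),
                  y = factor(c("4", "5", "4")))
df1[] <- lapply(df1, function(col) as.numeric(as.character(col)))

# After the conversion, rowSums and column assignment work as usual
df1$Total <- rowSums(df1)
```

Calling `as.numeric` directly on a factor returns the level codes, which is easy to miss because it raises no error.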

Run ?adorn_totals and you'll see an explanation in the first sentence of the documentation:

This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to be totaled in the ... argument.

If there's an identifier for the rows, you can add it and move it to the first column, then proceed as you did.

To sum all columns, specify everything() for the value of ... :

IUS_12_toy %>%
  adorn_totals("col",,,,everything())

Which produces the values you sought.
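The default described in the documentation can be checked against the question's numbers with base R alone. This sketch does not call janitor; it only reproduces the arithmetic, and it rebuilds the toy data from a matrix (so the columns get the default names V1 to V12 rather than the original ones).

```r
# The question's toy data, one row per respondent (values copied from the post)
m <- matrix(c(2, 4, 3, 4, 4, 2, 4, 4, 3, 2, 3, 1,
              4, 5, 4, 5, 4, 5, 5, 4, 4, 3, 4, 4,
              4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 2, 2),
            nrow = 3, byrow = TRUE)
toy <- as.data.frame(m)

all_cols   <- rowSums(toy)      # every column included
skip_first <- rowSums(toy[-1])  # first column left out

all_cols    # matches rowSums(IUS_12_toy) in the question: 36 51 42
skip_first  # matches the Total column adorn_totals printed: 34 47 38
```

The gap between the two results is exactly the first column, which is consistent with the documented default.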

Sam Firke