
My goal is to calculate the standard deviation and the mean of a column in a data frame. I then want to round the standard deviation to its first significant digit and round each mean to the same number of decimal places as the corresponding standard deviation.

This is what I have so far:

library("tidyverse")

# create some example data
data <- data.frame(name = c("A", "B", "C"), 
                   value = c(0.1, 0.11, 0.111,
                             5.0, 0.0003, 0.00002,
                             0.5, 0.13, 0.113))

# calculate the standard deviation (sd), as well as the mean, 
# and round sd to the first significant decimal
data <- data %>%
  mutate(sd = signif(sd(value), 1), mean = mean(value), .by = name) %>%
  select(- value) %>%
  distinct()

# remove trailing zeros from the sd values
data$sd <- data$sd %>% 
  str_remove("0+$") 

With this code, I calculate the sd and mean per name, round the sd to its first significant digit, and strip trailing zeros from the sd. I have not found a way to round the mean to the same number of decimal places.

I would greatly appreciate your help with this!

Bacillus

2 Answers


Using the answer from this post to find number of decimals:

decimalplaces <- function(x) {
  x <- as.double(x)
  if (abs(x - round(x)) > .Machine$double.eps^0.5) {
    nchar(strsplit(sub('0+$', '', as.character(x)), ".", fixed = TRUE)[[1]][[2]])
  } else {
    return(0)
  }
}

data %>%
  rowwise() %>%
  mutate(sd_ndecimal = decimalplaces(sd),
         mean = round(mean, sd_ndecimal)) %>%
  ungroup() %>%
  select(-sd_ndecimal)

  name  sd     mean
  <chr> <chr> <dbl>
1 A     3      2   
2 B     0.07   0.08
3 C     0.06   0.07
one
  • I get the following error: Error in `mutate()`: ℹ In argument: `sd_decimal = decimalplaces(sd)`. ℹ In row 32. Caused by error in `if (abs(x - round(x)) > .Machine$double.eps^0.5) ...`: ! missing value where TRUE/FALSE needed I assume that is because I have NA values in my sd column. What is the smartest solution here? – Bacillus Apr 28 '23 at 15:27
  • It's up to you. If you want to round to particular digits, you can add the following after `x <- as.double(x)`: `if(is.na(x)){return(4)}` for 4 digits. Or you can return NA instead of 4 and change the mean in `mutate` to return the unrounded mean when `is.na(sd_ndecimal)` is TRUE. – one Apr 28 '23 at 16:23
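The second option from the comment (return NA and keep the mean unrounded) could look roughly like the sketch below. The names `decimalplaces_na` and `round_to` are made up for illustration and are not part of the original answer; the body of `decimalplaces_na` is the function above with one extra NA check.

```r
# Variant of decimalplaces() that returns NA for missing input
# instead of failing inside the if() condition.
# (decimalplaces_na is a hypothetical name, not from the original post.)
decimalplaces_na <- function(x) {
  x <- as.double(x)
  if (is.na(x)) return(NA_integer_)
  if (abs(x - round(x)) > .Machine$double.eps^0.5) {
    nchar(strsplit(sub("0+$", "", as.character(x)), ".", fixed = TRUE)[[1]][[2]])
  } else {
    0L
  }
}

# Round `mean` to `digits` decimal places; leave it unrounded when digits is NA.
# (round_to is also a hypothetical helper.)
round_to <- function(mean, digits) {
  if (is.na(digits)) mean else round(mean, digits)
}

round_to(0.0801, decimalplaces_na(0.07))  # sd has 2 decimals, so mean becomes 0.08
round_to(0.0801, decimalplaces_na(NA))    # sd is missing, mean stays 0.0801
```

Inside the `rowwise()` pipeline this would replace the two `mutate()` expressions with something like `mean = round_to(mean, decimalplaces_na(sd))`.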

Using aggregate and the decimalplaces function from the other answer.

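aggregate(value ~ name, data, \(x) {
  # d = number of decimal places of the sd after rounding to 1 significant digit
  d <- decimalplaces(s <- signif(sd(x), 1))
  # round() takes decimal places; signif() would treat d as significant digits
  c(mean = round(mean(x), d), sd = s)
})
#   name value.mean value.sd
# 1    A       2.00     3.00
# 2    B       0.08     0.07
# 3    C       0.07     0.06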
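As a possible alternative to parsing `as.character()` output, the number of decimal places can also be derived arithmetically from `log10()`. This is a different technique from the one used in the answers, sketched here under the assumption that the sd is numeric; `sd_decimals` and `round_mean_to_sd` are hypothetical names. It avoids trouble with values that `as.character()` prints in scientific notation (such as `2e-05`) and handles NA.

```r
# Decimal position of the first significant digit of s, computed
# arithmetically. (sd_decimals is a hypothetical helper name.)
sd_decimals <- function(s) {
  s <- signif(s, 1)                              # 0.096 is treated as 0.1
  # small tolerance guards against log10() landing just below an integer
  d <- pmax(0, -floor(log10(abs(s)) + 1e-9))     # 0.07 -> 2, 3 -> 0, 2e-05 -> 5
  d[is.na(s) | s == 0] <- NA                     # undefined for missing or zero sd
  d
}

# Round each mean to the decimal places of its sd; keep it unrounded
# where the number of decimals is undefined.
round_mean_to_sd <- function(m, s) {
  d <- sd_decimals(s)
  ifelse(is.na(d), m, round(m, d))
}

round_mean_to_sd(c(1.8667, 0.0801, 0.07467), c(2.72, 0.0698, 0.0647))
```

Both helpers are vectorised, so with a numeric sd column they could be used directly in `mutate(mean = round_mean_to_sd(mean, sd))` without `rowwise()`.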
jay.sf