2

Using this dat data.frame

DATA

dat <- read.table(text = c("
SiteID  cat1    cat2    cat3
Site1   3   1   1
Site1   3   2   2
Site1   2   3   3
Site1   2   1   1
Site1   2   2   2
Site1   1   1   2
Site1   1   3   1
Site1   1   2   3
Site1   3   NA  2
Site1   1   2   NA"),header =T)

I calculated the mean (of cat1 , cat2 and cat3) and rounded it to the nearest integer in R as below

R results

library(dplyr)
dat1 <- dat %>% 
  rowwise() %>% 
  mutate(avg = mean(c(cat1, cat2, cat3), na.rm = T),
         avg_round = round(avg))
head(dat1, 10)
# A tibble: 10 × 6
#   SiteID  cat1  cat2  cat3      avg avg_round
#   <fctr> <int> <int> <int>    <dbl>     <dbl>
#1   Site1     3     1     1 1.666667         2
#2   Site1     3     2     2 2.333333         2
#3   Site1     2     3     3 2.666667         3
#4   Site1     2     1     1 1.333333         1
#5   Site1     2     2     2 2.000000         2
#6   Site1     1     1     2 1.333333         1
#7   Site1     1     3     1 1.666667         2
#8   Site1     1     2     3 2.000000         2
#9   Site1     3    NA     2 2.500000         2
#10  Site1     1     2    NA 1.500000         2

Both 2.5 and 1.5 have been rounded to 2

This is different from Excel results where 2.5 has been rounded to 3 and 1.5 has been rounded to 2. I can do the same in R using ceiling but ceiling will change 1.3 to 2 which is not what I want.

Excel results

enter image description here

where

avg = AVERAGE(B2,C2,D2)
avg_round = ROUND(E2, 0)

Any suggestions why 1.5 and 2.5 both have been rounded to 2? and is there any way to get the same Excel results in R?

shiny
  • 3,380
  • 9
  • 42
  • 79
  • 2
    `?round` describes how R rounds to the nearest even. – John Coleman Dec 03 '16 at 03:29
  • @JohnColeman Thanks. Which function will round to the nearest integer? – shiny Dec 03 '16 at 03:30
  • `floor`, `ceiling`, and `trunc` return integers. – effel Dec 03 '16 at 03:32
  • @effel using `trunc`, I will get 1.5 as 1 which I don't want (I want 1.5 as 2). using `floor`, I will get 1.5 as 1 – shiny Dec 03 '16 at 03:33
  • 4
    `round` is programmed in a statistically sound manner. Since .5 is exactly halfway between integers, you want it to go each direction half the time so as not to shift the distribution. If you want it to round up, add a tiny amount to whatever you're rounding. – alistaire Dec 03 '16 at 03:35
  • @alistaire This means I can't reproduce the rounded values I got in excel using any function in R – shiny Dec 03 '16 at 03:37
  • 2
    @alistaire I don't think that's correct. What appears to be inconsistency is the result of rounding a `dbl`: "Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘_go to the even digit_’. Therefore ‘round(0.5)’ is ‘0’ and ‘round(-1.5)’ is ‘-2’. However, this is dependent on OS services and on representation error (since e.g. ‘0.15’ is not represented exactly, the rounding rule applies to the represented number and not to the printed number, and so ‘round(0.15, 1)’ could be either ‘0.1’ or ‘0.2’)." – effel Dec 03 '16 at 03:40
  • 1
    @aelwan Of course you can. Add a tiny amount, like I said above: `dat %>% rowwise() %>% mutate(avg = mean(c(cat1, cat2, cat3), na.rm = T), avg_round = round(avg + .0000001))` – alistaire Dec 03 '16 at 03:41
  • @alistaire Great. Many thanks for your time and help – shiny Dec 03 '16 at 03:41
  • 1
    @effel Floating point error actually helps here, weirdly, assuming it's evenly distributed, as it will push the number opposite directions evenly. The function still works the way it says; you just may not have the number you think you have. – alistaire Dec 03 '16 at 03:43
  • 1
    Trying to match calculations from Excel exactly is an exercise in frustration. For one thing, Excel uses a closed-source custom algorithm to fight round-off error, so it doesn't fully comply with ieee standards: http://stackoverflow.com/q/36452194/4996248 – John Coleman Dec 03 '16 at 03:55

0 Answers0