4

I'm new to R and I'm attempting to do something that I believe should be very simple but code online has not helped.

data <- structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3), 
Time = c(1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2), mean_PctPasses = c(68.26, 
60.2666666666667, 62.05, 66.3833333333333, 59.7333333333333, 
69.7714285714286, 57.1888888888889, 63.8875, 61.1833333333333, 
59.775, 66.2666666666667, 62.12), mean_AvgPassing = c(7.3, 
7.01111111111111, 6.35, 9.26666666666667, 6.68333333333333, 
8.78571428571429, 5.87777777777778, 8.3125, 7.63333333333333, 
7.7, 8.38333333333334, 6.89), mean_AvgRush = c(0.3, -0.3, 
3.5, 0.75, 5, 1.47142857142857, 5.71111111111111, 3.3875, 
2.74, 6.6, 4.5, 5), mean_Int = c(0.2, 0.777777777777778, 
0.25, 0.5, 1.5, 0.857142857142857, 0.777777777777778, 0.75, 
0.666666666666667, 0.75, 0.833333333333333, 1.1), mean_Rate = c(99.3, 
88.5222222222222, 80.5, 106.45, 77.2333333333333, 102.885714285714, 
76.8888888888889, 100.075, 92.1166666666667, 78.55, 98.05, 
79.56)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-12L), .Names = c("Group", "Time", "mean_PctPasses", "mean_AvgPassing", 
"mean_AvgRush", "mean_Int", "mean_Rate"))

Using this dataset I have 2 grouping variables "Group" and "Time". I would like to get the means and confidence intervals in a table format for each of these combinations for the variables mean_PctPasses to mean_Rate and save the result so it is in a table. I need it to be in a table because I will refer to it later in plotly. Doing this in SPSS is quite easy.

I've attempted several functions and below is issues I've had with each

library(rcompanion)    
ci.mean(mean_PctPasses~Group+Time, data = data)

library(DescTools)
MeanCI(data$mean_PctPasses)

library(Rmisc)
CI(data$mean_PctPasses,    ci=0.95)

MeanCI, ci.mean and CI don't allow for multiple variables to be listed and are saved as a table (only shows up in console)

by(data = data, data$Group, FUN = stat.desc)

This won't allow me to group my data according to both Group and Time. Below is an example of the graph I hope to build in R (shown in SPSS).

SPSS Example

Any help/assistance on this would be great. Let me know if any clarifications are needed and I'll be sure to edit my initial post.

UPDATE

After some great answers (thank you Rob and Steven) I felt I needed to clarify my question a little bit. I would like to get statistics for each group (not individually) for all statistics (mean_PctPasses to mean_Rate). An example of a function that produces the stats I would like for one variable is shown below using Rmisc library(Rmisc) group.UCL(mean_PctPasses~Group+Time , data, FUN=CI) This gives me the following output only for mean_PctPasses Output Using Rmisc

But what I would like to have is the following (which I have photoshopped) Image of Desired Ouput

This could, of course, be shown in the other orientation (example below with SPSS and SEM). Alternative Orientation example in SPSS

zx8754
  • 52,746
  • 12
  • 114
  • 209
Patrick
  • 915
  • 2
  • 9
  • 26
  • Based on your edit, can you not use the `summarise()` function in `dplyr` to do this? Seems like what I put in my first code block below is exactly what you're asking. Or, are you looking for one function to do it all for you? – Steven Feb 17 '18 at 17:09
  • 1
    Just for the record: I am the author of the `rcompanion` package. There is no `ci.mean` function in this package. However, there is a `groupwiseMean` function that produces confidence intervals for groups, offering a few different methods. E.g. `groupwiseMean(PctPasses ~ Group + Time, data = data)` – Sal Mangiafico Jan 04 '19 at 00:46

5 Answers5

6

Assuming you just want the usual non-pooled t confidence interval for each group you can do

require(dplyr)
alpha <- 0.05

data %>% 
    group_by(Group, Time) %>% 
    summarize(mean = mean(mean_PctPasses),
              lower = mean(mean_PctPasses) - qt(1- alpha/2, (n() - 1))*sd(mean_PctPasses)/sqrt(n()),
              upper = mean(mean_PctPasses) + qt(1- alpha/2, (n() - 1))*sd(mean_PctPasses)/sqrt(n()))
IceCreamToucan
  • 28,083
  • 2
  • 22
  • 38
5

Doing this with R is easy, too.

Another way, using CI() from Rmisc:

library(dplyr)
library(Rmisc)
library(ggplot2)

data <- 
  data %>%
  group_by(Group, Time) %>%
  dplyr::summarise(avg_PctPasses = mean(mean_PctPasses), 
            uci_PctPasses = CI(mean_PctPasses)[1], 
            lci_PctPasses = CI(mean_PctPasses)[3]) %>%
  mutate(Time = Time %>% as.factor())

Admittedly, I'm not a big fan of the "magic numbers" after the call to CI().

Plotting the data is equally simple.

data %>%
  ggplot(aes(x = Group, y = avg_PctPasses, fill = Time)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = lci_PctPasses, ymax = uci_PctPasses), position = "dodge")

enter image description here

Steven
  • 3,238
  • 21
  • 50
  • 2
    You don't have to use "magic numbers". You can do e.g. `CI(mean_PctPasses)['lower']` – IceCreamToucan Feb 05 '18 at 17:27
  • @RobJensen Der. Thanks. I'm leaving them for posterity. – Steven Feb 05 '18 at 17:28
  • Hey @Steven thanks for your reply. Both methods are nice, thank you for your response. The issue with this format is that it would need to be repeated for each variable in "data". Are you aware of a method where the means and UCI/LCI can be outputted in one table? I will upload an image of this in SPSS once I get back to lab. – Patrick Feb 06 '18 at 16:39
1

You may be interested to reproduce the SPSS style with base R graphics.

library(DescTools)

z <- with(data, 
          aggregate(mean_PctPasses, list(Time, Group), MeanCI))
z <- xtabs(x ~ Group.1 + Group.2, z)

par(mar=c(5.1,4.1,4.1,8.1))
b <- barplot(z[,,1], beside=TRUE, ylim=c(0, 140), 
             col=c("royalblue3","limegreen"), las=1, 
             xlab="Group", ylab="Mean mean_PctPasses",
             panel.first=Bg(col="grey85", border="black"))
 
ErrBars(from=z[,,2], to=z[,,3], pos=b)
legend(x="topright", legend=c("1","2"), title="Time", bty="n", 
       fill=c("royalblue3","limegreen"), inset=c(-.2, 0), xpd=TRUE)

barplot following SPSS style

Nevertheless you should consider using a dotplot for displaying your data.

col <- c("royalblue3","limegreen")
PlotDot(z[,,1], args.errbars = list(from=z[,,2], to=z[,,3], mid=z[,,1]), 
        cex.pch=1.5, pch=22, bg=col, 
        lblcolor = col, lcolor = NA, 
        panel.first=abline(v=seq(0,150,10), col="grey", lty="dotted"))

dotplot groupwise

Andri Signorell
  • 1,279
  • 12
  • 23
1

Maybe it is easier to work on one variable at a time. There is a simpler way of doing this.

You need to install/load the Hmisc package using library()

my_data <- data %>%
    group_by(Group, Time) %>% 
    summarise(N = n(), ci = list(enframe(Hmisc::smean.cl.normal(mean_PctPasses)))) %>% 
    unnest() %>% 
    spread(name, value)

print(my_data)

Here is the output:

enter image description here

It looks nicer/neater if you group_by() using one variable at a time (and repeat for all the quantitative variables):

 my_data <- data %>%
   group_by(Group, Time) %>% 
   summarise(N = n(), ci = list(enframe(Hmisc::smean.cl.normal(mean_PctPasses)))) %>% 
   unnest() %>% 
   spread(name, value)

 print(my_data)

With the output:

enter image description here

Anyumba
  • 131
  • 2
  • 2
0

Update tidyr 1.0.0

As an elegant alternative to the summarise solutions given before, it is good to know that the new tidyr 1.0.0 contained a function often overlooked: unnest_wider. With that, you can simplify the code to the following:

data.to.plot <- data %>% 
  nest(data = -"Group") %>%
  mutate(ci = map(data, ~ MeanCI(.x$mean_PctPasses))) %>% 
  unnest_wider(ci)

which gives

# A tibble: 3 x 5
  Group data              mean lwr.ci upr.ci
  <dbl> <list>           <dbl>  <dbl>  <dbl>
1     1 <tibble [4 × 6]>  64.2   58.3   70.1
2     2 <tibble [4 × 6]>  62.6   53.9   71.4
3     3 <tibble [4 × 6]>  62.3   57.9   66.8

You can plot this easily with

  ggplot(aes(x = Group, y = mean)) +
  geom_bar(aes (fill = Group), stat = "identity") +
  geom_errorbar(
    aes(
      ymin = lwr.ci, ymax = upr.ci,
      width = 0.5
    ),
    size = 0.5 # line thickness
  ) + 
  coord_flip() +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal() 

which gives you

enter image description here

Agile Bean
  • 6,437
  • 1
  • 45
  • 53