
This is the data I have now: a dataset of students from different education levels (F1) with several variables describing their academic performance. I want to use R to build a table of descriptive statistics that shows the mean and standard deviation of each variable, grouped by education level.

aggregate(. ~ F1, dt3, function(x) c(mean = mean(x), sd = sd(x)))

I have used this function, but the result is not identical to the one I want.

Here is a mini sample of my data.

structure(list(F1 = c("Elementary school", "High_school", "High_school", "Elementary school", "Junior_high_school", "High_school", "Kindergarten", "Kindergarten"), X1 = c(0, 0, 0, 0, 0, 0, 0, 0), X2 = c(1, 1, 0, 0, 0, 0, 1, 1), X3 = c(1, 1, 1, 0, 0, 0, 0, 1), X4 = c(1, 1, 1, 1, 0, 1, 1, 1), X5 = c(4, 4, 4, 4, 1, 1, 4, 4), X6 = c(4, 4, 3, 4, 1, 2, 4, 4), X7 = c(4, 4, 3, 4, 3, 1, 4, 4), X8 = c(4, 4, 3, 4, 1, 1, 4, 4), Y1 = c(4, 4, 3, 4, 2, 3, 4, 4), Y2 = c(4, 3, 4, 3, 4, 3, 4, 4)), row.names = c(1L, 2L, 5L, 14L, 696L, 15L, 1348L, 1364L), class = "data.frame")
dc37
Ningyao Xu
  • Please edit your question and add the input data and the result in text format. – Ravi Saroch Nov 25 '19 at 04:24
  • I suggest checking out the `papaja` package: https://rpubs.com/YaRrr/papaja_guide – Dij Nov 25 '19 at 04:26
  • @NingyaoXu, I edited my answer to provide you the code that should work with your small example. Please add the example directly in your question, it will make things easier for people trying to follow your question. – dc37 Nov 25 '19 at 16:13

1 Answer


@NingyaoXu, please provide a reproducible example of your dataset.

Based on the screenshot of your dataset and your example (I renamed F1 to education in your example), I suggest using dplyr and tidyr. You can try:

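library(dplyr)
library(tidyr)

# df is your data frame, with the grouping column named education
df %>%
  pivot_longer(-education, names_to = "Variable", values_to = "Value") %>%
  group_by(education, Variable) %>%
  summarise(Mean = mean(Value), Sd = sd(Value)) %>%
  ungroup() %>%  # drop the leftover grouping before reshaping
  pivot_wider(names_from = "education", values_from = c(Mean, Sd)) %>%
  # case-sensitive matching keeps "High_school" from also matching "Junior_high_school"
  select(Variable, contains("Elementary school"), contains("High_school", ignore.case = FALSE), contains("Junior_high_school", ignore.case = FALSE), contains("Kindergarten"))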
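
For reference, here is a minimal sketch of the same pipeline run directly on the sample data from the question. It assumes the data frame is called dt3 (the name used in the aggregate call) and keeps the original grouping column F1 instead of education; the name res is only illustrative. Row names are left out because they do not affect the summary.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (original row names omitted)
dt3 <- data.frame(
  F1 = c("Elementary school", "High_school", "High_school", "Elementary school",
         "Junior_high_school", "High_school", "Kindergarten", "Kindergarten"),
  X1 = c(0, 0, 0, 0, 0, 0, 0, 0),
  X2 = c(1, 1, 0, 0, 0, 0, 1, 1),
  X3 = c(1, 1, 1, 0, 0, 0, 0, 1),
  X4 = c(1, 1, 1, 1, 0, 1, 1, 1),
  X5 = c(4, 4, 4, 4, 1, 1, 4, 4),
  X6 = c(4, 4, 3, 4, 1, 2, 4, 4),
  X7 = c(4, 4, 3, 4, 3, 1, 4, 4),
  X8 = c(4, 4, 3, 4, 1, 1, 4, 4),
  Y1 = c(4, 4, 3, 4, 2, 3, 4, 4),
  Y2 = c(4, 3, 4, 3, 4, 3, 4, 4),
  stringsAsFactors = FALSE
)

res <- dt3 %>%
  # one row per (student, variable) pair
  pivot_longer(-F1, names_to = "Variable", values_to = "Value") %>%
  group_by(F1, Variable) %>%
  summarise(Mean = mean(Value), Sd = sd(Value)) %>%
  ungroup() %>%
  # one row per variable, one Mean_* and one Sd_* column per education level
  pivot_wider(names_from = F1, values_from = c(Mean, Sd))

print(res)
```

The result has one row per variable and columns named like Mean_High_school and Sd_High_school. In this small sample Junior_high_school has a single row, so its standard deviations come out as NA.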
dc37
  • Thanks for your advice, here is a mini reproducible example of my dataset. – Ningyao Xu Nov 25 '19 at 15:53
  • structure(list(F1 = c("Elementary school", "High_school", "High_school", "Elementary school", "Junior_high_school", "High_school", "Kindergarten", "Kindergarten"), X1 = c(0, 0, 0, 0, 0, 0, 0, 0), X2 = c(1, 1, 0, 0, 0, 0, 1, 1), X3 = c(1, 1, 1, 0, 0, 0, 0, 1), X4 = c(1, 1, 1, 1, 0, 1, 1, 1), X5 = c(4, 4, 4, 4, 1, 1, 4, 4), X6 = c(4, 4, 3, 4, 1, 2, 4, 4), X7 = c(4, 4, 3, 4, 3, 1, 4, 4), X8 = c(4, 4, 3, 4, 1, 1, 4, 4), Y1 = c(4, 4, 3, 4, 2, 3, 4, 4), Y2 = c(4, 3, 4, 3, 4, 3, 4, 4)), row.names = c(1L, 2L, 5L, 14L, 696L, 15L, 1348L, 1364L), class = "data.frame") – Ningyao Xu Nov 25 '19 at 15:58
  • Can you add this example to your question instead ? It will be easier for people to see it than digging in comments ;) – dc37 Nov 25 '19 at 16:02